2023-09-28 10:51:51,221 INFO [train.py:1107] (2/4) Training started 2023-09-28 10:51:51,222 INFO [train.py:1117] (2/4) Device: cuda:2 2023-09-28 10:51:51,227 INFO [train.py:1129] (2/4) {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 50, 'reset_interval': 200, 'valid_interval': 3000, 'feature_dim': 80, 'subsampling_factor': 4, 'warm_step': 2000, 'env_info': {'k2-version': '1.24.3', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '821ebc378e7fb99b8adc81950227963332821e01', 'k2-git-date': 'Wed Jul 19 15:38:25 2023', 'lhotse-version': '1.16.0.dev+git.1db4d97a.clean', 'torch-version': '1.11.0+cu102', 'torch-cuda-available': True, 'torch-cuda-version': '10.2', 'python-version': '3.9', 'icefall-git-branch': 'dev/bilingual', 'icefall-git-sha1': '09ada8fb-dirty', 'icefall-git-date': 'Thu Sep 28 10:47:39 2023', 'icefall-path': '/star-home/jinzengrui/lib/miniconda3/envs/dev39/lib/python3.9/site-packages/icefall-1.0-py3.9.egg', 'k2-path': '/star-home/jinzengrui/lib/miniconda3/envs/dev39/lib/python3.9/site-packages/k2-1.24.3.dev20230721+cuda10.2.torch1.11.0-py3.9-linux-x86_64.egg/k2/__init__.py', 'lhotse-path': '/star-home/jinzengrui/lib/miniconda3/envs/dev39/lib/python3.9/site-packages/lhotse-1.16.0.dev0+git.1db4d97a.clean-py3.9.egg/lhotse/__init__.py', 'hostname': 'de-74279-k2-train-6-0423201309-7c68fd68fb-6cszs', 'IP address': '10.177.28.83'}, 'world_size': 4, 'master_port': 12354, 'tensorboard': True, 'num_epochs': 30, 'start_epoch': 1, 'start_batch': 0, 'exp_dir': PosixPath('zipformer/exp-w-tal-csasr'), 'bpe_model': 'data/lang_bbpe_2000/bbpe.model', 'base_lr': 0.045, 'lr_batches': 7500, 'lr_epochs': 3.5, 'ref_duration': 600, 'context_size': 2, 'prune_range': 5, 'lm_scale': 0.25, 'am_scale': 0.0, 'simple_loss_scale': 0.5, 'ctc_loss_scale': 0.2, 'seed': 42, 'print_diagnostics': False, 'inf_check': False, 'save_every_n': 4000, 'keep_last_k': 30, 'average_period': 200, 'use_fp16': True, 'use_tal_csasr': True, 'num_encoder_layers': '2,2,3,4,3,2', 'downsampling_factor': '1,2,4,8,4,2', 'feedforward_dim': '512,768,1024,1536,1024,768', 'num_heads': '4,4,4,8,4,4', 'encoder_dim': '192,256,384,512,384,256', 'query_head_dim': '32', 'value_head_dim': '12', 'pos_head_dim': '4', 'pos_dim': 48, 'encoder_unmasked_dim': '192,192,256,256,256,192', 'cnn_module_kernel': '31,31,15,15,15,31', 'decoder_dim': 512, 'joiner_dim': 512, 'causal': False, 'chunk_size': '16,32,64,-1', 'left_context_frames': '64,128,256,-1', 'use_transducer': True, 'use_ctc': False, 'manifest_dir': PosixPath('data/fbank'), 'max_duration': 1000, 'bucketing_sampler': True, 'num_buckets': 30, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'shuffle': True, 'drop_last': True, 'return_cuts': True, 'num_workers': 2, 'enable_spec_aug': True, 'spec_aug_time_warp_factor': 80, 'enable_musan': True, 'input_strategy': 'PrecomputedFeatures', 'blank_id': 0, 'vocab_size': 2000} 2023-09-28 10:51:51,228 INFO [train.py:1131] (2/4) About to create model 2023-09-28 10:51:52,073 INFO [train.py:1135] (2/4) Number of model parameters: 68625511 2023-09-28 10:51:58,521 INFO [train.py:1150] (2/4) Using DDP 2023-09-28 10:51:59,174 INFO [multi_dataset.py:39] (2/4) About to get multidataset train cuts 2023-09-28 10:51:59,174 INFO [multi_dataset.py:42] (2/4) Loading Aishell-2 in lazy mode 2023-09-28 10:51:59,249 INFO [multi_dataset.py:49] (2/4) Loading TAL-CSASR in lazy mode 2023-09-28 10:51:59,268 INFO [multi_dataset.py:142] (2/4) About to get train-clean-100 cuts 2023-09-28 10:51:59,294 INFO [multi_dataset.py:149] (2/4) About to get train-clean-360 cuts 2023-09-28 10:51:59,300 INFO [multi_dataset.py:156] (2/4) About to get train-other-500 cuts 2023-09-28 10:52:14,751 INFO [asr_datamodule.py:218] (2/4) Enable MUSAN 2023-09-28 10:52:14,751 INFO [asr_datamodule.py:219] (2/4) About to get Musan cuts 2023-09-28 10:52:17,894 INFO [asr_datamodule.py:243] (2/4) Enable SpecAugment 2023-09-28 10:52:17,894 INFO [asr_datamodule.py:244] (2/4) Time warp factor: 80 2023-09-28 10:52:17,895 INFO [asr_datamodule.py:254] (2/4) Num frame mask: 10 2023-09-28 10:52:17,895 INFO [asr_datamodule.py:267] (2/4) About to create train dataset 2023-09-28 10:52:17,895 INFO [asr_datamodule.py:294] (2/4) Using DynamicBucketingSampler. 2023-09-28 10:52:17,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:52:18,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-28 10:52:18,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 10:52:18,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:18,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:18,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:18,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:18,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:18,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:52:19,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:52:19,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 10:52:19,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:52:19,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 10:52:19,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-28 10:52:19,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 10:52:20,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-28 10:52:20,077 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 10:52:20,199 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 10:52:20,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:20,840 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:52:20,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:52:21,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:52:21,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:52:21,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:52:22,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:22,080 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:22,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:52:22,253 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:52:22,259 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:52:22,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:22,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:52:23,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 10:52:23,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:52:23,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:23,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 10:52:23,756 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 10:52:23,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:52:23,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:52:23,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 10:52:24,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-28 10:52:24,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 10:52:24,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:52:25,122 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:52:25,267 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 10:52:25,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 10:52:25,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 10:52:25,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:25,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 10:52:25,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 10:52:25,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-28 10:52:26,074 INFO [asr_datamodule.py:309] (2/4) About to create train dataloader 2023-09-28 10:52:26,075 INFO [multi_dataset.py:88] (2/4) About to get multidataset dev cuts 2023-09-28 10:52:26,075 INFO [multi_dataset.py:91] (2/4) Loading Aishell-2 DEV set in lazy mode 2023-09-28 10:52:26,109 INFO [multi_dataset.py:163] (2/4) About to get dev-clean cuts 2023-09-28 10:52:26,127 INFO [multi_dataset.py:170] (2/4) About to get dev-other cuts 2023-09-28 10:52:26,172 INFO [asr_datamodule.py:340] (2/4) About to create dev dataset 2023-09-28 10:52:26,952 INFO [asr_datamodule.py:357] (2/4) About to create dev dataloader 2023-09-28 10:52:26,952 INFO [train.py:1351] (2/4) Sanity check -- see if any of the batches in epoch 1 would cause OOM. 2023-09-28 10:52:26,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:52:26,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-28 10:52:27,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 10:52:27,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:27,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:27,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:27,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:27,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:27,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:52:27,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:52:27,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 10:52:28,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:52:28,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 10:52:28,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-28 10:52:28,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 10:52:29,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-28 10:52:29,135 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 10:52:29,257 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 10:52:29,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:29,885 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:52:29,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:52:30,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:52:30,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:52:30,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:52:30,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:30,752 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:30,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:52:30,918 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:52:30,925 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:52:30,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:30,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:52:32,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 10:52:32,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:52:32,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:32,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 10:52:32,836 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 10:52:32,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:52:33,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:52:33,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 10:52:33,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-28 10:52:33,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 10:52:33,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:52:34,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:52:34,379 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 10:52:34,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 10:52:34,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 10:52:34,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:34,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 10:52:34,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 10:52:34,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-28 10:52:35,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:52:35,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-28 10:52:35,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 10:52:35,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:35,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:35,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:36,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:36,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:36,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:52:36,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:52:36,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 10:52:37,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:52:37,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 10:52:37,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-28 10:52:37,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 10:52:37,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-28 10:52:37,532 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 10:52:37,652 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 10:52:37,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:38,275 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:52:38,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:52:38,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:52:38,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:52:39,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:52:39,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:39,123 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:39,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:52:39,687 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:52:39,694 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:52:39,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:39,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:52:40,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 10:52:40,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:52:41,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:41,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 10:52:41,153 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 10:52:41,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:52:41,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:52:41,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 10:52:41,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-28 10:52:41,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 10:52:42,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:52:42,504 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:52:42,661 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 10:52:42,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 10:52:42,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 10:52:42,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:43,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 10:52:43,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 10:52:43,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-28 10:52:44,029 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-28 10:52:45,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-28 10:52:45,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:52:45,522 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:52:46,168 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:52:46,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:52:46,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:52:46,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-28 10:52:46,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-28 10:52:46,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:52:46,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:52:47,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:52:47,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:52:47,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:52:47,518 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:52:47,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-28 10:52:47,978 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:52:48,854 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 10:52:48,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:52:49,139 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-28 10:52:49,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:52:49,618 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:52:50,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:52:51,005 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:51,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:52:52,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-28 10:52:52,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-28 10:52:52,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:52,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:52:52,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:52:52,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:52:53,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-28 10:52:53,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:52:53,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:52:53,926 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-28 10:52:54,336 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-28 10:52:54,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:52:54,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:52:55,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:55,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-28 10:52:55,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 10:52:55,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:52:55,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:52:55,824 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:52:56,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:52:57,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-28 10:52:57,157 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:52:57,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:52:58,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-28 10:52:58,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-28 10:52:58,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:52:58,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:52:58,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:52:58,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:52:59,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-28 10:52:59,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 10:52:59,103 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:00,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:00,201 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-28 10:53:00,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 10:53:00,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-28 10:53:00,579 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:53:00,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 10:53:00,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-28 10:53:00,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:01,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-28 10:53:01,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:01,961 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:02,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:02,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:53:02,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-28 10:53:02,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-28 10:53:02,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-28 10:53:02,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:02,919 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-28 10:53:03,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:03,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:03,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-28 10:53:04,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-28 10:53:04,184 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-28 10:53:04,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:04,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-28 10:53:04,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-28 10:53:04,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-28 10:53:04,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:04,827 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-28 10:53:05,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-28 10:53:05,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:05,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:05,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:53:06,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:53:06,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-28 10:53:06,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:06,877 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:53:06,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:53:06,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-28 10:53:06,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:53:07,004 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:53:07,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-28 10:53:07,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-28 10:53:07,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:07,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:07,522 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:53:08,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-28 10:53:08,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:08,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:08,447 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-28 10:53:08,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 10:53:09,143 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-28 10:53:09,178 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-28 10:53:09,390 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:09,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 10:53:09,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:53:10,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:10,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:11,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:11,682 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-28 10:53:11,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-28 10:53:12,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:53:12,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:53:12,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:13,172 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:13,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:53:13,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:53:14,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:14,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:14,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:14,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:14,381 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:14,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-28 10:53:14,535 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-28 10:53:14,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:14,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:53:14,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:14,833 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:14,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 10:53:14,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 10:53:14,977 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-28 10:53:14,987 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:15,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:15,228 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:15,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:53:15,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:15,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:16,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:53:16,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:53:16,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:16,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:16,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:16,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:53:17,049 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:18,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-28 10:53:18,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-28 10:53:18,163 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-28 10:53:18,554 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:18,562 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 10:53:18,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:53:18,898 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:18,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:18,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:19,095 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:19,269 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-28 10:53:19,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:20,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:53:20,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:53:20,713 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-28 10:53:21,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:53:21,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:53:21,181 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:53:21,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:53:21,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:21,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:53:21,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:22,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-28 10:53:22,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:22,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:22,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:53:22,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-28 10:53:23,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:23,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 10:53:23,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-28 10:53:23,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:53:24,679 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:24,919 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:53:24,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-28 10:53:25,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:53:25,044 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-28 10:53:25,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:25,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:25,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:53:26,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-28 10:53:26,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:26,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:26,767 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-28 10:53:26,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-28 10:53:27,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:27,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:27,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:53:27,699 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:27,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:29,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:53:29,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:53:30,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:53:30,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:30,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 10:53:30,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 10:53:30,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:31,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 10:53:31,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:31,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:31,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-28 10:53:31,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 10:53:31,685 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:32,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:53:32,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:33,773 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:33,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:53:34,719 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:34,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-28 10:53:35,056 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:35,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:53:35,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:35,194 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 10:53:35,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-28 10:53:35,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:35,520 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-28 10:53:35,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:35,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:53:36,129 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:36,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:36,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:53:36,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:36,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:37,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:53:39,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:39,402 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:39,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:53:40,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-28 10:53:40,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-28 10:53:40,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:53:40,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:40,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 10:53:40,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:53:40,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:41,056 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:53:41,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-28 10:53:41,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:41,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:53:41,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:53:41,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:53:41,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:53:41,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 10:53:42,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:53:42,269 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:42,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:53:42,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:43,009 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:53:43,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:43,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:53:44,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:45,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:53:45,593 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-28 10:53:45,753 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:45,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:53:46,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-28 10:53:46,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-28 10:53:46,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:53:46,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-28 10:53:46,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:46,923 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:47,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:53:47,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-28 10:53:47,612 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:47,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 10:53:47,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-28 10:53:47,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:48,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-28 10:53:48,881 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 10:53:48,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-28 10:53:49,370 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-28 10:53:49,444 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:49,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:49,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:49,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-28 10:53:49,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:53:50,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:53:50,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:51,147 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:51,707 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-28 10:53:51,713 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-28 10:53:51,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:53:52,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:52,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-28 10:53:52,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:53,030 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:53:53,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:53,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-28 10:53:53,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:53,885 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 10:53:54,204 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:54,378 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:53:54,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-28 10:53:54,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 10:53:54,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:54,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-28 10:53:54,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:55,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:55,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:55,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:55,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:56,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:56,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 10:53:56,414 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:57,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:58,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:58,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:58,524 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-28 10:53:58,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:58,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-28 10:53:59,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:59,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-28 10:53:59,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:59,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-28 10:53:59,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:54:00,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:54:00,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:54:00,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:00,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:00,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:54:00,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:00,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-28 10:54:00,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:54:00,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:01,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:01,692 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-28 10:54:01,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:54:01,854 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:02,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-28 10:54:02,522 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:03,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:03,160 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:03,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:03,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-28 10:54:03,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:03,777 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-28 10:54:03,939 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-28 10:54:03,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:05,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:54:05,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-28 10:54:05,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:05,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:54:05,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:05,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:05,981 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:06,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:06,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:54:06,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:54:07,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-28 10:54:07,122 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:07,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:07,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:54:07,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:07,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:08,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:08,224 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-28 10:54:08,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-28 10:54:08,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:08,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-28 10:54:08,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:09,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:54:09,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:09,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-28 10:54:09,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:54:09,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:09,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:09,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:09,742 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-28 10:54:09,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-28 10:54:10,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:10,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:10,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-28 10:54:11,503 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-28 10:54:11,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:54:12,139 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:12,913 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-28 10:54:13,342 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-28 10:54:13,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-28 10:54:13,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:14,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:54:14,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-28 10:54:14,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:54:14,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 10:54:15,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:15,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:15,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-28 10:54:15,712 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-28 10:54:15,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-28 10:54:16,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 10:54:16,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:54:16,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-28 10:54:16,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 10:54:16,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:54:16,846 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-28 10:54:17,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-28 10:54:17,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:54:17,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:17,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:54:17,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-28 10:54:17,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:54:17,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 10:54:17,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 10:54:19,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:19,712 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:54:19,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-28 10:54:20,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-28 10:54:20,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:54:20,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:21,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:21,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:21,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:21,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-28 10:54:22,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-28 10:54:22,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-28 10:54:22,345 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:22,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:22,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:54:22,840 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-28 10:54:22,869 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-28 10:54:22,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:23,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:54:23,257 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-28 10:54:23,674 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-28 10:54:23,726 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:54:23,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-28 10:54:23,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-28 10:54:24,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:54:24,475 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-28 10:54:24,554 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 10:54:24,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-28 10:54:25,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:54:25,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-28 10:54:26,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-28 10:54:26,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:54:26,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:26,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:27,232 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:54:27,276 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-28 10:54:27,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:27,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:54:27,947 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:27,978 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-28 10:54:28,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-28 10:54:28,175 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:28,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 10:54:29,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 10:54:29,288 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-28 10:54:29,593 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:29,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:29,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:30,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:31,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-28 10:54:31,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-28 10:54:31,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:31,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-28 10:54:31,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 10:54:31,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:54:32,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:54:32,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:54:32,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:32,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-28 10:54:32,864 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-28 10:54:33,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:33,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:33,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:33,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:33,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-28 10:54:33,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-28 10:54:34,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:54:34,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:34,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:35,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:36,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:36,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-28 10:54:36,396 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:36,425 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:36,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-28 10:54:36,904 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-28 10:54:37,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:37,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-28 10:54:37,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-28 10:54:37,963 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:37,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-28 10:54:38,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:38,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:38,266 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:38,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:38,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:54:39,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:54:39,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:39,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-28 10:54:39,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:54:40,199 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:40,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:40,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:40,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:41,265 WARNING [train.py:1197] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-28 10:54:41,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-28 10:54:41,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:54:42,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:54:42,236 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:54:42,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:54:42,807 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:42,826 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-28 10:54:42,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:43,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 10:54:43,436 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:54:43,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 10:54:43,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-28 10:54:43,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:43,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-28 10:54:44,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-28 10:54:44,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:44,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:44,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:54:44,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:44,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:54:44,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:54:45,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:45,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:45,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 10:54:46,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 10:54:46,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:46,429 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-28 10:54:46,530 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:46,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-28 10:54:46,763 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-28 10:54:47,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-28 10:54:47,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-28 10:54:48,131 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:54:48,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 10:54:48,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:49,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:49,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 10:54:49,469 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-28 10:54:49,686 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:54:49,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:54:50,167 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:50,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-28 10:54:50,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:54:51,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-28 10:54:51,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:51,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:51,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:54:53,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:53,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:53,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:54,002 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:54:54,528 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:54,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:54:54,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:55,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-28 10:54:56,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-28 10:54:56,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:56,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-28 10:54:56,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:54:56,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-28 10:54:57,036 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:54:57,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:54:57,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 10:54:58,090 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-28 10:54:58,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:54:59,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:54:59,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:59,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-28 10:54:59,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:00,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:55:00,507 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:55:00,905 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:01,352 WARNING [train.py:1197] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-28 10:55:01,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:01,626 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:55:02,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:02,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 10:55:02,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:02,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:02,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 10:55:02,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:02,970 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:55:03,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 10:55:03,284 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-28 10:55:03,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:03,339 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:03,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:03,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:55:03,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:04,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:55:04,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-28 10:55:04,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:55:04,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:55:04,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-28 10:55:04,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:55:04,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 10:55:04,911 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-28 10:55:05,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-28 10:55:05,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:05,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:05,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:55:05,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:06,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:55:06,793 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:06,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:07,019 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:07,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:07,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 10:55:07,496 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:08,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 10:55:08,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:08,380 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:55:08,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:09,062 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-28 10:55:09,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-28 10:55:09,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-28 10:55:09,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:09,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:10,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-28 10:55:10,648 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:55:10,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:55:11,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:11,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-28 10:55:11,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:11,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:12,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 10:55:12,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:55:13,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-28 10:55:13,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-28 10:55:13,893 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-28 10:55:13,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:55:14,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-28 10:55:14,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:14,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-28 10:55:15,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:15,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:55:15,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-28 10:55:15,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:55:16,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:16,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:55:17,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:55:17,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-28 10:55:17,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-28 10:55:17,693 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-28 10:55:17,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:18,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:18,357 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:18,641 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:18,650 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-28 10:55:19,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-28 10:55:19,878 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-28 10:55:19,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-28 10:55:20,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-28 10:55:20,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-28 10:55:20,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:20,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-28 10:55:20,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:20,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:55:20,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:21,016 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:21,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-28 10:55:21,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:21,448 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:55:21,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 10:55:21,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:55:22,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:22,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:22,336 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-28 10:55:22,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-28 10:55:22,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:55:22,840 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:55:23,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-28 10:55:23,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-28 10:55:23,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:23,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-28 10:55:23,554 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-28 10:55:23,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-28 10:55:23,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:55:23,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 10:55:24,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:55:24,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:55:24,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:24,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 10:55:25,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:25,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:25,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-28 10:55:25,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:55:26,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-28 10:55:26,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:55:26,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:55:26,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-28 10:55:26,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:27,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:27,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:55:27,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:28,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:55:28,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-28 10:55:28,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:28,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:55:29,090 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:55:29,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:29,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:29,548 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-28 10:55:30,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:30,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:30,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:30,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:30,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:31,027 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:31,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:31,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:31,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 10:55:32,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-28 10:55:32,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:32,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:32,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:55:32,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:33,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-28 10:55:33,248 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:33,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-28 10:55:33,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:34,038 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:34,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:55:34,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:34,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:34,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:35,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:35,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:55:35,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-28 10:55:35,661 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-28 10:55:35,696 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-28 10:55:35,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 10:55:35,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:35,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:35,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:55:36,595 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-28 10:55:36,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-28 10:55:36,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-28 10:55:37,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 10:55:37,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:55:38,002 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:38,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-28 10:55:38,230 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:55:38,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-28 10:55:39,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 10:55:40,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:40,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-28 10:55:40,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:55:40,582 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:40,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-28 10:55:40,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:40,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:41,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:41,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 10:55:41,443 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:55:41,637 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-28 10:55:41,721 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-28 10:55:41,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-28 10:55:41,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 10:55:41,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:42,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:42,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:42,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:55:42,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:43,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:43,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-28 10:55:43,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-28 10:55:43,882 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:44,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-28 10:55:44,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-28 10:55:44,649 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-28 10:55:44,980 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-28 10:55:44,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:55:45,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:45,021 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 10:55:45,344 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:45,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:45,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-28 10:55:45,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:55:45,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:46,093 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:55:46,149 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-28 10:55:46,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:55:47,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-28 10:55:47,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-28 10:55:47,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:55:47,804 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:55:47,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:55:47,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:48,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:48,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:48,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:55:48,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-28 10:55:48,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:48,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-28 10:55:49,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-28 10:55:50,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:55:50,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-28 10:55:50,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:55:50,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:50,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-28 10:55:51,222 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:51,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:51,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-28 10:55:52,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:55:52,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-28 10:55:52,180 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-28 10:55:52,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:55:52,344 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:55:52,427 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 10:55:52,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:55:54,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:54,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:54,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 10:55:54,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:55:55,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-28 10:55:55,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:56,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-28 10:55:56,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:55:56,424 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-28 10:55:56,548 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-28 10:55:57,213 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-28 10:55:57,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:55:57,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 10:55:58,053 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:58,068 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:55:58,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-28 10:55:58,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:55:58,564 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-28 10:55:58,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:55:58,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:59,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:55:59,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:59,532 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-28 10:56:00,232 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-28 10:56:00,454 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-28 10:56:00,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-28 10:56:00,786 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:01,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-28 10:56:01,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:02,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:02,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:02,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 10:56:02,620 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-28 10:56:03,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:03,142 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:56:03,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:03,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:56:03,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-28 10:56:03,475 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-28 10:56:03,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:03,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-28 10:56:03,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:56:04,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:04,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:56:04,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:04,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-28 10:56:04,889 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 10:56:05,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:56:05,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:56:05,914 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:05,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:06,973 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-28 10:56:07,213 WARNING [train.py:1197] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 10:56:07,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 10:56:07,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:07,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:07,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:56:07,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-28 10:56:08,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:56:08,364 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:08,488 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:08,515 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-28 10:56:08,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:08,561 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:56:08,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-28 10:56:09,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:56:09,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:10,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:56:10,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-28 10:56:10,222 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-28 10:56:10,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:11,082 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-28 10:56:11,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:11,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:56:11,910 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-28 10:56:11,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-28 10:56:12,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:12,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:56:12,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:13,689 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-28 10:56:14,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-28 10:56:14,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-28 10:56:14,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:14,625 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-28 10:56:14,832 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-28 10:56:14,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-28 10:56:15,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:15,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:16,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:16,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:56:16,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:16,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:16,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-28 10:56:16,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:16,637 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:56:16,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:16,810 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-28 10:56:17,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-28 10:56:17,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-28 10:56:17,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-28 10:56:17,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:56:18,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:18,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:56:18,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:18,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:56:19,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-28 10:56:19,250 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:56:19,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-28 10:56:19,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-28 10:56:20,245 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:20,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:20,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:56:20,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 10:56:21,058 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:21,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:56:21,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:56:21,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 10:56:22,092 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:22,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:56:22,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 10:56:22,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-28 10:56:23,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:23,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:56:23,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:56:23,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:56:23,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-28 10:56:23,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:56:23,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-28 10:56:23,914 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:23,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-28 10:56:23,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-28 10:56:24,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:24,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-28 10:56:24,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:56:25,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-28 10:56:25,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-28 10:56:25,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:56:25,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-28 10:56:26,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-28 10:56:26,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:27,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 10:56:27,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-28 10:56:27,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:27,568 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:56:28,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:56:28,456 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-28 10:56:28,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-28 10:56:28,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-28 10:56:28,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:56:28,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 10:56:29,047 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-28 10:56:29,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:56:29,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:56:29,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:56:29,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:56:29,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:30,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:30,420 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-28 10:56:30,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:56:30,700 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-28 10:56:30,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-28 10:56:30,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:56:31,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:31,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:32,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:56:32,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:56:32,726 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:56:32,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-28 10:56:33,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:33,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-28 10:56:34,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:56:34,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:56:34,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-28 10:56:34,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 10:56:34,841 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:56:34,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:56:35,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:56:35,498 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-28 10:56:36,049 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:36,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-28 10:56:36,546 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-28 10:56:36,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:36,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:37,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-28 10:56:37,103 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:37,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-28 10:56:37,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:56:37,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:56:37,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:56:37,518 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:37,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-28 10:56:38,299 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:56:38,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-28 10:56:38,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:56:39,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 10:56:39,550 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-28 10:56:39,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-28 10:56:40,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:40,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:56:40,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:40,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-28 10:56:41,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:56:41,150 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:41,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-28 10:56:41,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:56:41,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-28 10:56:41,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:41,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:56:41,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-28 10:56:42,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:56:42,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:42,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:56:43,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:56:43,092 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-28 10:56:43,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:43,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-28 10:56:43,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:56:43,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 10:56:44,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-28 10:56:44,697 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:45,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:56:45,235 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:56:45,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-28 10:56:45,345 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-28 10:56:45,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:45,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-28 10:56:46,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:46,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:56:47,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:56:48,776 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:56:48,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-28 10:56:48,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:48,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:49,714 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-28 10:56:49,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:50,728 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-28 10:56:51,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:51,387 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:56:51,431 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-28 10:56:51,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:56:51,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:52,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:56:52,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-28 10:56:52,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:52,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:52,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:56:53,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:56:53,315 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 10:56:54,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:54,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:54,390 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-28 10:56:54,709 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-28 10:56:55,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:56:55,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:56:55,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:55,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:56:55,938 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-28 10:56:56,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:57,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-28 10:56:57,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:57,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-28 10:56:57,250 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:56:57,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-28 10:56:58,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-28 10:56:58,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:58,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:58,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:56:58,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-28 10:56:58,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:58,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:56:58,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:56:58,999 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-28 10:56:59,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 10:56:59,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-28 10:56:59,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 10:56:59,497 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:59,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:57:00,022 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-28 10:57:00,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:57:01,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-28 10:57:01,793 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-28 10:57:02,011 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:57:02,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-28 10:57:02,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:57:02,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:02,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-28 10:57:03,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:57:03,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:57:03,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-28 10:57:03,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:57:04,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 10:57:04,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 10:57:04,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:57:04,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:57:04,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:05,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-28 10:57:05,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 10:57:05,522 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-28 10:57:05,533 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:57:05,733 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:05,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:06,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:57:06,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-28 10:57:07,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-28 10:57:07,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:57:07,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:08,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-28 10:57:08,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:57:08,129 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-28 10:57:08,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:08,186 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:08,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:57:09,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:57:09,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:57:09,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-28 10:57:09,402 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-28 10:57:09,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-28 10:57:09,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:57:10,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-28 10:57:10,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:10,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-28 10:57:10,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:57:10,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-28 10:57:10,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-28 10:57:10,840 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:57:10,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-28 10:57:10,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:11,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-28 10:57:11,489 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:57:11,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:57:11,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:57:12,130 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-28 10:57:12,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:57:12,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-28 10:57:13,150 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:57:13,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-28 10:57:14,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:57:14,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:14,489 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:57:14,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-28 10:57:15,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:57:15,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-28 10:57:15,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-28 10:57:15,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 10:57:16,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:16,509 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:16,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:57:16,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:16,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 10:57:17,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-28 10:57:17,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-28 10:57:17,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:57:17,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 10:57:18,039 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-28 10:57:18,115 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 10:57:18,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:57:18,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:57:18,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-28 10:57:18,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:18,838 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-28 10:57:19,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:19,530 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:19,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:57:19,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-28 10:57:20,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-28 10:57:20,932 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-28 10:57:21,433 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:57:21,761 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-28 10:57:21,991 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:22,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-28 10:57:23,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:23,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:23,941 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:24,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:24,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:57:24,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:57:24,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:57:24,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-28 10:57:25,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-28 10:57:25,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:57:25,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-28 10:57:25,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:25,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:25,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-28 10:57:26,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-28 10:57:26,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-28 10:57:26,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:26,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-28 10:57:28,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:29,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:29,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:57:29,768 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-28 10:57:30,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:57:30,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-28 10:57:30,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-28 10:57:30,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:30,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:57:31,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-28 10:57:31,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:57:31,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-28 10:57:32,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-28 10:57:32,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-28 10:57:33,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:57:34,151 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:34,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:34,850 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-28 10:57:35,042 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-28 10:57:36,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:57:36,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:57:36,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:57:36,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-28 10:57:37,257 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:57:37,390 WARNING [train.py:1197] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 10:57:38,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:38,633 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:39,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-28 10:57:39,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-28 10:57:39,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:57:39,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:57:39,541 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:57:39,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:57:39,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:39,796 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:57:39,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-28 10:57:40,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:57:41,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:41,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-28 10:57:42,530 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-28 10:57:42,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 10:57:42,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:43,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 10:57:43,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:43,457 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:57:43,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-28 10:57:44,111 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:57:44,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:44,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-28 10:57:44,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:44,897 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 10:57:45,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:57:45,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-28 10:57:45,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 10:57:45,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-28 10:57:45,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:45,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:57:45,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-28 10:57:45,826 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:57:45,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:57:46,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 10:57:46,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:57:46,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:57:46,389 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:46,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:57:47,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:57:47,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:57:48,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:48,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:48,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:57:48,336 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:57:48,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:48,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:48,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-28 10:57:49,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:57:49,684 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-28 10:57:49,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:57:50,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-28 10:57:50,242 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:50,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-28 10:57:50,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:51,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-28 10:57:51,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-28 10:57:51,438 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:51,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:57:51,922 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:57:52,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-28 10:57:52,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-28 10:57:52,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-28 10:57:52,633 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:52,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 10:57:54,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-28 10:57:54,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-28 10:57:54,787 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-28 10:57:54,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:55,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:57:55,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:57:55,311 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-28 10:57:55,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:57:55,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-28 10:57:55,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:57:55,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:57:55,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:57:56,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:56,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:56,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-28 10:57:56,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-28 10:57:56,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:57:57,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:57,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-28 10:57:57,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-28 10:57:58,176 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:58,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-28 10:57:58,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-28 10:57:58,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:57:58,991 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:59,013 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:57:59,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-28 10:57:59,303 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:57:59,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:59,418 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-28 10:57:59,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:00,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:58:00,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-28 10:58:01,485 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:58:01,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 10:58:02,218 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-28 10:58:02,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:02,317 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-28 10:58:02,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:02,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:02,792 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-28 10:58:02,931 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-28 10:58:03,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-28 10:58:03,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:03,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:03,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:04,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:04,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:04,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:58:04,436 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-28 10:58:04,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-28 10:58:04,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:58:04,677 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-28 10:58:04,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-28 10:58:05,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:58:05,203 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:05,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:58:05,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:58:05,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:05,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:05,909 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-28 10:58:05,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:06,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:58:06,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:58:06,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:58:06,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-28 10:58:06,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:58:06,696 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-28 10:58:06,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-28 10:58:06,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-28 10:58:06,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:07,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:58:08,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:08,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-28 10:58:08,544 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-28 10:58:09,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:58:09,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:58:10,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:58:10,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:58:10,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-28 10:58:10,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 10:58:11,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:11,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:11,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:11,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:58:11,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-28 10:58:11,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 10:58:12,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:58:12,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:12,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-28 10:58:12,249 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-28 10:58:12,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:13,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-28 10:58:13,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:13,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:14,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-28 10:58:14,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 10:58:15,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:15,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 10:58:15,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:58:15,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:58:15,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:15,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-28 10:58:15,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-28 10:58:16,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-28 10:58:16,482 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:58:16,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-28 10:58:16,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:58:17,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:17,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:58:17,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-28 10:58:18,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:18,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-28 10:58:18,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:18,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-28 10:58:19,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-28 10:58:20,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:20,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-28 10:58:20,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:20,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:58:20,393 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:58:20,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-28 10:58:21,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 10:58:22,152 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:22,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:22,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:22,443 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-28 10:58:22,650 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:58:22,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:58:23,008 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 10:58:23,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:23,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:58:24,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-28 10:58:24,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:58:24,260 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-28 10:58:24,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:25,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:58:25,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:58:25,271 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-28 10:58:25,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-28 10:58:25,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-28 10:58:25,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-28 10:58:25,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:26,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:58:26,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:26,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-28 10:58:26,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:26,597 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-28 10:58:26,900 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:58:26,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:58:26,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:58:26,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:58:27,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-28 10:58:27,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-28 10:58:28,518 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 10:58:28,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-28 10:58:29,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-28 10:58:29,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:30,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-28 10:58:30,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:30,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:30,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:58:30,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:31,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:58:31,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:31,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:31,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:31,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:58:31,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:58:31,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:31,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 10:58:32,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:58:32,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-28 10:58:32,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:58:32,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-28 10:58:32,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-28 10:58:32,770 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-28 10:58:32,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:32,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:58:32,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:33,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:33,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-28 10:58:33,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:33,729 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:58:33,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:34,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-28 10:58:35,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:35,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:58:35,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-28 10:58:35,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:58:35,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:58:35,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:35,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:58:35,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:58:35,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-28 10:58:36,200 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 10:58:37,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:37,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:58:38,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-28 10:58:38,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:58:38,344 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:58:38,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:58:38,722 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-28 10:58:38,987 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:58:39,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:39,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-28 10:58:39,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-28 10:58:39,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-28 10:58:39,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-28 10:58:39,888 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:40,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-28 10:58:40,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:58:41,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:42,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:42,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:58:42,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-28 10:58:42,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-28 10:58:42,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:58:42,714 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:42,716 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-28 10:58:42,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:42,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:42,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:42,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:43,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:43,275 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:58:43,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:43,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:58:43,557 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:44,064 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:44,192 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-28 10:58:44,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:44,647 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:44,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-28 10:58:45,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:45,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:46,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-28 10:58:46,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-28 10:58:46,219 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:46,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:58:46,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:47,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-28 10:58:47,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:48,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-28 10:58:48,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:48,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:58:48,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 10:58:49,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-28 10:58:49,360 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:49,416 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-28 10:58:50,424 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:58:50,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:58:50,798 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:51,213 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:51,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:51,518 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:51,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:52,018 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:58:52,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:52,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-28 10:58:52,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:58:52,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-28 10:58:52,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:58:53,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:53,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:58:53,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 10:58:53,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-28 10:58:53,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:54,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:58:55,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:55,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:58:55,657 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:58:55,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-28 10:58:55,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:58:55,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-28 10:58:55,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:55,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-28 10:58:55,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:58:56,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:58:56,409 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:58:56,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:56,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 10:58:57,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:58:57,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 10:58:57,609 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:57,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:58,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:58,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:58,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:58:58,742 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:58:58,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-28 10:58:58,895 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:59,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:59,782 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-28 10:59:00,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-28 10:59:00,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-28 10:59:00,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:00,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:00,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:59:00,445 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:59:01,881 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-28 10:59:02,032 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:59:02,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:02,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-28 10:59:02,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-28 10:59:02,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-28 10:59:02,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:59:02,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 10:59:03,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-28 10:59:04,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:59:04,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-28 10:59:04,502 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:04,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:04,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:59:04,673 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-28 10:59:05,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-28 10:59:05,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:05,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-28 10:59:05,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:05,711 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:05,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:59:05,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:05,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:06,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:59:06,281 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:59:06,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:06,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:59:06,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:59:07,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:59:07,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-28 10:59:08,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-28 10:59:09,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-28 10:59:09,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:09,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-28 10:59:09,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 10:59:10,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:59:10,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-28 10:59:11,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:11,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:11,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-28 10:59:11,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:11,933 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 10:59:12,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:59:12,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:59:12,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:13,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-28 10:59:13,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:13,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 10:59:13,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:13,449 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:13,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:59:14,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-28 10:59:14,100 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:15,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:59:15,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:59:15,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-28 10:59:15,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-28 10:59:15,549 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-28 10:59:15,675 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-28 10:59:15,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 10:59:15,923 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:15,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:59:15,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:16,136 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-28 10:59:16,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 10:59:16,237 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:16,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-28 10:59:16,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 10:59:16,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:16,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-28 10:59:16,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:17,013 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-28 10:59:17,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:59:17,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:17,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:18,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:59:18,229 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-28 10:59:18,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-28 10:59:18,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:59:18,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:18,651 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-28 10:59:18,714 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-28 10:59:19,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-28 10:59:19,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:59:19,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-28 10:59:20,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-28 10:59:21,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-28 10:59:22,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-28 10:59:22,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:22,402 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-28 10:59:22,420 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-28 10:59:22,482 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-28 10:59:22,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-28 10:59:22,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:23,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-28 10:59:23,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:59:23,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:23,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-28 10:59:24,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 10:59:24,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-28 10:59:24,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:59:25,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:59:25,428 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:59:25,460 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:25,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:59:25,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 10:59:25,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-28 10:59:25,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:59:26,247 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:26,249 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-28 10:59:26,569 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:59:26,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:26,655 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:59:26,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:27,198 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 10:59:27,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:59:27,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:27,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 10:59:27,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-28 10:59:27,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 10:59:28,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:28,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:29,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:59:29,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:29,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:29,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:59:29,848 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 10:59:29,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 10:59:29,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:59:29,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:30,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:30,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-28 10:59:30,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:30,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-28 10:59:30,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-28 10:59:30,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 10:59:30,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:31,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:31,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:31,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:32,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 10:59:32,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:59:32,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:59:32,752 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-28 10:59:32,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:33,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:59:33,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:59:33,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:59:34,524 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:34,639 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:34,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:35,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:36,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:59:36,152 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:36,257 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-28 10:59:36,265 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 10:59:36,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:36,570 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-28 10:59:36,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:59:37,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:59:37,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:59:37,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:37,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:59:37,975 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:38,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-28 10:59:38,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:59:38,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:39,060 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-28 10:59:39,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 10:59:39,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:59:39,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:39,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-28 10:59:39,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:40,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:40,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:40,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-28 10:59:40,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 10:59:40,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-28 10:59:40,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:59:41,030 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:59:41,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-28 10:59:41,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:41,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:59:41,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:59:41,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-28 10:59:42,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-28 10:59:42,463 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:59:42,474 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:43,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:43,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:59:43,294 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:59:43,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:43,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:59:43,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:43,806 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:59:43,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:43,997 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:44,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:59:44,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-28 10:59:45,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 10:59:45,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:45,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:45,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:59:46,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:46,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:59:46,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:46,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 10:59:46,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 10:59:46,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:46,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:47,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:59:47,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:47,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:59:48,821 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:59:48,891 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:49,006 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:59:49,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-28 10:59:49,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:49,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:49,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-28 10:59:50,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:59:50,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:50,958 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-28 10:59:51,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:51,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-28 10:59:51,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:59:51,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:52,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:52,230 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-28 10:59:52,316 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:52,499 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:52,661 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:59:52,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 10:59:53,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:53,352 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:59:53,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-28 10:59:53,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:54,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:59:55,565 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:59:55,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-28 10:59:56,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:59:56,505 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-28 10:59:56,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:59:57,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-28 10:59:57,516 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-28 10:59:57,516 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:59:57,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:57,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 10:59:57,857 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:59:57,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-28 10:59:57,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-28 10:59:58,178 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:59:58,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:59:58,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:59:58,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:59:59,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:59,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-28 10:59:59,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 10:59:59,330 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-28 10:59:59,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-28 10:59:59,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:59,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:59:59,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-28 10:59:59,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 11:00:00,318 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-28 11:00:00,320 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:00:00,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:00,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:01,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:00:01,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-28 11:00:01,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:01,382 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 11:00:02,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-28 11:00:02,345 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:02,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-28 11:00:02,409 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-28 11:00:02,479 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-28 11:00:02,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:00:03,050 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:00:03,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:00:03,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:00:03,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:03,997 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:04,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-28 11:00:04,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:04,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:04,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:04,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-28 11:00:04,452 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-28 11:00:04,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-28 11:00:05,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:00:05,395 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:00:05,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-28 11:00:06,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:06,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:00:06,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:06,534 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:00:06,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-28 11:00:06,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:00:06,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:06,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:00:07,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:00:07,099 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:07,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-28 11:00:07,582 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-28 11:00:07,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:07,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:07,930 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:00:07,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:00:08,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:00:09,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 11:00:09,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:09,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:10,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:00:10,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:10,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:00:10,504 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:10,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:00:10,600 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:00:11,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:11,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-28 11:00:11,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:00:11,865 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:00:12,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:12,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:12,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:12,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:00:12,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:12,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:00:12,405 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:12,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-28 11:00:12,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:00:12,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:12,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:00:13,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:00:13,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:13,596 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:13,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:00:13,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:00:13,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-28 11:00:13,896 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:00:14,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:14,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:14,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:14,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:00:14,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:14,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:14,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-28 11:00:15,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-28 11:00:15,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:00:15,271 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-28 11:00:15,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:16,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:00:16,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-28 11:00:16,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:00:16,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-28 11:00:16,245 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-28 11:00:16,245 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-28 11:00:16,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-28 11:00:16,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:16,677 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:16,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:00:16,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:16,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 11:00:17,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:17,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:18,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:00:18,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-28 11:00:18,749 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:00:19,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:19,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:00:19,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:19,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:00:20,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:20,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:00:20,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-28 11:00:20,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-28 11:00:20,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:00:21,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-28 11:00:22,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:22,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:00:22,872 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:00:23,370 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:00:23,404 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-28 11:00:23,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:00:24,039 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:24,128 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-28 11:00:24,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:00:24,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:24,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:25,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:25,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-28 11:00:25,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:25,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-28 11:00:26,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:26,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:00:26,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:26,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:00:26,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:26,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:26,524 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:26,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-28 11:00:26,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:00:27,035 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 11:00:27,420 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 11:00:27,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:27,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:00:28,225 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-28 11:00:28,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:00:28,641 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-28 11:00:28,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-28 11:00:28,943 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-28 11:00:29,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:29,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-28 11:00:29,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:30,471 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-28 11:00:30,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:31,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:00:31,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:00:32,147 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:00:32,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:00:32,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:00:32,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:00:33,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-28 11:00:33,231 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:33,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:00:33,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-28 11:00:33,807 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:33,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:00:34,192 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:00:35,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:35,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 11:00:35,340 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-28 11:00:35,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-28 11:00:35,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:00:35,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:00:36,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:36,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:36,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:36,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:37,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:37,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:38,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:00:38,807 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-28 11:00:39,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:00:39,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:00:39,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:00:40,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 11:00:40,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:00:40,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-28 11:00:40,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:00:40,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:00:40,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-28 11:00:41,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:00:41,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:00:42,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:00:42,158 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:00:42,438 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-28 11:00:42,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:43,534 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:00:43,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:00:43,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:43,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:43,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-28 11:00:43,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:00:44,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:00:44,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:00:45,254 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:00:45,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:00:46,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:46,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:00:46,636 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:46,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:00:47,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:00:47,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:00:47,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:00:47,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:00:47,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-28 11:00:48,409 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 11:00:48,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:48,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:48,559 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:48,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:48,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 11:00:48,769 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:00:48,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-28 11:00:48,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:00:48,868 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:48,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-28 11:00:49,735 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:50,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:50,689 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:50,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:00:50,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:00:51,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:00:51,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:00:51,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:00:51,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-28 11:00:52,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:00:52,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-28 11:00:53,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-28 11:00:53,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:54,066 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:54,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:54,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:00:54,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:00:54,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-28 11:00:54,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:00:55,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-28 11:00:55,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:00:55,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:00:55,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:00:56,235 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:00:56,401 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-28 11:00:56,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:00:56,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:00:56,809 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:56,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:57,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:00:57,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-28 11:00:57,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:00:58,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:58,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:58,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-28 11:00:59,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:00:59,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-28 11:00:59,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:00:59,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-28 11:01:00,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-28 11:01:00,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:00,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:01:00,509 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-28 11:01:00,539 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-28 11:01:00,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-28 11:01:01,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:01:01,726 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:01:02,187 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:01:02,363 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:01:02,458 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-28 11:01:02,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-28 11:01:03,390 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:01:03,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:03,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-28 11:01:03,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:01:03,873 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:03,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-28 11:01:05,283 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:05,560 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-28 11:01:06,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:01:06,643 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-28 11:01:06,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:01:07,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:07,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:01:07,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-28 11:01:07,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:01:08,721 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:08,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:09,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:09,495 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:01:09,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:01:09,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:09,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:09,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:09,658 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:01:10,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:01:10,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:01:10,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-28 11:01:10,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-28 11:01:10,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:01:10,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:10,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-28 11:01:11,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-28 11:01:11,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-28 11:01:11,022 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-28 11:01:11,816 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-28 11:01:11,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:01:12,128 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:12,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:01:12,296 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-28 11:01:12,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:12,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:01:12,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:01:12,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:01:13,488 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:13,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:13,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-28 11:01:14,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:14,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:14,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:01:14,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:01:14,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:01:14,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-28 11:01:15,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:15,622 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:01:15,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:01:16,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:01:16,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:01:16,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:01:16,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:16,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-28 11:01:16,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:01:17,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:18,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:18,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:01:18,621 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:01:18,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:18,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:01:18,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-28 11:01:19,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:01:19,386 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:01:19,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:01:19,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:01:20,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:01:20,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-28 11:01:20,526 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:01:20,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:20,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-28 11:01:20,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:20,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:01:21,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:01:21,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:21,726 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:01:22,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-28 11:01:22,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:01:23,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:24,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:01:24,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:01:25,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:25,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-28 11:01:25,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:01:26,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:26,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:01:26,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:01:26,251 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-28 11:01:26,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:26,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:01:26,604 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-28 11:01:26,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:26,733 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-28 11:01:26,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:27,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:27,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:01:27,802 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 11:01:27,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-28 11:01:27,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:01:28,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:28,709 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:29,160 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:29,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:01:30,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-28 11:01:30,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-28 11:01:30,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:30,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-28 11:01:30,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:01:30,788 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:01:30,947 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-28 11:01:30,948 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-28 11:01:30,956 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-28 11:01:32,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:32,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-28 11:01:32,269 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-28 11:01:32,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:01:32,527 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-28 11:01:32,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-28 11:01:33,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:01:33,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:01:33,572 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:01:33,866 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:01:34,010 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-28 11:01:34,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:01:34,585 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-28 11:01:34,783 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:01:35,056 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:35,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:35,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 11:01:35,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:01:35,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:01:35,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:35,925 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:01:35,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-28 11:01:35,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-28 11:01:36,004 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:36,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-28 11:01:37,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:01:38,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:38,868 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:38,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:01:39,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:01:39,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:40,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:01:40,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:01:40,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:01:40,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-28 11:01:40,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:01:40,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:40,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:41,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:01:41,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-28 11:01:41,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:01:41,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:41,553 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:01:41,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:01:41,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:42,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:42,965 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:43,316 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-28 11:01:43,684 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-28 11:01:43,720 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:01:43,769 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-28 11:01:43,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-28 11:01:43,932 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-28 11:01:44,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:44,324 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-28 11:01:44,512 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-28 11:01:44,697 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-28 11:01:45,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:01:45,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-28 11:01:45,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-28 11:01:46,189 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:01:46,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-28 11:01:46,523 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-28 11:01:46,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-28 11:01:47,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:47,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:47,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:01:47,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-28 11:01:47,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:01:48,490 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-28 11:01:49,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:49,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:49,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-28 11:01:49,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:49,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:01:49,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-28 11:01:49,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:01:49,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:50,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:01:50,722 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-28 11:01:50,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:01:50,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:01:51,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:52,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:01:52,212 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-28 11:01:52,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:52,601 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:52,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:01:53,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-28 11:01:53,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:53,681 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:01:54,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-28 11:01:54,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:54,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:01:54,463 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-28 11:01:54,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:01:54,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:55,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:01:55,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:01:55,497 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:01:55,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-28 11:01:55,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:01:55,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:01:56,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-28 11:01:56,261 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-28 11:01:56,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:56,877 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-28 11:01:56,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:01:56,989 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-28 11:01:57,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:01:57,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:01:57,570 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:01:57,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:58,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-28 11:01:58,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-28 11:01:59,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:01:59,696 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-28 11:01:59,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:01:59,877 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:59,928 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:01:59,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:00,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:00,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:00,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:00,666 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:00,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:02:00,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:02:01,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:01,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:02:01,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:01,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:01,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:02:01,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:02,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:02:02,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:02,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-28 11:02:02,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:02,829 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:02,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:03,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:03,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:02:03,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:03,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:02:03,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-28 11:02:03,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:02:04,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 11:02:04,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:04,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:04,588 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:04,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:02:04,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:04,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:02:04,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-28 11:02:04,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-28 11:02:04,965 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:02:05,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:02:05,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:02:06,081 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:06,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:02:06,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-28 11:02:06,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:02:07,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:02:07,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:08,055 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:02:08,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:02:08,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:02:08,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:02:08,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:08,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:08,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:02:08,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:02:09,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:09,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:02:10,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:02:10,625 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:11,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:02:11,097 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:11,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:11,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:11,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:12,497 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:12,682 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:12,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:02:13,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:02:13,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:13,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:13,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-28 11:02:13,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:14,031 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:02:14,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-28 11:02:14,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-28 11:02:14,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:14,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:14,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:15,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:15,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:02:15,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:15,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:15,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 11:02:15,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:02:15,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:15,656 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-28 11:02:15,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:02:15,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:16,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-28 11:02:16,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:16,767 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:16,879 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:02:16,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-28 11:02:17,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:17,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:02:17,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:17,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:17,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:02:17,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:02:17,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:02:18,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:18,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:02:19,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:19,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:02:19,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:20,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:20,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:02:20,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:02:21,327 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:02:21,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:02:21,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-28 11:02:21,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:22,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-28 11:02:22,830 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:02:23,109 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:02:23,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-28 11:02:23,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:23,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:02:23,617 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-28 11:02:23,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:02:24,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-28 11:02:24,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:24,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:02:24,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-28 11:02:24,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:24,351 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:24,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:24,636 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-28 11:02:24,637 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-28 11:02:25,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:25,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:02:25,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:26,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:26,652 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-28 11:02:26,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-28 11:02:27,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-28 11:02:27,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:27,348 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:02:27,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:27,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:02:27,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:27,920 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-28 11:02:28,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:29,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:02:29,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:29,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:29,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:29,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:30,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:02:30,381 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-28 11:02:30,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:02:30,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:02:30,609 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:02:31,150 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:02:31,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:31,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:02:32,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:02:32,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:32,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 11:02:32,696 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 11:02:32,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:02:32,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:32,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-28 11:02:33,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:33,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:33,229 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:33,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-28 11:02:33,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:33,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:02:33,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:02:33,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-28 11:02:34,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:02:34,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:02:34,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:02:35,000 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:35,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:02:35,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:35,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:02:35,948 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:36,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:36,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:02:36,346 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-28 11:02:37,082 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-28 11:02:37,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:37,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-28 11:02:37,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:37,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-28 11:02:37,852 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-28 11:02:38,057 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:40,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:40,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:02:40,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:02:40,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:02:40,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:02:40,623 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:02:40,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:02:40,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-28 11:02:41,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:02:41,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:41,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:41,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:41,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:41,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:41,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:42,233 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:02:42,472 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:42,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:42,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:42,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:02:43,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:43,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-28 11:02:43,757 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-28 11:02:43,976 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:02:44,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:44,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-28 11:02:44,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:02:44,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:44,359 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:44,412 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:02:44,413 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-28 11:02:44,477 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-28 11:02:44,482 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:02:44,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:45,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-28 11:02:45,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:45,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:46,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-28 11:02:46,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:46,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-28 11:02:46,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-28 11:02:46,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:02:46,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:02:47,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:47,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:47,807 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:02:47,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:48,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:02:48,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-28 11:02:48,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:02:48,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:48,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-28 11:02:49,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-28 11:02:49,397 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:49,401 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-28 11:02:49,435 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:02:49,743 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:49,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-28 11:02:50,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:50,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:50,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:51,015 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:51,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-28 11:02:51,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-28 11:02:51,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:02:51,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:02:52,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-28 11:02:52,881 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:02:53,486 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:54,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:02:54,548 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:02:54,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-28 11:02:55,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:55,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-28 11:02:55,216 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:55,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:02:56,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:56,257 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-28 11:02:56,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:02:56,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:56,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:56,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:02:56,938 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-28 11:02:57,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-28 11:02:57,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:02:57,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:02:57,872 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:02:58,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:02:58,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:58,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:58,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:58,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:02:59,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:59,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:59,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:03:00,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-28 11:03:00,642 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-28 11:03:00,680 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-28 11:03:00,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:03:00,995 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-28 11:03:01,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-28 11:03:01,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:03:01,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:03:01,291 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-28 11:03:01,297 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-28 11:03:01,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-28 11:03:01,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:03:01,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:03:02,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:02,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:03:02,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:02,435 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-28 11:03:02,494 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:02,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-28 11:03:03,236 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:03,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:03:03,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-28 11:03:03,524 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:03,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-28 11:03:04,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:03:04,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:04,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:03:04,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:04,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 11:03:04,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:03:04,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:04,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:03:04,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:03:05,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:03:05,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:03:05,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:05,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-28 11:03:06,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:06,466 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:03:06,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:03:06,954 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-28 11:03:07,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-28 11:03:07,296 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:03:07,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:03:07,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-28 11:03:07,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:08,322 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:03:09,464 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:03:10,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-28 11:03:10,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:03:10,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:03:10,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:10,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:03:11,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:11,050 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-28 11:03:11,306 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-28 11:03:11,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:11,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 11:03:11,846 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:03:11,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:03:12,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:03:12,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:13,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:03:13,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:13,138 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:03:13,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:03:13,685 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-28 11:03:13,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:03:13,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:13,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:03:14,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:03:14,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:03:14,624 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-28 11:03:14,692 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-28 11:03:14,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:14,794 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-28 11:03:14,888 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:03:15,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-28 11:03:15,445 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:03:15,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 11:03:15,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-28 11:03:15,761 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-28 11:03:15,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 11:03:15,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:03:16,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:03:16,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-28 11:03:16,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:16,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:16,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-28 11:03:16,740 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:03:16,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:03:17,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:03:17,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:17,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-28 11:03:18,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-28 11:03:18,661 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-28 11:03:18,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:19,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:03:20,205 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:20,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:03:20,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:20,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:03:20,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:03:20,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:20,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:03:20,855 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:21,017 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:21,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:21,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:21,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-28 11:03:21,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:21,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:03:22,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:03:22,234 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:03:22,357 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:22,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:03:23,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:23,120 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:03:23,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:23,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:23,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:24,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:03:24,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:03:24,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:03:24,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-28 11:03:24,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:03:24,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:24,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-28 11:03:25,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:03:26,322 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:03:26,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:03:26,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:03:27,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-28 11:03:27,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-28 11:03:27,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-28 11:03:27,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:03:28,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:28,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:28,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-28 11:03:28,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:29,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-28 11:03:30,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 11:03:30,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:30,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:03:30,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:30,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-28 11:03:30,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:30,621 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-28 11:03:30,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:03:30,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:31,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-28 11:03:31,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:03:31,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:03:31,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-28 11:03:31,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-28 11:03:31,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:32,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:33,008 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:33,038 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:33,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:03:33,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:03:33,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:03:33,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:03:33,595 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:33,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:33,618 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 11:03:34,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:03:34,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-28 11:03:34,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:03:34,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-28 11:03:34,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:34,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:34,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-28 11:03:36,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-28 11:03:36,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:36,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:36,918 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:36,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:03:37,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-28 11:03:37,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:37,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:03:37,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-28 11:03:37,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:37,953 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-28 11:03:38,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-28 11:03:38,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:03:38,533 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-28 11:03:38,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-28 11:03:38,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-28 11:03:38,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-28 11:03:38,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-28 11:03:39,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:39,564 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:39,724 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:39,947 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-28 11:03:40,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:40,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:40,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:40,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:03:41,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-28 11:03:41,136 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:03:41,423 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:03:41,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:41,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-28 11:03:41,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-28 11:03:42,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:03:42,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:03:42,102 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 11:03:42,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:42,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:03:42,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:03:42,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:03:42,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-28 11:03:42,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:03:42,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:42,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:03:42,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:42,981 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-28 11:03:43,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:03:43,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-28 11:03:43,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:43,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-28 11:03:43,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-28 11:03:43,914 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:03:43,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:03:44,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-28 11:03:44,233 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 11:03:44,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:44,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:03:44,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:44,907 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:03:46,107 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:03:46,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:46,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-28 11:03:47,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:47,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-28 11:03:47,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:47,868 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:47,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-28 11:03:48,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:03:48,721 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:03:49,137 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:50,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:03:51,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-28 11:03:51,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:51,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-28 11:03:52,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 11:03:53,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:03:53,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:03:53,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:03:53,846 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-28 11:03:54,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-28 11:03:54,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-28 11:03:54,721 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-28 11:03:54,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:03:55,718 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:55,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:03:55,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:55,917 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-28 11:03:55,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 11:03:56,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:56,640 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-28 11:03:56,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-28 11:03:56,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-28 11:03:57,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-28 11:03:57,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:03:57,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:03:57,594 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-28 11:03:57,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:03:57,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:57,831 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-28 11:03:58,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:03:58,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:00,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:00,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-28 11:04:00,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:00,533 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:00,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:04:00,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:04:00,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:04:01,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:01,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:04:01,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:01,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:01,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:01,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:01,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:01,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:04:02,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:02,146 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:02,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:02,393 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:04:02,425 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:04:03,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-28 11:04:03,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:03,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:04:03,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:03,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:04:04,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:04:05,097 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:04:05,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:05,135 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-28 11:04:05,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:04:05,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 11:04:05,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:05,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-28 11:04:05,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-28 11:04:05,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:04:05,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:05,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:06,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-28 11:04:06,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:07,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:07,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:07,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-28 11:04:07,355 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:07,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:04:07,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-28 11:04:08,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:04:08,061 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-28 11:04:08,304 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-28 11:04:08,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-28 11:04:08,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:08,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:09,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:04:09,109 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:09,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 11:04:09,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:04:09,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:09,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:10,463 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-28 11:04:10,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:10,624 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:10,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:10,882 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-28 11:04:11,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:11,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:04:11,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:04:11,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:11,302 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-28 11:04:11,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:11,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:04:12,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:12,077 WARNING [train.py:1197] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-28 11:04:12,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-28 11:04:12,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:12,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:04:12,703 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-28 11:04:13,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-28 11:04:13,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:04:13,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-28 11:04:14,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:04:14,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:04:14,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:04:14,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:15,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:15,007 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:15,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:04:15,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:04:15,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:15,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:04:15,641 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-28 11:04:15,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-28 11:04:16,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-28 11:04:16,131 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:16,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:04:16,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:16,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:04:16,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:04:16,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:16,682 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:04:16,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:17,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:04:17,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-28 11:04:17,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:17,564 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:17,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:04:17,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:04:18,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:18,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:04:18,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:18,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:04:18,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:18,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:04:19,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:19,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:04:20,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:20,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:04:20,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-28 11:04:20,819 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-28 11:04:20,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:21,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-28 11:04:21,304 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-28 11:04:21,470 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:04:21,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:04:21,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:21,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-28 11:04:21,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:21,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:04:22,116 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:22,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:22,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:22,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:22,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:23,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:23,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:23,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:23,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:24,177 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:24,191 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:24,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:24,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-28 11:04:24,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 11:04:24,824 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-28 11:04:24,883 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:04:25,043 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-28 11:04:25,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:25,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:26,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:26,429 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-28 11:04:26,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:04:27,221 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:04:27,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:04:28,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:28,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-28 11:04:28,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:04:28,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:04:28,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:28,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-28 11:04:28,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:28,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-28 11:04:29,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:29,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:29,255 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:04:29,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:04:29,455 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-28 11:04:30,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-28 11:04:30,048 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-28 11:04:30,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:30,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:30,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:04:30,726 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:30,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:04:31,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:31,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-28 11:04:32,344 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:04:32,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:32,643 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:32,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:04:34,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:04:34,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-28 11:04:35,415 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:35,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:35,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-28 11:04:35,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:35,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:35,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:35,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:04:36,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:36,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:04:36,709 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:04:37,383 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:37,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-28 11:04:38,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:04:38,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-28 11:04:39,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-28 11:04:39,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:39,491 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:04:39,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-28 11:04:39,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:40,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:04:41,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:04:41,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:41,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:04:41,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:41,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:42,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-28 11:04:43,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-28 11:04:43,339 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 11:04:43,398 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:04:43,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:44,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-28 11:04:44,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:04:44,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:44,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:45,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:04:45,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:04:45,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-28 11:04:45,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:45,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:45,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:46,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-28 11:04:46,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:04:47,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:04:48,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:48,523 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:49,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:49,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:49,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:04:49,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:49,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:50,230 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:04:50,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-28 11:04:50,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:04:50,996 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-28 11:04:51,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:51,343 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-28 11:04:51,959 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:04:52,099 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:52,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:04:52,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:04:52,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-28 11:04:52,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:52,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:04:52,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-28 11:04:52,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:52,897 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:04:53,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:53,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:54,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-28 11:04:54,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:04:54,728 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:54,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:04:55,051 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:55,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:04:55,329 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:55,567 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-28 11:04:55,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-28 11:04:55,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-28 11:04:55,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:56,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:56,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:04:56,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:04:56,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 11:04:56,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:04:57,383 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:04:57,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-28 11:04:57,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-28 11:04:57,555 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:04:57,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:58,032 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:04:58,243 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:58,603 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-28 11:04:58,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:04:58,907 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:04:59,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-28 11:04:59,274 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-28 11:04:59,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:59,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:59,752 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:59,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:00,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:05:01,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:02,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 11:05:02,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:05:02,672 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:05:02,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:02,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:05:03,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:05:03,440 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:05:03,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:05:03,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:03,587 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-28 11:05:03,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:05:04,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:05:04,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:05:04,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:05:04,711 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:04,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:05:04,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-28 11:05:04,875 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:05,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:05,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 11:05:05,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:05,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:05:06,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:06,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-28 11:05:06,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:05:06,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-28 11:05:06,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:05:06,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:05:06,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:05:08,040 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-28 11:05:08,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:08,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:09,037 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-28 11:05:09,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:09,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:09,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-28 11:05:10,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-28 11:05:10,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:05:10,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:05:10,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:11,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:11,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:05:11,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:11,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:12,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:05:12,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:05:12,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:12,235 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-28 11:05:12,681 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:05:12,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:13,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:05:13,552 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:13,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:05:13,775 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:13,804 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-28 11:05:13,908 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:05:14,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:14,947 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:05:15,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:15,388 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:15,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:05:15,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-28 11:05:16,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:05:16,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:05:16,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-28 11:05:17,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:05:18,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:18,364 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:19,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:05:19,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:05:19,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-28 11:05:19,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-28 11:05:19,498 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-28 11:05:19,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:05:19,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:05:19,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-28 11:05:20,084 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:20,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:05:20,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:05:20,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-28 11:05:20,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-28 11:05:20,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:20,913 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-28 11:05:22,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-28 11:05:22,460 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:05:22,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-28 11:05:23,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-28 11:05:23,621 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:23,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:05:23,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:05:24,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:05:24,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:05:24,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-28 11:05:24,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:05:24,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:24,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-28 11:05:24,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 11:05:24,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:24,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:25,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:05:25,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-28 11:05:25,469 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-28 11:05:25,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:25,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-28 11:05:26,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:05:26,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:05:26,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:05:26,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:26,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:05:27,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:05:27,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:05:27,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:05:28,496 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:28,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:28,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:05:28,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:05:29,051 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:29,157 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:29,899 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-28 11:05:30,310 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:30,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:30,482 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:05:30,568 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:05:30,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:05:30,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:31,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-28 11:05:31,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:05:31,425 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 11:05:31,692 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:31,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:05:32,080 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:32,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-28 11:05:32,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:05:32,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:05:32,336 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:05:32,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:05:33,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:33,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:33,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:05:33,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:33,473 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 11:05:33,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:33,920 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-28 11:05:35,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:05:35,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 11:05:35,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:05:35,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-28 11:05:35,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:05:36,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:36,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-28 11:05:36,486 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:36,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:37,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:37,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:05:37,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 11:05:37,813 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:05:37,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-28 11:05:37,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:37,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-28 11:05:38,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:05:38,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:05:38,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:39,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-28 11:05:39,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:05:39,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:05:39,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:05:39,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:39,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:05:39,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-28 11:05:40,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-28 11:05:40,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:05:40,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:05:40,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:05:40,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:05:40,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:41,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:05:41,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:05:42,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-28 11:05:42,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 11:05:42,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:05:42,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-28 11:05:42,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:05:42,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:05:43,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:05:43,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:05:43,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:44,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:05:44,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:05:44,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:44,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:45,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-28 11:05:45,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:05:45,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:45,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:45,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-28 11:05:46,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-28 11:05:46,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:46,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:05:46,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:47,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:05:47,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-28 11:05:48,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-28 11:05:48,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:05:49,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:05:49,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:05:49,874 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:05:50,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 11:05:50,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:05:50,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:05:51,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:05:51,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:05:52,220 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:05:52,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:52,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 11:05:52,957 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-28 11:05:53,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:05:53,339 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:53,511 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-28 11:05:53,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 11:05:53,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:53,877 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:05:54,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:05:54,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:05:54,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:54,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-28 11:05:55,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-28 11:05:55,496 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:55,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:56,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:05:56,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:05:56,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-28 11:05:56,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:56,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:56,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:57,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 11:05:57,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-28 11:05:57,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:05:58,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:58,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:05:59,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-28 11:05:59,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-28 11:05:59,593 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:59,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:05:59,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:00,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-28 11:06:00,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-28 11:06:00,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-28 11:06:00,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:00,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:06:02,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:06:02,188 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:06:02,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:06:02,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-28 11:06:03,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:06:03,181 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:03,606 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:06:03,902 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:04,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-28 11:06:04,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-28 11:06:05,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:05,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:05,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:05,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:06:06,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:06,218 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:06:06,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:06,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:06:06,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:06,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:06,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:06,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:06:07,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-28 11:06:07,081 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-28 11:06:07,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:07,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:07,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:07,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:07,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-28 11:06:08,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-28 11:06:08,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:09,102 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-28 11:06:09,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-28 11:06:10,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:10,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:10,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:10,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-28 11:06:11,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-28 11:06:11,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:11,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:11,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:06:11,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:06:11,894 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:12,018 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:12,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:06:12,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-28 11:06:12,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:12,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-28 11:06:12,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:12,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:12,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:06:12,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:12,890 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:06:12,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:12,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:13,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:13,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-28 11:06:13,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:13,604 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:13,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 11:06:13,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:06:13,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:14,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 11:06:14,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:14,543 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:06:15,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-28 11:06:15,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:15,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-28 11:06:15,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:15,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-28 11:06:16,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-28 11:06:16,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:06:16,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:16,738 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:06:16,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:06:17,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:06:17,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:06:17,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:06:17,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:18,113 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:06:18,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:19,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:06:19,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:06:20,062 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:06:21,237 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:21,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:06:22,208 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-28 11:06:22,280 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-28 11:06:22,309 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:06:22,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-28 11:06:22,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:22,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-28 11:06:23,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:23,848 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-28 11:06:23,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:06:24,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:06:24,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:25,096 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-28 11:06:25,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:06:25,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-28 11:06:25,335 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-28 11:06:25,381 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:25,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:25,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:06:25,794 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:26,100 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-28 11:06:26,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:26,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:06:26,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:06:26,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:06:26,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:06:28,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:28,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:06:29,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-28 11:06:30,052 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-28 11:06:30,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-28 11:06:30,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:06:30,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:06:31,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:06:31,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:06:31,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:31,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:06:31,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-28 11:06:32,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:06:32,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:06:32,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-28 11:06:33,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:34,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:35,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:06:35,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:36,018 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:36,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-28 11:06:36,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:06:36,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-28 11:06:36,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:06:36,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-28 11:06:36,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:06:36,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:06:36,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:36,991 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:06:37,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:37,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:06:37,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:06:37,471 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-28 11:06:37,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:06:37,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:38,035 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-28 11:06:38,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:06:38,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:39,012 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-28 11:06:39,072 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:39,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:06:39,475 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-28 11:06:39,631 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:06:39,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-28 11:06:39,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:39,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:40,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:06:40,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:06:40,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:06:40,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:40,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-28 11:06:40,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:40,841 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-28 11:06:41,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:06:41,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 11:06:42,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:06:42,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:06:42,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:43,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:43,654 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:06:43,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-28 11:06:44,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-28 11:06:44,283 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:06:44,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:06:44,499 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:06:44,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:06:44,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:06:45,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:06:45,576 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:06:45,686 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 11:06:45,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:06:46,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:46,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:46,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:06:47,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 11:06:47,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-28 11:06:47,341 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-28 11:06:47,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:06:48,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-28 11:06:48,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:49,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:06:50,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:50,141 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:50,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:06:50,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:06:50,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-28 11:06:50,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:06:51,152 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:51,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-28 11:06:51,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:52,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-28 11:06:52,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:52,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:06:53,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-28 11:06:53,268 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-28 11:06:53,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:06:53,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:06:53,728 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:53,772 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:06:54,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-28 11:06:54,845 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-28 11:06:55,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-28 11:06:55,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-28 11:06:55,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:06:55,366 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:55,417 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:55,458 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:06:55,588 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-28 11:06:56,593 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:56,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:06:56,796 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:06:56,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:06:57,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:06:57,558 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:57,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:57,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-28 11:06:57,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:57,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:06:57,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:06:57,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:57,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-28 11:06:58,286 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:06:58,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-28 11:06:58,627 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:58,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:06:58,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-28 11:06:59,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:06:59,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:06:59,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:06:59,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-28 11:06:59,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:06:59,832 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:07:00,171 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-28 11:07:00,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:00,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:07:00,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:01,356 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:07:01,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:07:01,859 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:07:03,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:03,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:03,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:07:04,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:07:04,601 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:07:04,750 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:07:04,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:04,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:07:05,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-28 11:07:05,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:07:05,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-28 11:07:05,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-28 11:07:05,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-28 11:07:05,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:07:06,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:07:06,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:06,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:06,914 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:07,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:07:07,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 11:07:07,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:07:07,476 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:07:08,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:08,542 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:07:08,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-28 11:07:08,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-28 11:07:08,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:07:09,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-28 11:07:09,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:07:09,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:10,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:07:10,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:10,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-28 11:07:11,117 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:07:11,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:07:11,473 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-28 11:07:11,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:07:11,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-28 11:07:11,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:07:12,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:12,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:07:12,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-28 11:07:12,444 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:12,448 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-28 11:07:12,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:07:12,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-28 11:07:13,023 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:07:13,039 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:07:13,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:07:13,222 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-28 11:07:13,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:13,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 11:07:13,421 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:13,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:07:13,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-28 11:07:13,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:07:14,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:07:14,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-28 11:07:14,847 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:07:15,104 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:07:15,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:07:15,354 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:07:15,357 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:07:15,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-28 11:07:16,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-28 11:07:16,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:17,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:07:17,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:17,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-28 11:07:17,991 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:07:18,040 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:18,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-28 11:07:18,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:07:18,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:18,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:07:18,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:07:18,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:07:18,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-28 11:07:18,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:19,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:07:19,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:07:19,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:07:20,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:20,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:07:20,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-28 11:07:20,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:20,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:07:21,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:07:21,542 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:07:22,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:07:23,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-28 11:07:23,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:24,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:07:24,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:24,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-28 11:07:25,074 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:07:25,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:07:25,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:07:26,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:26,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:07:26,722 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-28 11:07:26,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:27,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:27,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:07:28,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:28,136 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:28,429 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:07:28,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:07:28,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:28,777 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:28,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:29,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:07:29,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:30,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-28 11:07:30,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-28 11:07:30,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:30,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:07:30,746 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:30,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:07:30,893 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:30,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:31,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-28 11:07:31,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:07:32,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:32,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:32,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-28 11:07:32,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:07:32,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-28 11:07:32,758 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:07:32,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:07:33,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:33,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:07:33,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-28 11:07:33,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:07:33,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:07:34,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:34,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:34,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:07:35,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:35,305 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:35,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:07:35,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:07:35,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-28 11:07:35,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:07:37,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:37,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:07:37,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:07:38,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:38,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-28 11:07:38,806 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:07:39,106 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:07:39,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:07:39,173 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-28 11:07:39,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:07:39,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:07:40,164 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-28 11:07:40,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:07:40,233 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-28 11:07:40,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 11:07:40,717 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:07:41,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:07:41,102 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:07:41,270 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:07:41,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:07:41,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:07:41,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-28 11:07:41,759 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-28 11:07:42,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:42,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:42,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 11:07:42,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:42,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:07:42,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-28 11:07:42,585 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-28 11:07:42,704 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-28 11:07:42,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:07:42,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-28 11:07:42,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-28 11:07:44,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:44,187 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-28 11:07:44,319 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:07:44,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:44,575 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:44,841 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-28 11:07:44,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:07:44,980 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:45,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:45,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:07:45,222 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:07:45,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:07:45,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:45,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:46,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:07:46,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-28 11:07:46,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:07:47,381 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:07:47,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:48,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:07:48,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:48,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:07:49,100 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:49,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:07:49,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:50,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:07:50,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:07:50,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:07:51,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-28 11:07:51,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:07:51,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:52,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:52,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-28 11:07:53,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:53,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:07:53,952 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-28 11:07:54,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:07:54,247 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:07:54,377 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-28 11:07:54,472 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-28 11:07:54,480 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:07:54,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:07:54,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:07:54,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:07:54,836 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:07:54,892 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:55,235 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-28 11:07:55,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:55,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:55,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:55,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-28 11:07:55,558 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-28 11:07:55,564 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-28 11:07:55,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-28 11:07:55,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:56,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:07:56,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:56,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:07:56,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-28 11:07:56,711 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-28 11:07:56,722 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:57,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:57,854 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:58,120 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:58,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-28 11:07:58,429 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-28 11:07:58,500 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-28 11:07:58,541 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-28 11:07:58,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 11:07:58,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:59,024 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-28 11:07:59,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:59,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:07:59,424 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-28 11:07:59,819 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:59,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-28 11:07:59,913 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-28 11:08:00,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-28 11:08:00,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-28 11:08:00,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-28 11:08:00,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:00,614 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:00,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:00,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:01,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-28 11:08:01,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-28 11:08:01,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:01,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:08:01,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:01,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:01,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:01,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-28 11:08:01,884 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-28 11:08:02,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:03,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:03,459 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-28 11:08:04,497 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:08:04,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:05,132 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:08:05,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-28 11:08:05,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:08:05,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:08:05,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:08:05,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:08:05,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-28 11:08:06,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-28 11:08:06,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-28 11:08:06,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:06,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-28 11:08:06,609 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:08:06,992 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:08:07,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-28 11:08:07,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:08,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:08,073 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:08:08,881 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:08:08,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:09,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:09,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:08:09,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:08:10,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:10,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-28 11:08:10,196 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:08:10,339 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:10,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:10,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:08:11,281 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 11:08:11,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:11,981 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:12,159 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:08:12,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:12,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:12,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 11:08:13,064 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-28 11:08:13,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-28 11:08:13,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:08:13,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:08:13,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-28 11:08:13,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:08:14,442 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:14,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-28 11:08:14,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:14,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:14,583 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:08:14,604 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:08:14,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:15,248 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:08:15,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-28 11:08:15,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:08:15,795 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:15,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:16,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:16,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 11:08:16,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:08:16,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-28 11:08:17,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:08:18,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:08:18,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-28 11:08:18,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-28 11:08:18,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:19,330 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:19,430 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:08:19,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:08:19,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:19,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:19,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:21,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:21,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:21,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:21,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:08:21,933 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:08:22,682 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:08:23,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:08:23,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:08:24,681 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:08:24,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-28 11:08:24,851 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:24,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:25,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:25,275 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:25,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:25,636 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-28 11:08:25,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:08:25,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:26,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:08:26,278 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:08:26,603 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:26,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:08:26,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:08:27,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-28 11:08:27,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-28 11:08:27,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-28 11:08:27,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-28 11:08:28,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-28 11:08:28,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:08:28,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:28,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:29,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:29,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:30,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:30,280 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:08:30,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:08:30,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:30,604 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:30,680 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:08:31,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:32,085 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-28 11:08:32,200 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-28 11:08:32,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:08:32,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-28 11:08:32,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-28 11:08:33,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:33,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-28 11:08:33,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:08:34,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:34,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:34,261 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:08:34,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-28 11:08:34,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:35,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:35,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:35,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:08:35,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-28 11:08:35,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-28 11:08:35,581 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:08:35,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-28 11:08:35,997 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-28 11:08:36,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:08:36,201 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:36,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:36,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:36,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:08:36,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:08:36,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-28 11:08:36,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:36,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 11:08:37,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-28 11:08:37,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:08:37,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-28 11:08:37,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:08:37,554 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:08:38,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:38,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:38,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:08:38,753 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:38,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:08:39,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:39,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:39,829 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:08:39,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:08:39,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:40,128 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-28 11:08:40,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:08:40,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:08:40,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:41,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:42,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-28 11:08:42,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:42,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:42,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:08:42,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:43,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-28 11:08:43,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:08:43,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:44,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:44,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:08:45,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:08:45,855 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-28 11:08:45,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:08:46,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:46,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:46,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:47,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 11:08:47,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:47,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-28 11:08:47,246 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:47,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:47,540 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:47,653 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:47,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:47,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-28 11:08:47,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-28 11:08:48,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-28 11:08:48,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:48,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:48,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:48,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:49,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:08:50,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:50,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:50,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:08:50,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:50,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:50,383 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:50,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-28 11:08:51,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:08:52,144 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-28 11:08:52,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:52,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-28 11:08:52,356 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:08:52,461 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-28 11:08:52,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-28 11:08:52,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:52,604 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:08:53,041 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:08:53,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:53,239 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-28 11:08:53,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:53,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-28 11:08:54,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:54,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:08:54,341 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-28 11:08:54,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:54,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:08:55,367 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:55,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:56,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:56,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:56,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:08:56,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-28 11:08:56,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-28 11:08:57,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 11:08:57,028 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-28 11:08:57,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:58,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:08:58,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:58,165 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-28 11:08:58,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:58,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:08:58,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:08:59,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:08:59,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:08:59,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:08:59,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:00,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:00,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:09:00,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:09:00,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-28 11:09:00,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:09:00,606 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-28 11:09:01,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:01,750 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:01,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:02,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:02,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:09:02,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-28 11:09:02,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-28 11:09:03,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:03,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:09:03,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:03,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:04,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:09:04,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 11:09:05,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:06,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-28 11:09:06,659 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:09:06,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:07,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-28 11:09:07,573 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:08,090 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:09:08,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-28 11:09:08,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:08,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:08,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:08,816 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:09:08,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-28 11:09:08,977 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-28 11:09:09,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:09,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:09,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:09,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-28 11:09:09,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:10,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-28 11:09:10,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:09:10,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:11,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:11,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:09:11,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-28 11:09:11,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:11,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-28 11:09:12,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:09:12,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:12,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:13,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-28 11:09:14,007 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:09:14,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-28 11:09:14,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:14,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:14,787 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:14,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:14,932 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-28 11:09:14,936 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-28 11:09:15,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-28 11:09:15,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:16,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:16,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:09:16,511 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-28 11:09:16,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:16,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:09:17,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:09:17,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-28 11:09:17,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-28 11:09:17,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:17,605 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:09:17,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:17,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:09:17,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-28 11:09:18,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-28 11:09:18,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:09:18,832 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:18,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-28 11:09:18,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:19,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:19,960 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:09:20,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:20,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:09:20,577 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:09:20,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-28 11:09:20,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-28 11:09:20,900 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-28 11:09:21,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:09:21,270 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:09:21,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-28 11:09:22,090 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:22,155 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:09:22,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:09:22,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:22,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:23,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-28 11:09:23,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:23,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:09:23,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:09:23,998 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:24,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:24,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:09:24,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:24,469 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 11:09:24,489 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:09:24,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:24,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:25,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:09:25,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:09:25,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:09:25,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 11:09:25,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:26,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-28 11:09:26,107 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-28 11:09:26,868 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:26,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:09:27,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:09:27,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:27,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:09:27,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:09:27,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:27,820 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:09:28,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:28,407 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:28,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-28 11:09:28,979 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:29,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:29,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:09:29,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:29,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:09:29,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:09:29,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:29,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:09:29,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:09:30,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:09:30,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:30,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:30,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:30,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-28 11:09:31,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-28 11:09:31,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:31,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:31,218 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:31,220 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:31,584 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:32,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-28 11:09:32,679 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:33,897 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:09:34,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:09:34,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:34,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:34,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:09:34,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:09:34,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-28 11:09:35,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:35,612 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:09:35,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:09:35,759 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:09:35,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-28 11:09:36,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:09:36,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:36,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:36,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-28 11:09:36,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-28 11:09:36,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:09:37,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:09:37,942 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-28 11:09:38,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:38,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:09:38,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:09:38,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-28 11:09:38,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:38,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-28 11:09:38,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:38,857 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:38,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-28 11:09:40,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:41,344 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:09:41,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:41,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-28 11:09:41,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:09:42,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:42,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:42,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:09:43,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-28 11:09:43,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-28 11:09:44,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-28 11:09:44,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-28 11:09:44,534 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:09:44,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:44,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:44,640 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:44,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:09:44,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-28 11:09:45,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-28 11:09:45,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:09:45,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:09:45,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:09:45,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:09:45,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:46,126 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:46,149 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-28 11:09:47,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:09:47,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:47,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-28 11:09:47,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-28 11:09:47,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-28 11:09:47,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:09:48,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:09:48,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:09:48,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:48,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 11:09:48,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:09:48,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-28 11:09:48,729 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:48,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-28 11:09:49,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:49,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-28 11:09:49,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:09:49,748 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-28 11:09:49,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-28 11:09:50,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:09:50,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:09:51,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-28 11:09:51,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 11:09:51,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:09:51,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:51,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:51,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:09:51,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:09:51,858 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-28 11:09:52,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:09:52,409 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:52,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 11:09:52,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-28 11:09:52,879 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-28 11:09:52,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:09:53,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-28 11:09:53,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:53,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:09:53,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:09:53,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:54,110 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:54,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:09:54,396 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:09:54,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:54,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:54,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:55,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:09:55,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:56,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:09:56,308 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:56,375 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:56,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:09:56,731 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-28 11:09:56,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-28 11:09:57,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:57,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:09:57,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:09:57,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:09:57,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:57,727 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:09:57,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:58,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:09:58,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:09:58,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:58,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:58,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-28 11:09:58,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:09:59,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:09:59,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:10:00,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:10:00,497 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:10:00,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:00,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:10:00,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:10:00,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:01,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:10:01,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:01,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-28 11:10:01,824 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:02,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-28 11:10:02,507 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-28 11:10:03,322 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:03,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:10:03,549 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-28 11:10:03,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-28 11:10:03,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:04,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-28 11:10:04,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:10:04,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:10:04,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-28 11:10:04,473 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:04,574 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:10:04,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-28 11:10:04,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:04,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:10:04,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-28 11:10:05,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-28 11:10:05,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:10:05,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-28 11:10:05,240 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:10:05,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:05,408 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:10:05,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-28 11:10:05,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-28 11:10:05,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-28 11:10:05,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:05,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:05,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-28 11:10:06,011 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:10:06,222 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:06,356 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:07,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-28 11:10:07,396 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-28 11:10:07,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:10:07,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:10:08,162 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-28 11:10:08,560 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:10:08,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:08,730 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:09,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-28 11:10:09,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:09,256 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:10:09,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:09,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-28 11:10:09,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:10:09,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:10:10,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:10,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-28 11:10:11,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:12,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:10:12,637 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:12,649 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:10:12,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:12,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:10:12,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:10:12,873 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:14,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:14,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-28 11:10:14,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:10:14,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:14,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:10:14,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-28 11:10:14,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:14,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:15,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:10:15,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:10:15,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:10:16,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-28 11:10:16,643 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-28 11:10:16,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:10:16,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-28 11:10:16,946 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:10:17,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:17,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:17,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:10:17,269 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-28 11:10:17,406 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-28 11:10:17,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:17,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:10:18,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:18,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-28 11:10:18,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:18,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-28 11:10:19,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:10:19,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:10:19,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:10:19,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:10:19,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:10:19,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:10:19,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:19,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:19,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:10:19,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-28 11:10:20,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:10:20,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:10:21,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:21,440 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-28 11:10:21,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:10:21,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:21,720 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:21,774 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-28 11:10:22,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:10:22,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-28 11:10:22,157 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:22,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:10:22,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:22,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-28 11:10:22,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-28 11:10:23,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:23,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:23,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:10:23,627 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-28 11:10:23,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:10:24,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-28 11:10:24,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-28 11:10:24,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:24,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:10:24,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:25,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-28 11:10:25,038 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-28 11:10:25,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:10:25,705 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:10:26,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:26,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-28 11:10:26,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:26,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:26,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-28 11:10:27,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:27,981 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:27,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-28 11:10:28,250 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-28 11:10:28,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:28,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-28 11:10:28,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-28 11:10:28,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:29,830 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:30,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-28 11:10:30,184 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-28 11:10:30,206 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-28 11:10:30,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-28 11:10:30,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:30,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-28 11:10:31,232 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-28 11:10:31,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:10:31,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:10:31,935 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-28 11:10:32,194 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:10:32,268 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-28 11:10:32,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:10:32,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:10:32,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:10:33,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:10:33,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:10:33,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:33,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-28 11:10:33,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-28 11:10:33,411 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-28 11:10:33,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:10:33,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-28 11:10:34,248 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:34,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 11:10:34,539 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:34,713 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:35,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 11:10:35,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-28 11:10:35,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:35,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:10:35,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:10:35,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:35,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:35,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:10:35,874 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:10:35,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-28 11:10:36,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:10:36,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:10:36,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:10:36,710 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-28 11:10:36,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:10:37,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:10:37,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-28 11:10:38,186 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:38,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:39,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:39,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:39,597 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:10:39,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-28 11:10:40,998 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:41,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:10:41,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:10:41,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:41,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:10:41,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-28 11:10:42,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:42,601 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:10:42,872 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:42,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:10:42,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:10:43,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:10:43,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:10:43,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:43,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:10:43,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:10:44,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:44,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-28 11:10:44,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:10:44,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:45,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:45,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:10:45,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:45,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-28 11:10:45,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:10:45,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:46,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-28 11:10:46,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:10:46,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:10:46,679 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-28 11:10:47,352 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-28 11:10:47,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-28 11:10:47,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:47,801 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-28 11:10:47,811 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:48,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:48,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:10:48,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-28 11:10:48,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:10:48,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:48,940 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-28 11:10:48,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-28 11:10:49,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-28 11:10:49,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-28 11:10:49,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:10:50,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:10:50,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:50,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-28 11:10:50,722 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:50,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-28 11:10:50,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:50,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:10:51,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:10:51,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:10:51,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:51,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:51,969 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:52,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:52,714 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-28 11:10:52,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:10:52,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:53,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:53,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:10:53,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:10:54,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:10:54,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:54,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:54,798 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:10:55,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:55,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:10:56,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:10:56,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:10:56,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-28 11:10:56,471 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:56,630 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:11:28,546 INFO [train.py:1379] (2/4) Maximum memory allocated so far is 19599MB 2023-09-28 11:11:31,720 INFO [train.py:1379] (2/4) Maximum memory allocated so far is 19893MB 2023-09-28 11:11:36,040 INFO [train.py:1379] (2/4) Maximum memory allocated so far is 19893MB 2023-09-28 11:11:39,574 INFO [train.py:1379] (2/4) Maximum memory allocated so far is 19893MB 2023-09-28 11:11:51,709 INFO [train.py:1379] (2/4) Maximum memory allocated so far is 19893MB 2023-09-28 11:11:58,976 INFO [train.py:1379] (2/4) Maximum memory allocated so far is 19893MB 2023-09-28 11:12:16,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:12:16,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-28 11:12:16,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 11:12:16,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:12:17,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:12:17,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:12:17,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:12:17,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:12:17,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:12:17,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:12:17,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:12:18,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:12:18,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 11:12:18,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-28 11:12:18,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 11:12:18,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:12:18,605 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 11:12:18,731 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 11:12:18,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:12:19,829 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:12:19,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:12:20,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:12:20,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:12:20,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:12:20,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:12:20,711 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:12:20,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:12:20,884 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:12:20,890 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:12:20,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:12:20,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:12:21,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 11:12:21,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:12:22,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:12:22,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 11:12:22,427 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 11:12:22,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:12:22,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:12:22,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 11:12:22,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-28 11:12:22,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:12:24,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:12:24,350 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:12:24,505 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 11:12:24,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 11:12:24,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:12:24,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:12:24,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 11:12:25,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 11:12:25,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-28 11:12:25,407 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:12:30,489 INFO [train.py:1039] (2/4) Epoch 1, batch 0, loss[loss=9.359, simple_loss=8.501, pruned_loss=8.561, over 24528.00 frames. ], tot_loss[loss=9.359, simple_loss=8.501, pruned_loss=8.561, over 24528.00 frames. ], batch size: 63, lr: 2.25e-02, grad_scale: 1.0 2023-09-28 11:12:30,490 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-28 11:12:44,941 INFO [train.py:1071] (2/4) Epoch 1, validation: loss=9.318, simple_loss=8.466, pruned_loss=8.496, over 1125622.00 frames. 2023-09-28 11:12:44,943 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 19893MB 2023-09-28 11:12:47,870 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=5.53 vs. limit=7.5 2023-09-28 11:12:48,815 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=0.0, ans=0.2 2023-09-28 11:12:50,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-28 11:12:50,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:12:50,733 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=0.0, ans=0.3 2023-09-28 11:12:52,012 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:12:57,098 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=0.0, ans=0.5 2023-09-28 11:12:58,603 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:12:58,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:13:01,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:13:01,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-28 11:13:02,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-28 11:13:06,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:13:06,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:13:06,881 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=66.66666666666667, ans=0.496875 2023-09-28 11:13:10,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:13:11,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:13:11,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:13:11,815 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:13:13,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-28 11:13:16,999 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:13:17,696 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=374.44 vs. limit=7.55 2023-09-28 11:13:19,785 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=60.07 vs. limit=5.066666666666666 2023-09-28 11:13:21,359 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=133.33333333333334, ans=0.49375 2023-09-28 11:13:26,903 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:13:26,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:13:28,833 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-28 11:13:32,702 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=211.93 vs. limit=7.55 2023-09-28 11:13:33,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:13:33,683 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:13:36,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:13:43,066 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:13:47,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:13:48,088 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=105.59 vs. limit=5.05 2023-09-28 11:13:48,263 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=277.86 vs. limit=7.575 2023-09-28 11:13:49,850 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=373.46 vs. limit=7.65 2023-09-28 11:13:54,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-28 11:13:57,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-28 11:13:57,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:13:57,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:13:59,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:13:59,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:14:01,378 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=240.76 vs. limit=7.7 2023-09-28 11:14:02,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-28 11:14:03,267 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=407.27 vs. limit=7.6 2023-09-28 11:14:03,280 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=279.23 vs. limit=7.6 2023-09-28 11:14:04,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:14:04,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:14:04,798 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=11.42 vs. limit=4.1066666666666665 2023-09-28 11:14:05,034 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=86.58 vs. limit=5.133333333333334 2023-09-28 11:14:07,722 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:14:11,241 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-28 11:14:13,126 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=266.6666666666667, ans=0.5 2023-09-28 11:14:14,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:14:16,929 INFO [train.py:1039] (2/4) Epoch 1, batch 50, loss[loss=1.306, simple_loss=1.173, pruned_loss=1.208, over 23409.00 frames. ], tot_loss[loss=3.824, simple_loss=3.52, pruned_loss=2.989, over 1065802.06 frames. ], batch size: 285, lr: 2.48e-02, grad_scale: 0.25 2023-09-28 11:14:19,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:14:21,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:14:21,945 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=333.42 vs. limit=7.625 2023-09-28 11:14:22,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-28 11:14:22,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:14:22,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:14:23,585 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=7.17 vs. limit=4.133333333333334 2023-09-28 11:14:26,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:14:26,578 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:14:26,856 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=333.3333333333333, ans=0.09791666666666667 2023-09-28 11:14:31,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:14:32,798 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=116.10 vs. limit=7.625 2023-09-28 11:14:33,770 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=400.0, ans=0.0975 2023-09-28 11:14:35,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-28 11:14:35,775 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:14:44,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:14:44,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-28 11:14:46,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-28 11:14:49,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:14:51,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:14:51,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:14:51,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:14:53,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:14:53,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:14:53,657 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:14:54,448 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=483.98 vs. limit=7.675 2023-09-28 11:15:02,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:15:03,664 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=157.76 vs. limit=7.675 2023-09-28 11:15:03,955 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=242.10 vs. limit=7.675 2023-09-28 11:15:04,589 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:15:04,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:15:04,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-28 11:15:05,515 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=21.80 vs. limit=5.116666666666666 2023-09-28 11:15:06,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:15:08,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:15:08,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-28 11:15:09,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:15:10,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-28 11:15:10,403 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=533.3333333333334, ans=0.475 2023-09-28 11:15:14,544 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten.whitening_limit, batch_count=533.3333333333334, ans=7.7 2023-09-28 11:15:14,797 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=97.81 vs. limit=7.7 2023-09-28 11:15:19,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:15:19,788 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:15:21,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:15:23,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:15:23,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:15:25,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-28 11:15:25,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-28 11:15:27,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:15:27,108 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:15:32,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:15:32,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:15:34,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-28 11:15:34,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-28 11:15:34,758 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=600.0, ans=0.756 2023-09-28 11:15:34,882 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=600.0, ans=0.471875 2023-09-28 11:15:36,796 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-28 11:15:36,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:15:37,259 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=600.0, ans=0.471875 2023-09-28 11:15:38,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:15:40,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-28 11:15:40,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-28 11:15:40,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:15:42,121 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:15:42,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:15:43,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:15:47,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:15:49,205 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=383.20 vs. limit=7.75 2023-09-28 11:15:49,917 INFO [train.py:1039] (2/4) Epoch 1, batch 100, loss[loss=1.233, simple_loss=1.065, pruned_loss=1.343, over 24487.00 frames. ], tot_loss[loss=2.408, simple_loss=2.186, pruned_loss=2.052, over 1873462.00 frames. ], batch size: 66, lr: 2.70e-02, grad_scale: 0.5 2023-09-28 11:15:51,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:15:55,754 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 2.173e+02 3.855e+02 5.319e+03 2.503e+05, threshold=7.710e+02, percent-clipped=0.0 2023-09-28 11:15:55,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:15:56,659 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=35.94 vs. limit=7.75 2023-09-28 11:15:57,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-28 11:15:59,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:16:02,803 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:16:04,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:16:04,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:16:04,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:16:04,584 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:16:06,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-28 11:16:08,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:16:08,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:16:08,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:16:08,364 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:16:08,584 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=733.3333333333334, ans=0.8743333333333334 2023-09-28 11:16:11,985 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=733.3333333333334, ans=0.5 2023-09-28 11:16:13,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-28 11:16:16,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:16:18,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:16:18,159 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:16:19,020 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.71 vs. limit=3.11 2023-09-28 11:16:20,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:16:22,649 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=247.96 vs. limit=7.775 2023-09-28 11:16:24,727 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=245.89 vs. limit=7.775 2023-09-28 11:16:25,344 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-28 11:16:25,379 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-28 11:16:27,920 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:16:27,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:16:31,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:16:35,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:16:39,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:16:41,580 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=15.10 vs. limit=7.8 2023-09-28 11:16:43,215 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=51.00 vs. limit=7.8 2023-09-28 11:16:45,468 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=17.70 vs. limit=7.825 2023-09-28 11:16:46,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:16:46,243 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-28 11:16:48,778 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.78 vs. limit=5.216666666666667 2023-09-28 11:16:49,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-28 11:16:54,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:16:54,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:16:56,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:16:59,144 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=21.79 vs. limit=5.433333333333334 2023-09-28 11:17:01,722 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:17:02,024 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=866.6666666666666, ans=0.459375 2023-09-28 11:17:02,665 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.68 vs. limit=5.216666666666667 2023-09-28 11:17:05,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:17:07,812 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=933.3333333333334, ans=0.45625 2023-09-28 11:17:09,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:17:09,968 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=61.19 vs. limit=7.85 2023-09-28 11:17:10,131 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=13.33 vs. limit=5.466666666666667 2023-09-28 11:17:11,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:17:11,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:17:13,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:17:13,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:17:13,124 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:17:14,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-28 11:17:14,820 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-28 11:17:17,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:17:17,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:17:17,952 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.90 vs. limit=8.2 2023-09-28 11:17:18,405 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=167.26 vs. limit=7.85 2023-09-28 11:17:19,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:17:19,105 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:17:19,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 11:17:19,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 11:17:19,819 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=55.82 vs. limit=7.85 2023-09-28 11:17:20,842 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:17:20,852 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:17:20,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:17:22,586 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:17:22,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:17:24,259 INFO [train.py:1039] (2/4) Epoch 1, batch 150, loss[loss=1.109, simple_loss=0.9413, pruned_loss=1.209, over 24389.00 frames. ], tot_loss[loss=1.845, simple_loss=1.652, pruned_loss=1.681, over 2508750.79 frames. ], batch size: 77, lr: 2.93e-02, grad_scale: 0.5 2023-09-28 11:17:24,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:17:24,767 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1000.0, ans=0.046875 2023-09-28 11:17:27,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:17:29,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:17:29,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:17:29,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:17:34,677 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=128.74 vs. limit=7.875 2023-09-28 11:17:36,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:17:36,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:17:39,911 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1000.0, ans=0.453125 2023-09-28 11:17:41,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:17:43,003 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:17:47,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-28 11:17:47,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-28 11:17:47,366 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-28 11:17:50,916 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:17:50,924 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:17:54,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:17:54,328 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:17:55,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:17:55,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:17:57,691 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:17:57,877 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-28 11:17:58,676 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.75 vs. limit=5.266666666666667 2023-09-28 11:18:01,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:18:04,646 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.23 vs. limit=8.35 2023-09-28 11:18:06,349 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=6.81 vs. limit=4.453333333333333 2023-09-28 11:18:07,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:18:10,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:18:10,690 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-28 11:18:15,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:18:15,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:18:15,396 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:18:17,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:18:18,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:18:20,851 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1200.0, ans=0.288 2023-09-28 11:18:22,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:18:22,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:18:22,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-28 11:18:29,145 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=34.95 vs. limit=8.4 2023-09-28 11:18:31,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:18:31,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:18:32,247 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=30.64 vs. limit=7.95 2023-09-28 11:18:33,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:18:33,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:18:37,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:18:39,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 11:18:40,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:18:44,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:18:46,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:18:47,899 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:18:47,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-28 11:18:50,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:18:50,154 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-28 11:18:50,500 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1266.6666666666667, ans=0.440625 2023-09-28 11:18:54,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:18:54,803 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1266.6666666666667, ans=0.28733333333333333 2023-09-28 11:18:58,371 INFO [train.py:1039] (2/4) Epoch 1, batch 200, loss[loss=0.9146, simple_loss=0.7731, pruned_loss=0.9468, over 23250.00 frames. ], tot_loss[loss=1.529, simple_loss=1.353, pruned_loss=1.442, over 3017340.69 frames. ], batch size: 105, lr: 3.15e-02, grad_scale: 1.0 2023-09-28 11:18:59,135 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=57.16 vs. limit=8.0 2023-09-28 11:19:00,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:19:01,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:19:03,422 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 9.506e+01 1.160e+02 1.347e+02 1.565e+02 3.276e+02, threshold=2.693e+02, percent-clipped=0.0 2023-09-28 11:19:04,812 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=132.99 vs. limit=8.0 2023-09-28 11:19:05,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-28 11:19:05,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:19:05,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:19:09,565 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-28 11:19:10,306 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=196.70 vs. limit=8.0 2023-09-28 11:19:11,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:19:11,400 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1333.3333333333333, ans=0.4375 2023-09-28 11:19:12,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:19:13,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:19:13,342 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1333.3333333333333, ans=0.4375 2023-09-28 11:19:18,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:19:18,249 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:19:18,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:19:34,730 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=8.78 vs. limit=4.586666666666667 2023-09-28 11:19:40,375 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.04 vs. limit=8.6 2023-09-28 11:19:41,956 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.69 vs. limit=8.6 2023-09-28 11:19:46,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:19:46,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:19:48,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:19:48,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:19:50,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 11:19:50,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:19:51,126 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=47.43 vs. limit=8.05 2023-09-28 11:19:52,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:19:52,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:19:52,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:19:52,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:19:54,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-28 11:19:55,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 11:19:55,920 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:19:56,650 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=14.80 vs. limit=5.766666666666667 2023-09-28 11:19:58,366 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=17.14 vs. limit=8.075 2023-09-28 11:20:00,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:20:02,699 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=27.52 vs. limit=8.075 2023-09-28 11:20:02,773 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.41 vs. limit=8.65 2023-09-28 11:20:07,682 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=74.63 vs. limit=8.075 2023-09-28 11:20:09,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:20:10,056 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=16.25 vs. limit=8.075 2023-09-28 11:20:18,145 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:20:18,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:20:24,894 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.62 vs. limit=4.64 2023-09-28 11:20:27,251 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:20:28,016 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=43.00 vs. limit=8.1 2023-09-28 11:20:29,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-28 11:20:29,099 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:20:29,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:20:29,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:20:29,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:20:30,150 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=7.48 vs. limit=4.666666666666667 2023-09-28 11:20:30,833 INFO [train.py:1039] (2/4) Epoch 1, batch 250, loss[loss=0.9527, simple_loss=0.8013, pruned_loss=0.947, over 24297.00 frames. ], tot_loss[loss=1.336, simple_loss=1.171, pruned_loss=1.278, over 3396522.99 frames. ], batch size: 74, lr: 3.38e-02, grad_scale: 1.0 2023-09-28 11:20:31,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-28 11:20:32,420 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=100.27 vs. limit=5.0 2023-09-28 11:20:32,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:20:32,787 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-28 11:20:32,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:20:36,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:20:38,673 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:20:40,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:20:40,593 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1666.6666666666667, ans=0.421875 2023-09-28 11:20:41,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:20:42,264 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1666.6666666666667, ans=0.421875 2023-09-28 11:20:43,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:20:45,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:20:48,492 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=11.23 vs. limit=4.693333333333333 2023-09-28 11:20:51,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:20:52,496 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=13.16 vs. limit=8.15 2023-09-28 11:20:54,151 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=12.91 vs. limit=5.866666666666666 2023-09-28 11:20:56,005 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=27.40 vs. limit=8.8 2023-09-28 11:20:56,222 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=78.70 vs. limit=8.15 2023-09-28 11:21:03,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:21:06,369 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:21:07,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:21:08,312 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1800.0, ans=0.0595 2023-09-28 11:21:13,640 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=48.09 vs. limit=8.175 2023-09-28 11:21:14,033 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=190.34 vs. limit=5.9 2023-09-28 11:21:15,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:21:15,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:21:17,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:21:17,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:21:18,290 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=210.71 vs. limit=5.9 2023-09-28 11:21:19,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:21:19,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:21:19,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:21:23,036 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:21:25,632 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=21.26 vs. limit=8.2 2023-09-28 11:21:26,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-28 11:21:26,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:21:26,862 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1866.6666666666667, ans=0.2813333333333333 2023-09-28 11:21:29,011 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=53.27 vs. limit=8.2 2023-09-28 11:21:29,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:21:29,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:21:29,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:21:29,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:21:31,002 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=40.08 vs. limit=5.933333333333334 2023-09-28 11:21:32,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:21:32,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:21:32,577 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1866.6666666666667, ans=0.4125 2023-09-28 11:21:34,127 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:21:35,915 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:21:35,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:21:38,325 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten.whitening_limit, batch_count=1866.6666666666667, ans=8.2 2023-09-28 11:21:42,217 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=18.09 vs. limit=8.9 2023-09-28 11:21:43,305 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:21:45,869 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=7.84 vs. limit=4.773333333333333 2023-09-28 11:21:46,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:21:50,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:21:51,068 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.61 vs. limit=5.966666666666667 2023-09-28 11:21:56,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:21:57,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:22:02,387 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=2000.0, ans=0.125 2023-09-28 11:22:02,894 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=54.25 vs. limit=8.25 2023-09-28 11:22:03,709 INFO [train.py:1039] (2/4) Epoch 1, batch 300, loss[loss=0.9773, simple_loss=0.8141, pruned_loss=0.9535, over 24441.00 frames. ], tot_loss[loss=1.205, simple_loss=1.046, pruned_loss=1.16, over 3700505.74 frames. ], batch size: 69, lr: 3.60e-02, grad_scale: 2.0 2023-09-28 11:22:03,769 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-28 11:22:03,939 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:22:05,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:22:06,598 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=25.54 vs. limit=5.5 2023-09-28 11:22:07,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-28 11:22:07,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:22:09,477 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 8.573e+01 1.074e+02 1.349e+02 1.820e+02 4.135e+02, threshold=2.699e+02, percent-clipped=10.0 2023-09-28 11:22:09,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:22:09,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-28 11:22:10,660 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=13.51 vs. limit=8.25 2023-09-28 11:22:11,983 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=2000.0, ans=6.25 2023-09-28 11:22:13,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:22:13,547 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:22:14,711 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=33.52 vs. limit=6.0 2023-09-28 11:22:17,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:22:17,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-28 11:22:18,626 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=45.56 vs. limit=9.0 2023-09-28 11:22:19,252 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:22:19,610 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=2000.0, ans=0.125 2023-09-28 11:22:20,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:22:20,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-28 11:22:20,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:22:22,808 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=2066.6666666666665, ans=0.231 2023-09-28 11:22:26,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:22:32,577 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:22:32,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-28 11:22:36,187 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-28 11:22:37,895 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:22:39,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:22:42,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:22:42,001 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-28 11:22:42,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:22:43,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:22:47,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:22:48,809 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:22:53,882 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-28 11:22:53,889 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-28 11:22:54,872 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=2133.3333333333335, ans=0.4 2023-09-28 11:22:56,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:22:57,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:22:59,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-28 11:23:01,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:23:08,340 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=20.40 vs. limit=8.325 2023-09-28 11:23:08,779 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:23:11,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:23:11,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-28 11:23:16,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:23:16,479 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:23:19,864 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:23:20,041 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:23:21,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-28 11:23:21,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:23:21,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:23:25,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-28 11:23:27,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:23:27,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:23:30,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:23:30,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:23:31,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:23:36,549 INFO [train.py:1039] (2/4) Epoch 1, batch 350, loss[loss=0.9634, simple_loss=0.7927, pruned_loss=0.9313, over 24643.00 frames. ], tot_loss[loss=1.115, simple_loss=0.9589, pruned_loss=1.072, over 3926117.82 frames. ], batch size: 73, lr: 3.83e-02, grad_scale: 2.0 2023-09-28 11:23:38,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:23:38,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 11:23:40,355 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:23:43,271 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=59.68 vs. limit=8.375 2023-09-28 11:23:43,459 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=339.49 vs. limit=8.375 2023-09-28 11:23:48,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:23:51,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:23:52,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:23:52,446 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=2333.3333333333335, ans=0.27666666666666667 2023-09-28 11:23:56,109 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=13.62 vs. limit=8.4 2023-09-28 11:23:56,442 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.62 vs. limit=9.3 2023-09-28 11:23:57,053 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-28 11:23:57,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:23:57,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-28 11:23:58,140 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.36 vs. limit=4.96 2023-09-28 11:23:59,647 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=49.15 vs. limit=9.3 2023-09-28 11:24:00,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:24:00,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-28 11:24:03,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:24:04,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-28 11:24:05,441 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2.whitening_limit, batch_count=2400.0, ans=6.2 2023-09-28 11:24:08,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:24:10,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:24:10,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:24:12,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:24:12,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:24:12,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:24:12,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:24:12,721 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=2466.6666666666665, ans=0.2753333333333333 2023-09-28 11:24:14,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:24:14,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:24:14,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:24:14,760 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=2466.6666666666665, ans=0.5 2023-09-28 11:24:19,245 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=38.54 vs. limit=9.35 2023-09-28 11:24:24,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:24:24,420 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:24:26,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:24:26,138 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:24:26,965 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=11.14 vs. limit=6.233333333333333 2023-09-28 11:24:32,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-28 11:24:32,932 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:24:33,238 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=2533.3333333333335, ans=0.27466666666666667 2023-09-28 11:24:37,110 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.96 vs. limit=9.4 2023-09-28 11:24:40,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:24:40,513 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:24:42,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:24:43,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-28 11:24:46,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:24:46,425 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-28 11:24:47,187 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=73.63 vs. limit=8.45 2023-09-28 11:24:48,149 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-28 11:24:48,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:24:53,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:24:53,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-28 11:24:53,723 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=28.60 vs. limit=8.475 2023-09-28 11:24:54,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:24:55,995 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=146.49 vs. limit=8.475 2023-09-28 11:24:57,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:24:59,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:25:00,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:25:00,995 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:25:01,783 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=18.80 vs. limit=8.475 2023-09-28 11:25:01,838 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten.whitening_limit, batch_count=2600.0, ans=9.45 2023-09-28 11:25:01,884 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.61 vs. limit=9.45 2023-09-28 11:25:03,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:25:06,018 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=148.20 vs. limit=8.475 2023-09-28 11:25:07,405 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=23.01 vs. limit=8.475 2023-09-28 11:25:08,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:25:10,535 INFO [train.py:1039] (2/4) Epoch 1, batch 400, loss[loss=0.8372, simple_loss=0.6809, pruned_loss=0.8006, over 18893.00 frames. ], tot_loss[loss=1.048, simple_loss=0.8943, pruned_loss=1.002, over 4087632.06 frames. ], batch size: 41, lr: 4.05e-02, grad_scale: 4.0 2023-09-28 11:25:10,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:25:12,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-28 11:25:12,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:25:12,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:25:14,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:25:14,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:25:14,522 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=2666.6666666666665, ans=0.04 2023-09-28 11:25:15,789 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 9.874e+01 1.367e+02 1.651e+02 2.389e+02 7.473e+02, threshold=3.302e+02, percent-clipped=14.0 2023-09-28 11:25:17,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:25:20,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:25:22,113 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-28 11:25:23,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-28 11:25:23,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:25:25,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-28 11:25:25,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:25:29,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:25:29,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:25:29,860 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=54.67 vs. limit=8.525 2023-09-28 11:25:30,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-28 11:25:31,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:25:31,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:25:31,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:25:33,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:25:35,703 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-28 11:25:37,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-28 11:25:41,593 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.94 vs. limit=9.55 2023-09-28 11:25:41,775 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.21 vs. limit=9.55 2023-09-28 11:25:42,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:25:44,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:25:44,509 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=2733.3333333333335, ans=0.04145833333333333 2023-09-28 11:25:45,029 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.19 vs. limit=5.683333333333334 2023-09-28 11:25:45,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-28 11:25:46,038 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-28 11:25:49,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:25:51,296 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:25:57,469 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-28 11:25:59,834 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=39.08 vs. limit=8.55 2023-09-28 11:26:02,343 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-28 11:26:04,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-28 11:26:08,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:26:09,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:26:09,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-28 11:26:10,622 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=21.00 vs. limit=9.65 2023-09-28 11:26:12,803 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=25.87 vs. limit=8.575 2023-09-28 11:26:15,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:26:17,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:26:19,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:26:22,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:26:22,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-28 11:26:24,347 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:26:25,342 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.91 vs. limit=9.7 2023-09-28 11:26:27,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-28 11:26:29,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:26:29,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:26:33,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-28 11:26:35,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:26:37,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:26:37,262 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:26:40,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-28 11:26:40,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:26:40,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:26:42,299 INFO [train.py:1039] (2/4) Epoch 1, batch 450, loss[loss=0.9361, simple_loss=0.7608, pruned_loss=0.8617, over 24038.00 frames. ], tot_loss[loss=1.005, simple_loss=0.8504, pruned_loss=0.9533, over 4232096.29 frames. ], batch size: 80, lr: 4.28e-02, grad_scale: 4.0 2023-09-28 11:26:42,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:26:42,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-28 11:26:42,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:26:44,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:26:48,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:26:52,742 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=3000.0, ans=0.0875 2023-09-28 11:26:57,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:26:59,541 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:27:01,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-28 11:27:03,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-28 11:27:04,926 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=3066.6666666666665, ans=0.08499999999999999 2023-09-28 11:27:08,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:27:11,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:27:13,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:27:19,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:27:19,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:27:21,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-28 11:27:22,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-28 11:27:24,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-28 11:27:26,773 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:27:28,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:27:28,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:27:31,081 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-28 11:27:32,566 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-28 11:27:32,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:27:34,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:27:36,023 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-28 11:27:38,392 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=19.18 vs. limit=8.7 2023-09-28 11:27:39,456 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:27:39,509 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:27:41,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-28 11:27:41,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-28 11:27:44,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:27:44,994 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=3200.0, ans=0.248 2023-09-28 11:27:46,503 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:27:47,239 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=13.94 vs. limit=8.7 2023-09-28 11:27:48,064 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:27:49,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-28 11:27:49,966 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=3200.0, ans=0.35 2023-09-28 11:27:50,257 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=7.06 vs. limit=5.8 2023-09-28 11:27:52,766 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.39 vs. limit=9.9 2023-09-28 11:27:53,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:27:56,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-28 11:27:57,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-28 11:27:59,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:28:04,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:28:05,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:28:09,076 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:28:09,121 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-28 11:28:12,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:28:12,748 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=3333.3333333333335, ans=0.024999999999999994 2023-09-28 11:28:13,258 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=14.35 vs. limit=8.75 2023-09-28 11:28:13,985 INFO [train.py:1039] (2/4) Epoch 1, batch 500, loss[loss=0.9696, simple_loss=0.7846, pruned_loss=0.8704, over 24677.00 frames. ], tot_loss[loss=0.9735, simple_loss=0.8163, pruned_loss=0.9132, over 4359840.09 frames. ], batch size: 73, lr: 4.49e-02, grad_scale: 8.0 2023-09-28 11:28:14,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:28:14,189 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:28:15,753 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-28 11:28:15,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-28 11:28:15,954 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:28:19,322 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 9.903e+01 1.529e+02 1.913e+02 2.430e+02 4.167e+02, threshold=3.825e+02, percent-clipped=6.0 2023-09-28 11:28:19,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:28:24,985 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=19.91 vs. limit=6.666666666666667 2023-09-28 11:28:26,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:28:27,739 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:28:31,583 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:28:31,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:28:33,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:28:35,780 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=3400.0, ans=0.340625 2023-09-28 11:28:41,397 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=24.74 vs. limit=8.775 2023-09-28 11:28:42,405 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=3400.0, ans=0.023499999999999993 2023-09-28 11:28:47,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:28:47,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-28 11:28:47,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:28:47,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:28:47,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-28 11:28:48,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:28:52,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:28:52,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=3466.6666666666665, ans=0.07 2023-09-28 11:28:53,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:28:53,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:28:53,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:28:53,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-28 11:28:54,331 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=3466.6666666666665, ans=0.07 2023-09-28 11:28:55,860 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-28 11:28:56,540 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.71 vs. limit=6.733333333333333 2023-09-28 11:28:59,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:28:59,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:29:01,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:29:02,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:29:02,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:29:04,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-28 11:29:08,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:29:09,867 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.34 vs. limit=10.15 2023-09-28 11:29:10,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:29:14,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:29:19,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:29:26,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:29:28,484 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=3600.0, ans=0.33125 2023-09-28 11:29:30,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-28 11:29:30,213 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:29:31,772 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:29:35,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-28 11:29:35,342 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:29:36,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:29:43,549 INFO [train.py:1039] (2/4) Epoch 1, batch 550, loss[loss=0.8034, simple_loss=0.6541, pruned_loss=0.6881, over 23797.00 frames. ], tot_loss[loss=0.95, simple_loss=0.791, pruned_loss=0.8787, over 4436614.38 frames. ], batch size: 164, lr: 4.49e-02, grad_scale: 8.0 2023-09-28 11:29:43,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-28 11:29:45,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-28 11:29:45,396 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:29:45,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-28 11:29:47,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:29:47,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:29:49,283 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:29:49,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:29:49,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:29:51,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:29:53,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:29:56,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-28 11:29:56,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:30:00,783 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:30:00,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:30:04,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:30:05,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:30:12,372 WARNING [train.py:1197] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-28 11:30:12,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-28 11:30:12,824 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=3733.3333333333335, ans=0.325 2023-09-28 11:30:14,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:30:19,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:30:21,060 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:30:22,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:30:27,060 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:30:27,069 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-28 11:30:27,192 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:30:28,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 11:30:32,907 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:30:35,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:30:35,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:30:37,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:30:38,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-28 11:30:40,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-28 11:30:40,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:30:40,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:30:41,107 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=27.43 vs. limit=10.4 2023-09-28 11:30:42,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:30:42,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:30:45,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:30:45,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:30:46,577 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.20 vs. limit=10.4 2023-09-28 11:30:48,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:30:50,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:30:51,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 11:30:51,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:30:53,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:30:55,390 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:30:55,495 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:30:55,964 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=23.59 vs. limit=8.975 2023-09-28 11:30:56,163 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=9.79 vs. limit=8.975 2023-09-28 11:30:57,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:30:59,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-28 11:31:08,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-28 11:31:12,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-28 11:31:13,687 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.00 vs. limit=9.0 2023-09-28 11:31:14,892 INFO [train.py:1039] (2/4) Epoch 1, batch 600, loss[loss=0.8359, simple_loss=0.6759, pruned_loss=0.7061, over 23428.00 frames. ], tot_loss[loss=0.9243, simple_loss=0.765, pruned_loss=0.8415, over 4484748.30 frames. ], batch size: 93, lr: 4.49e-02, grad_scale: 8.0 2023-09-28 11:31:14,984 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:31:15,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:31:15,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:31:21,655 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.125e+02 1.678e+02 2.306e+02 3.262e+02 8.742e+02, threshold=4.612e+02, percent-clipped=14.0 2023-09-28 11:31:23,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:31:25,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:31:26,824 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-28 11:31:28,535 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:31:29,040 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=4000.0, ans=0.76 2023-09-28 11:31:30,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:31:32,055 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:31:35,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-28 11:31:37,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:31:40,332 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.17 vs. limit=10.55 2023-09-28 11:31:44,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-28 11:31:49,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:31:49,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:31:49,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:31:56,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:31:56,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:31:57,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:32:06,163 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:32:08,253 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=4200.0, ans=0.009956521739130435 2023-09-28 11:32:10,602 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=4200.0, ans=0.258 2023-09-28 11:32:11,781 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:32:11,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:32:11,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:32:17,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-28 11:32:17,325 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=4200.0, ans=0.753 2023-09-28 11:32:21,282 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.99 vs. limit=9.075 2023-09-28 11:32:24,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:32:24,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:32:26,866 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=4266.666666666667, ans=0.07333333333333333 2023-09-28 11:32:29,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-28 11:32:29,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:32:33,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-28 11:32:33,204 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:32:33,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:32:33,858 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.82 vs. limit=10.7 2023-09-28 11:32:42,887 INFO [train.py:1039] (2/4) Epoch 1, batch 650, loss[loss=0.7785, simple_loss=0.6391, pruned_loss=0.6195, over 22368.00 frames. ], tot_loss[loss=0.8962, simple_loss=0.7395, pruned_loss=0.7991, over 4516277.68 frames. ], batch size: 49, lr: 4.49e-02, grad_scale: 8.0 2023-09-28 11:32:42,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 11:32:45,177 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:32:48,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:32:48,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:32:52,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:32:55,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-28 11:32:56,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:32:57,862 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.15 vs. limit=9.125 2023-09-28 11:33:03,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:33:03,720 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:33:05,693 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:33:06,597 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=16.99 vs. limit=9.15 2023-09-28 11:33:09,136 WARNING [train.py:1197] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-28 11:33:10,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:33:10,879 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:33:11,445 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.16 vs. limit=3.66 2023-09-28 11:33:15,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:33:15,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 11:33:18,854 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=15.24 vs. limit=9.175 2023-09-28 11:33:19,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:33:19,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:33:19,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:33:20,802 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.37 vs. limit=10.85 2023-09-28 11:33:21,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:33:23,103 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:33:26,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:33:26,443 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-28 11:33:26,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:33:26,498 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:33:31,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:33:33,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:33:33,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:33:33,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:33:33,966 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=4466.666666666667, ans=0.04805555555555556 2023-09-28 11:33:35,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-28 11:33:37,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:33:37,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:33:39,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-28 11:33:39,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:33:40,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:33:41,133 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=4533.333333333333, ans=0.2875 2023-09-28 11:33:42,524 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-28 11:33:42,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-28 11:33:42,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:33:42,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:33:42,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:33:44,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:33:46,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:33:46,832 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.78 vs. limit=10.9 2023-09-28 11:33:53,245 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:33:53,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:33:53,648 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=1.348e+01 2023-09-28 11:33:53,983 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.83 vs. limit=7.3 2023-09-28 11:33:54,877 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:33:58,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:33:59,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 11:33:59,796 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:34:05,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:34:05,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:34:07,397 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:34:07,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:34:10,819 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=10.76 vs. limit=9.225 2023-09-28 11:34:13,627 INFO [train.py:1039] (2/4) Epoch 1, batch 700, loss[loss=0.5741, simple_loss=0.4696, pruned_loss=0.4496, over 19212.00 frames. ], tot_loss[loss=0.8676, simple_loss=0.7153, pruned_loss=0.7546, over 4551427.53 frames. ], batch size: 388, lr: 4.49e-02, grad_scale: 8.0 2023-09-28 11:34:15,322 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-28 11:34:16,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-28 11:34:20,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-28 11:34:20,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:34:21,818 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.160e+02 1.725e+02 2.743e+02 3.715e+02 1.987e+03, threshold=5.486e+02, percent-clipped=15.0 2023-09-28 11:34:22,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:34:24,518 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=14.92 vs. limit=9.25 2023-09-28 11:34:25,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-28 11:34:30,463 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:34:33,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:34:35,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:34:35,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:34:37,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:34:40,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:34:44,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 11:34:44,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:34:45,444 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.75 vs. limit=5.8933333333333335 2023-09-28 11:34:46,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-28 11:34:48,871 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=4800.0, ans=0.00982608695652174 2023-09-28 11:34:51,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-28 11:34:55,663 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:34:57,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:34:58,305 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=9.50 vs. limit=9.3 2023-09-28 11:34:58,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:35:02,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:35:04,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-28 11:35:09,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:35:11,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:35:11,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-28 11:35:14,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:35:16,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:35:20,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:35:27,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:35:28,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-28 11:35:31,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-28 11:35:31,388 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-28 11:35:33,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:35:34,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:35:36,398 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:35:39,534 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:35:39,544 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-28 11:35:41,516 INFO [train.py:1039] (2/4) Epoch 1, batch 750, loss[loss=0.6733, simple_loss=0.5483, pruned_loss=0.5205, over 19136.00 frames. ], tot_loss[loss=0.8382, simple_loss=0.692, pruned_loss=0.7092, over 4583919.73 frames. ], batch size: 389, lr: 4.49e-02, grad_scale: 4.0 2023-09-28 11:35:44,178 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.40 vs. limit=9.375 2023-09-28 11:35:44,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-28 11:35:44,928 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-28 11:35:44,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-28 11:35:46,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-28 11:35:46,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-28 11:35:46,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:35:48,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-28 11:35:49,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:35:49,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:35:51,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:35:53,329 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:35:53,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:35:54,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:35:55,745 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=15.35 vs. limit=9.375 2023-09-28 11:35:56,637 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:35:58,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:36:04,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:36:06,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:36:07,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:36:07,818 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-28 11:36:07,997 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=5066.666666666667, ans=0.2625 2023-09-28 11:36:09,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:36:11,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:36:12,855 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:36:14,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:36:16,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-28 11:36:16,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:36:19,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-28 11:36:19,680 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-28 11:36:19,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-28 11:36:19,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:36:19,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:36:22,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:36:28,217 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=5133.333333333333, ans=0.259375 2023-09-28 11:36:31,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:36:31,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:36:31,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:36:32,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:36:35,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:36:36,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-28 11:36:38,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:36:38,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-28 11:36:40,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:36:42,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:36:42,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-28 11:36:44,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:36:47,479 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=5200.0, ans=0.035 2023-09-28 11:36:50,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:36:52,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:36:52,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:36:55,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:37:00,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-28 11:37:00,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:37:02,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:37:04,120 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:37:04,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:37:07,754 INFO [train.py:1039] (2/4) Epoch 1, batch 800, loss[loss=0.7177, simple_loss=0.6062, pruned_loss=0.5057, over 23754.00 frames. ], tot_loss[loss=0.8056, simple_loss=0.6677, pruned_loss=0.6609, over 4625046.98 frames. ], batch size: 85, lr: 4.49e-02, grad_scale: 8.0 2023-09-28 11:37:07,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:37:10,038 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:37:16,636 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.639e+02 4.125e+02 6.476e+02 9.801e+02 2.445e+03, threshold=1.295e+03, percent-clipped=55.0 2023-09-28 11:37:19,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:37:19,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:37:21,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:37:21,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:37:23,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:37:23,157 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:37:26,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:37:27,178 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=5400.0, ans=0.246875 2023-09-28 11:37:30,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:37:30,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:37:33,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-28 11:37:35,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:37:35,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:37:35,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:37:36,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:37:36,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-28 11:37:36,792 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:37:37,571 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.16 vs. limit=6.35 2023-09-28 11:37:38,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-28 11:37:40,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:37:42,173 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=5466.666666666667, ans=0.04388888888888889 2023-09-28 11:37:44,083 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:37:44,377 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=5466.666666666667, ans=0.009681159420289855 2023-09-28 11:37:47,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:37:47,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:37:49,066 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=5466.666666666667, ans=0.24375000000000002 2023-09-28 11:37:49,634 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.69 vs. limit=9.55 2023-09-28 11:37:50,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:37:52,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:37:56,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:37:57,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:37:57,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-28 11:38:01,304 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-28 11:38:01,336 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-28 11:38:01,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:38:01,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:38:03,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:38:03,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:38:09,839 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-28 11:38:09,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-28 11:38:11,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:38:13,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:38:18,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:38:21,412 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:38:21,847 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=5600.0, ans=0.28400000000000003 2023-09-28 11:38:22,222 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.87 vs. limit=6.4 2023-09-28 11:38:23,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-28 11:38:23,874 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:38:27,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-28 11:38:34,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:38:36,584 INFO [train.py:1039] (2/4) Epoch 1, batch 850, loss[loss=0.6042, simple_loss=0.5258, pruned_loss=0.394, over 24308.00 frames. ], tot_loss[loss=0.7742, simple_loss=0.645, pruned_loss=0.6152, over 4640127.67 frames. ], batch size: 56, lr: 4.49e-02, grad_scale: 8.0 2023-09-28 11:38:37,441 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.92 vs. limit=11.75 2023-09-28 11:38:38,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:38:40,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-28 11:38:40,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:38:40,432 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:38:40,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-28 11:38:40,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:38:43,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:38:45,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:38:45,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:38:46,934 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:38:48,547 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-28 11:38:49,001 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.27 vs. limit=7.833333333333334 2023-09-28 11:38:49,101 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.75 vs. limit=9.625 2023-09-28 11:38:50,022 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-28 11:38:50,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-28 11:38:51,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:38:51,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:38:53,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:38:53,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:38:53,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=5733.333333333333, ans=0.24266666666666667 2023-09-28 11:38:54,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:38:55,202 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=5733.333333333333, ans=0.09899494936611666 2023-09-28 11:39:01,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:39:02,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:39:03,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-28 11:39:06,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-28 11:39:10,678 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:39:10,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-28 11:39:16,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-28 11:39:16,710 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-28 11:39:19,947 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-28 11:39:19,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:39:19,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:39:19,991 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 11:39:23,278 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:39:24,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:39:25,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-28 11:39:26,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:39:28,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:39:29,979 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:39:31,456 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:39:33,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:39:35,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:39:35,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-28 11:39:42,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:39:42,456 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:39:42,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:39:42,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:39:42,779 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=5866.666666666667, ans=0.22499999999999998 2023-09-28 11:39:44,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:39:45,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:39:47,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:39:49,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:39:49,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:39:51,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:39:53,725 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=5933.333333333333, ans=0.221875 2023-09-28 11:40:00,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:40:01,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:40:03,142 INFO [train.py:1039] (2/4) Epoch 1, batch 900, loss[loss=0.6784, simple_loss=0.5729, pruned_loss=0.4642, over 22745.00 frames. ], tot_loss[loss=0.7417, simple_loss=0.6216, pruned_loss=0.5712, over 4656285.12 frames. ], batch size: 322, lr: 4.48e-02, grad_scale: 8.0 2023-09-28 11:40:03,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-28 11:40:03,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:40:03,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:40:06,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-28 11:40:10,784 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:40:12,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:40:13,966 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 3.628e+02 6.882e+02 1.109e+03 2.718e+03, threshold=1.376e+03, percent-clipped=19.0 2023-09-28 11:40:14,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-28 11:40:17,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:40:18,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-28 11:40:18,134 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-28 11:40:19,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:40:19,632 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:40:21,170 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:40:21,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:40:33,512 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=6066.666666666667, ans=0.04138888888888889 2023-09-28 11:40:36,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:40:36,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:40:36,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:40:38,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:40:40,262 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=6133.333333333333, ans=0.21250000000000002 2023-09-28 11:40:43,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-28 11:40:46,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:40:52,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-28 11:40:54,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:40:54,132 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-28 11:40:55,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-28 11:40:56,066 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=6200.0, ans=0.683 2023-09-28 11:41:01,298 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-28 11:41:02,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:41:02,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:41:09,792 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:41:09,809 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:41:11,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-28 11:41:13,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:41:13,286 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-28 11:41:16,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:41:16,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:41:17,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:41:17,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:41:21,286 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-28 11:41:23,508 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-28 11:41:26,598 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-28 11:41:26,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-28 11:41:27,057 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=1.014e+01 2023-09-28 11:41:29,579 INFO [train.py:1039] (2/4) Epoch 1, batch 950, loss[loss=0.5563, simple_loss=0.4784, pruned_loss=0.363, over 23590.00 frames. ], tot_loss[loss=0.7127, simple_loss=0.601, pruned_loss=0.5325, over 4653327.91 frames. ], batch size: 256, lr: 4.48e-02, grad_scale: 8.0 2023-09-28 11:41:29,681 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:41:33,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-28 11:41:38,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:41:42,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:41:42,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:41:43,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:41:46,835 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-28 11:41:51,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:41:53,043 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:41:53,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:41:53,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:41:53,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-28 11:41:55,028 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-28 11:41:56,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:41:56,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-28 11:41:59,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:42:02,443 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=6466.666666666667, ans=0.19687500000000002 2023-09-28 11:42:04,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:42:04,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:42:04,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:42:05,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-28 11:42:07,623 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:42:11,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:42:12,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:42:17,620 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.09 vs. limit=9.925 2023-09-28 11:42:18,495 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:42:18,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:42:21,856 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-28 11:42:23,482 WARNING [train.py:1197] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 11:42:23,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:42:25,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:42:25,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:42:25,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:42:30,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-28 11:42:32,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:42:33,814 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:42:35,274 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:42:35,301 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-28 11:42:35,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:42:35,347 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:42:36,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-28 11:42:37,777 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.58 vs. limit=9.975 2023-09-28 11:42:40,103 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=7.04 vs. limit=6.640000000000001 2023-09-28 11:42:42,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:42:43,713 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.28 vs. limit=6.640000000000001 2023-09-28 11:42:46,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:42:51,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:42:53,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-28 11:42:53,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-28 11:42:56,624 INFO [train.py:1039] (2/4) Epoch 1, batch 1000, loss[loss=0.5754, simple_loss=0.5097, pruned_loss=0.3511, over 23278.00 frames. ], tot_loss[loss=0.6807, simple_loss=0.5788, pruned_loss=0.4922, over 4679957.04 frames. ], batch size: 93, lr: 4.48e-02, grad_scale: 8.0 2023-09-28 11:42:58,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:42:58,596 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=6666.666666666667, ans=0.0 2023-09-28 11:43:01,600 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-28 11:43:03,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:43:06,718 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.970e+02 4.014e+02 6.511e+02 1.253e+03 2.271e+03, threshold=1.302e+03, percent-clipped=16.0 2023-09-28 11:43:07,197 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=6666.666666666667, ans=0.1875 2023-09-28 11:43:08,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:43:10,198 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-28 11:43:10,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-28 11:43:15,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:43:15,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:43:15,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:43:19,607 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-28 11:43:25,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-28 11:43:27,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-28 11:43:27,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:43:28,818 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-28 11:43:32,476 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-28 11:43:32,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-28 11:43:32,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:43:33,237 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=19.81 vs. limit=12.6 2023-09-28 11:43:34,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:43:37,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=6800.0, ans=0.662 2023-09-28 11:43:43,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:43:44,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:43:45,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:43:45,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:43:45,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-28 11:43:47,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:43:47,849 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:43:49,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:43:49,378 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-28 11:43:52,018 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=17.27 vs. limit=10.075 2023-09-28 11:43:52,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-28 11:43:54,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-28 11:43:55,454 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.70 vs. limit=6.716666666666667 2023-09-28 11:43:56,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-28 11:43:57,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:44:03,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:44:04,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:44:04,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:44:05,343 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=20.32 vs. limit=12.7 2023-09-28 11:44:06,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:44:09,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-28 11:44:09,758 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:44:09,924 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=6933.333333333333, ans=0.037777777777777785 2023-09-28 11:44:11,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-28 11:44:11,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-28 11:44:11,521 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=6933.333333333333, ans=0.175 2023-09-28 11:44:12,043 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.51 vs. limit=6.733333333333333 2023-09-28 11:44:12,830 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:44:12,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:44:15,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:44:16,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:44:20,484 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:44:21,984 INFO [train.py:1039] (2/4) Epoch 1, batch 1050, loss[loss=0.6057, simple_loss=0.5384, pruned_loss=0.3643, over 24657.00 frames. ], tot_loss[loss=0.6531, simple_loss=0.5591, pruned_loss=0.4587, over 4684711.32 frames. ], batch size: 73, lr: 4.48e-02, grad_scale: 8.0 2023-09-28 11:44:22,826 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.92 vs. limit=10.125 2023-09-28 11:44:25,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:44:27,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:44:28,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:44:29,077 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_ff2.min_abs, batch_count=7000.0, ans=0.1 2023-09-28 11:44:29,481 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.00 vs. limit=8.5 2023-09-28 11:44:30,442 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:44:33,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:44:35,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:44:36,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:44:37,387 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=7066.666666666667, ans=0.16875 2023-09-28 11:44:40,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:44:40,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:44:40,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:44:42,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:44:42,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-28 11:44:43,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:44:43,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-28 11:44:47,324 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:44:47,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-28 11:44:47,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-28 11:44:56,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:44:57,350 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.44 vs. limit=12.85 2023-09-28 11:44:58,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:44:58,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:45:01,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-28 11:45:01,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-28 11:45:01,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:45:05,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-28 11:45:08,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-28 11:45:09,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:45:10,117 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=7133.333333333333, ans=0.16562500000000002 2023-09-28 11:45:12,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 11:45:14,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-28 11:45:14,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:45:15,031 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=7200.0, ans=0.16249999999999998 2023-09-28 11:45:16,541 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:45:19,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:45:22,854 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-28 11:45:23,608 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=6.85 vs. limit=10.2 2023-09-28 11:45:25,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-28 11:45:25,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-28 11:45:26,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:45:26,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:45:28,219 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-28 11:45:28,652 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=7266.666666666667, ans=0.22733333333333333 2023-09-28 11:45:33,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:45:35,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:45:35,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:45:36,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:45:36,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:45:40,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:45:40,697 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-28 11:45:43,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:45:43,859 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-28 11:45:43,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-28 11:45:45,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:45:46,848 INFO [train.py:1039] (2/4) Epoch 1, batch 1100, loss[loss=0.5561, simple_loss=0.4925, pruned_loss=0.3345, over 23649.00 frames. ], tot_loss[loss=0.6292, simple_loss=0.5428, pruned_loss=0.4293, over 4678362.27 frames. ], batch size: 149, lr: 4.48e-02, grad_scale: 8.0 2023-09-28 11:45:48,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:45:55,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:45:58,774 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.864e+02 4.555e+02 7.978e+02 1.389e+03 3.645e+03, threshold=1.596e+03, percent-clipped=29.0 2023-09-28 11:46:00,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:46:00,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:46:00,665 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:46:02,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-28 11:46:04,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:46:07,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:46:08,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:46:10,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:46:10,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-28 11:46:12,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 11:46:13,618 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:46:13,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:46:14,968 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.55 vs. limit=13.05 2023-09-28 11:46:17,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:46:20,262 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:46:22,618 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.01 vs. limit=8.733333333333334 2023-09-28 11:46:23,683 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:46:27,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-28 11:46:28,901 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-28 11:46:29,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:46:29,213 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=7466.666666666667, ans=0.15000000000000002 2023-09-28 11:46:32,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:46:32,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:46:34,600 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:46:34,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-28 11:46:34,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:46:34,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:46:36,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:46:36,414 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:46:37,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-28 11:46:39,971 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=7533.333333333333, ans=0.03527777777777778 2023-09-28 11:46:42,993 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:46:43,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-28 11:46:45,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:46:51,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:46:54,449 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-28 11:46:54,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:46:56,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:46:58,150 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=7600.0, ans=0.14375 2023-09-28 11:46:59,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:46:59,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:47:00,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-28 11:47:03,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:47:04,497 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:47:04,951 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=7600.0, ans=0.634 2023-09-28 11:47:06,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-28 11:47:06,740 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:47:08,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-28 11:47:09,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:47:09,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:47:09,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:47:12,124 INFO [train.py:1039] (2/4) Epoch 1, batch 1150, loss[loss=0.5455, simple_loss=0.5031, pruned_loss=0.3028, over 24366.00 frames. ], tot_loss[loss=0.6088, simple_loss=0.5294, pruned_loss=0.4036, over 4692158.57 frames. ], batch size: 74, lr: 4.47e-02, grad_scale: 4.0 2023-09-28 11:47:15,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:47:18,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:47:20,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:47:21,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:47:21,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-28 11:47:21,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:47:23,981 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=7666.666666666667, ans=0.034722222222222224 2023-09-28 11:47:25,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-28 11:47:26,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:47:26,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:47:28,882 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=10.53 vs. limit=8.866666666666667 2023-09-28 11:47:32,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-28 11:47:32,902 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=7733.333333333333, ans=0.034444444444444444 2023-09-28 11:47:32,932 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=7733.333333333333, ans=0.1375 2023-09-28 11:47:35,305 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=3.82 vs. limit=7.093333333333334 2023-09-28 11:47:35,816 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:47:39,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:47:41,765 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:47:41,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-28 11:47:43,193 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:47:43,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:47:43,458 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=7800.0, ans=0.034166666666666665 2023-09-28 11:47:43,506 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=7800.0, ans=0.034166666666666665 2023-09-28 11:47:46,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-28 11:47:48,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:47:51,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:48:00,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:48:00,582 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=7866.666666666667, ans=0.6246666666666667 2023-09-28 11:48:02,465 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=5.28 vs. limit=10.45 2023-09-28 11:48:03,556 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=7866.666666666667, ans=0.0 2023-09-28 11:48:07,801 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:48:07,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-28 11:48:09,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:48:09,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:48:16,058 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-28 11:48:18,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:48:18,892 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.78 vs. limit=13.45 2023-09-28 11:48:26,723 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-28 11:48:29,997 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:48:32,894 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:48:32,946 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:48:33,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:48:33,801 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.09 vs. limit=10.5 2023-09-28 11:48:34,809 INFO [train.py:1039] (2/4) Epoch 1, batch 1200, loss[loss=0.5204, simple_loss=0.4805, pruned_loss=0.2875, over 24538.00 frames. ], tot_loss[loss=0.5884, simple_loss=0.5158, pruned_loss=0.3798, over 4709132.53 frames. ], batch size: 71, lr: 4.47e-02, grad_scale: 8.0 2023-09-28 11:48:37,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:48:41,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:48:41,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:48:42,663 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.08 vs. limit=4.2 2023-09-28 11:48:43,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:48:43,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:48:44,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:48:46,294 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.850e+02 4.760e+02 7.806e+02 1.164e+03 2.947e+03, threshold=1.561e+03, percent-clipped=14.0 2023-09-28 11:48:46,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:48:47,999 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:48:50,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:48:50,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:48:51,837 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-28 11:48:54,360 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-28 11:49:00,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:49:01,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:49:03,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:49:06,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:49:06,747 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-28 11:49:08,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:49:12,126 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=8133.333333333333, ans=0.125 2023-09-28 11:49:18,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:49:18,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:49:18,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-28 11:49:19,806 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:49:21,820 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 11:49:23,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-28 11:49:24,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-28 11:49:26,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:49:28,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:49:28,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:49:30,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-28 11:49:32,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:49:32,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:49:34,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:49:34,143 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-28 11:49:35,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:49:35,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:49:37,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 11:49:38,594 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:49:38,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:49:44,096 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-28 11:49:44,964 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.30 vs. limit=10.6 2023-09-28 11:49:45,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:49:48,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-28 11:49:53,400 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-28 11:49:55,150 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:49:58,117 INFO [train.py:1039] (2/4) Epoch 1, batch 1250, loss[loss=0.4423, simple_loss=0.413, pruned_loss=0.239, over 24414.00 frames. ], tot_loss[loss=0.5736, simple_loss=0.5056, pruned_loss=0.3622, over 4717727.59 frames. ], batch size: 58, lr: 4.47e-02, grad_scale: 4.0 2023-09-28 11:49:58,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:49:59,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:50:01,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:50:04,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-28 11:50:08,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:50:09,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:50:09,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-28 11:50:11,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:50:12,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:50:15,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:50:18,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:50:19,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:50:19,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:50:21,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:50:26,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 11:50:26,065 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:50:26,073 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:50:27,713 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:50:29,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:50:30,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:50:32,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-28 11:50:38,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-28 11:50:38,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:50:41,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:50:41,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-28 11:50:41,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:50:42,832 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-28 11:50:42,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:50:42,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:50:48,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:50:51,295 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=8533.333333333334, ans=0.6013333333333334 2023-09-28 11:50:52,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:50:52,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:50:54,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-28 11:50:54,294 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-28 11:50:55,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-28 11:50:58,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:51:00,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-28 11:51:00,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:51:00,272 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=8533.333333333334, ans=0.12106666666666667 2023-09-28 11:51:04,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-28 11:51:04,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:51:07,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-28 11:51:07,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:51:07,883 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:51:10,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-28 11:51:10,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:51:10,553 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=8600.0, ans=0.329 2023-09-28 11:51:12,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-28 11:51:13,876 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:51:17,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:51:18,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:51:20,555 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:51:21,993 INFO [train.py:1039] (2/4) Epoch 1, batch 1300, loss[loss=0.506, simple_loss=0.4563, pruned_loss=0.2891, over 23306.00 frames. ], tot_loss[loss=0.5586, simple_loss=0.4961, pruned_loss=0.3446, over 4715675.68 frames. ], batch size: 93, lr: 4.47e-02, grad_scale: 8.0 2023-09-28 11:51:23,028 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.05 vs. limit=9.333333333333332 2023-09-28 11:51:23,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:51:23,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-28 11:51:30,464 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:51:31,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:51:32,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:51:34,988 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.019e+02 3.707e+02 6.388e+02 1.142e+03 3.121e+03, threshold=1.278e+03, percent-clipped=13.0 2023-09-28 11:51:35,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:51:38,103 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:51:38,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-28 11:51:43,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:51:45,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:51:47,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-28 11:51:50,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:51:55,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:51:56,004 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:51:57,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:51:58,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:52:01,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:52:01,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-28 11:52:01,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-28 11:52:02,802 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=8800.0, ans=0.008956521739130436 2023-09-28 11:52:04,658 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=8800.0, ans=0.125 2023-09-28 11:52:08,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:52:08,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:52:10,161 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-28 11:52:10,239 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:52:11,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:52:14,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:52:15,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-28 11:52:15,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=8866.666666666666, ans=0.5896666666666668 2023-09-28 11:52:16,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:52:16,484 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-28 11:52:20,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:52:21,563 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.63 vs. limit=10.825 2023-09-28 11:52:22,435 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:52:22,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:52:25,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-28 11:52:27,785 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-28 11:52:29,341 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-28 11:52:32,619 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:52:36,149 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-28 11:52:39,086 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:52:45,142 INFO [train.py:1039] (2/4) Epoch 1, batch 1350, loss[loss=0.5129, simple_loss=0.477, pruned_loss=0.2783, over 24082.00 frames. ], tot_loss[loss=0.5441, simple_loss=0.4862, pruned_loss=0.3291, over 4724511.69 frames. ], batch size: 80, lr: 4.46e-02, grad_scale: 4.0 2023-09-28 11:52:46,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-28 11:52:48,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=9000.0, ans=0.02916666666666667 2023-09-28 11:52:48,512 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=9000.0, ans=0.335 2023-09-28 11:52:49,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:52:51,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:52:55,331 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:52:55,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:52:56,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:52:58,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:53:03,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:53:05,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-28 11:53:05,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:53:07,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:53:10,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-28 11:53:11,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:53:11,988 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=9066.666666666666, ans=0.125 2023-09-28 11:53:13,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:53:13,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-28 11:53:16,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-28 11:53:17,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-28 11:53:19,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:53:19,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-28 11:53:30,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:53:39,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:53:39,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:53:41,255 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-28 11:53:42,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:53:45,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-28 11:53:45,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:53:46,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:53:49,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:53:50,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-28 11:53:53,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:53:58,441 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=9266.666666666666, ans=0.025 2023-09-28 11:53:58,959 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.57 vs. limit=7.316666666666666 2023-09-28 11:54:00,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-28 11:54:02,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-28 11:54:09,173 INFO [train.py:1039] (2/4) Epoch 1, batch 1400, loss[loss=0.4665, simple_loss=0.4339, pruned_loss=0.2527, over 23354.00 frames. ], tot_loss[loss=0.5274, simple_loss=0.4743, pruned_loss=0.3133, over 4700427.41 frames. ], batch size: 105, lr: 4.46e-02, grad_scale: 8.0 2023-09-28 11:54:09,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-28 11:54:11,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:54:16,022 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:54:16,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:54:21,743 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=9.59 vs. limit=11.0 2023-09-28 11:54:22,259 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-28 11:54:23,716 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.036e+02 3.566e+02 5.835e+02 9.354e+02 4.572e+03, threshold=1.167e+03, percent-clipped=13.0 2023-09-28 11:54:23,852 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-28 11:54:32,510 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=9400.0, ans=0.027500000000000004 2023-09-28 11:54:33,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:54:35,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:54:36,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:54:36,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:54:40,879 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:54:42,465 WARNING [train.py:1197] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 11:54:46,503 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=9466.666666666666, ans=0.0 2023-09-28 11:54:49,650 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=9466.666666666666, ans=0.125 2023-09-28 11:54:52,324 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:54:52,422 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:54:53,120 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.52 vs. limit=11.05 2023-09-28 11:54:57,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-28 11:54:58,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:54:58,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:54:59,084 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=9533.333333333334, ans=0.125 2023-09-28 11:55:00,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:55:00,415 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:55:02,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:55:02,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:55:02,149 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:55:02,801 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=9533.333333333334, ans=7.383333333333334 2023-09-28 11:55:05,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-28 11:55:05,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:55:09,193 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=7.14 vs. limit=7.383333333333334 2023-09-28 11:55:10,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:55:14,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:55:17,905 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=9600.0, ans=0.02666666666666667 2023-09-28 11:55:23,186 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.10 vs. limit=7.4 2023-09-28 11:55:23,954 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-28 11:55:25,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:55:25,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:55:28,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 11:55:30,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:55:31,861 INFO [train.py:1039] (2/4) Epoch 1, batch 1450, loss[loss=0.4228, simple_loss=0.4095, pruned_loss=0.2149, over 24470.00 frames. ], tot_loss[loss=0.514, simple_loss=0.4649, pruned_loss=0.3004, over 4696128.98 frames. ], batch size: 58, lr: 4.46e-02, grad_scale: 8.0 2023-09-28 11:55:31,966 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:55:35,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:55:36,817 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:55:36,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:55:36,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-28 11:55:42,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:55:44,335 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:55:44,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:55:44,547 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-28 11:55:46,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:55:48,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-28 11:55:50,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:55:50,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:55:50,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-28 11:55:52,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:55:54,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:55:56,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 11:55:56,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:55:57,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:55:59,230 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:56:00,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:56:04,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:56:04,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:56:07,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:56:07,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:56:08,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:56:08,995 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:56:10,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:56:10,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:56:13,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-28 11:56:18,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:56:21,166 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-28 11:56:23,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:56:25,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:56:27,086 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:56:29,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-28 11:56:33,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:56:35,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-28 11:56:35,630 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=9866.666666666666, ans=0.008724637681159421 2023-09-28 11:56:36,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-28 11:56:38,633 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:56:41,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:56:41,958 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:56:43,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-28 11:56:45,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-28 11:56:45,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-28 11:56:46,814 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:56:48,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:56:54,947 INFO [train.py:1039] (2/4) Epoch 1, batch 1500, loss[loss=0.4756, simple_loss=0.4355, pruned_loss=0.2624, over 23669.00 frames. ], tot_loss[loss=0.5046, simple_loss=0.4593, pruned_loss=0.2901, over 4692259.66 frames. ], batch size: 256, lr: 4.46e-02, grad_scale: 8.0 2023-09-28 11:56:59,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-28 11:56:59,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:56:59,394 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:57:00,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:57:02,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:57:02,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:57:04,706 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-28 11:57:06,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:57:06,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-28 11:57:06,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:57:07,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:57:08,847 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.22 vs. limit=4.5 2023-09-28 11:57:10,819 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.048e+02 3.499e+02 5.909e+02 9.288e+02 2.563e+03, threshold=1.182e+03, percent-clipped=18.0 2023-09-28 11:57:10,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:57:11,256 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=10066.666666666666, ans=0.125 2023-09-28 11:57:12,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:57:13,092 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.15 vs. limit=11.275 2023-09-28 11:57:14,374 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=10066.666666666666, ans=0.024722222222222225 2023-09-28 11:57:18,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:57:18,756 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-28 11:57:18,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:57:18,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:57:20,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:57:23,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-28 11:57:26,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-28 11:57:28,219 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:57:28,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-28 11:57:29,130 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=10133.333333333334, ans=0.125 2023-09-28 11:57:29,239 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=10133.333333333334, ans=0.0 2023-09-28 11:57:32,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-28 11:57:35,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:57:37,428 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:57:37,449 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:57:39,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-28 11:57:39,076 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:57:39,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:57:41,253 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-28 11:57:42,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:57:46,546 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=10200.0, ans=0.543 2023-09-28 11:57:47,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:57:47,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-28 11:57:53,532 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:57:53,915 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=10200.0, ans=0.125 2023-09-28 11:57:55,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:57:59,718 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-28 11:58:01,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:58:01,200 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-28 11:58:01,820 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.13 vs. limit=11.35 2023-09-28 11:58:02,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:58:04,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:58:06,027 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-28 11:58:06,160 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:58:09,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-28 11:58:11,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:58:13,827 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=7.20 vs. limit=7.566666666666666 2023-09-28 11:58:16,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:58:16,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:58:18,405 INFO [train.py:1039] (2/4) Epoch 1, batch 1550, loss[loss=0.4547, simple_loss=0.4244, pruned_loss=0.2442, over 23862.00 frames. ], tot_loss[loss=0.4947, simple_loss=0.4534, pruned_loss=0.2799, over 4709515.07 frames. ], batch size: 195, lr: 4.45e-02, grad_scale: 8.0 2023-09-28 11:58:18,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:58:18,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:58:18,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:58:20,307 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-28 11:58:20,472 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=10333.333333333334, ans=0.023611111111111107 2023-09-28 11:58:21,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-28 11:58:21,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:58:23,387 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-28 11:58:23,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-28 11:58:25,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:58:26,570 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:58:26,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:58:26,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:58:29,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:58:29,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:58:31,377 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-28 11:58:32,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:58:32,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:58:32,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:58:33,691 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.75 vs. limit=10.2 2023-09-28 11:58:36,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:58:36,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-28 11:58:36,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=10400.0, ans=0.536 2023-09-28 11:58:37,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:58:37,607 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-28 11:58:39,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-28 11:58:39,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-28 11:58:39,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:58:42,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:58:45,372 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=10400.0, ans=0.125 2023-09-28 11:58:47,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:58:49,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-28 11:58:49,727 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-28 11:58:58,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:59:02,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:59:02,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:59:02,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:59:02,478 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=10466.666666666666, ans=0.5336666666666667 2023-09-28 11:59:03,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-28 11:59:09,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:59:11,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:59:14,175 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.20 vs. limit=11.45 2023-09-28 11:59:15,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:59:18,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:59:18,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:59:20,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-28 11:59:20,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:59:20,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:59:21,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:59:23,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-28 11:59:23,313 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-28 11:59:25,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:59:28,050 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.31 vs. limit=7.65 2023-09-28 11:59:31,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-28 11:59:36,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:59:38,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:59:38,369 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=10600.0, ans=0.022500000000000003 2023-09-28 11:59:39,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-28 11:59:41,347 INFO [train.py:1039] (2/4) Epoch 1, batch 1600, loss[loss=0.4548, simple_loss=0.4466, pruned_loss=0.2278, over 24654.00 frames. ], tot_loss[loss=0.487, simple_loss=0.4495, pruned_loss=0.2716, over 4712566.16 frames. ], batch size: 73, lr: 4.45e-02, grad_scale: 16.0 2023-09-28 11:59:43,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:59:44,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:59:44,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:59:44,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:59:44,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:59:45,419 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.64 vs. limit=15.5 2023-09-28 11:59:49,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:59:50,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-28 11:59:51,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-28 11:59:55,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-28 11:59:56,701 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:59:58,015 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.152e+02 3.597e+02 5.871e+02 8.452e+02 2.438e+03, threshold=1.174e+03, percent-clipped=11.0 2023-09-28 11:59:58,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-28 11:59:58,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:00:02,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:00:04,263 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=10733.333333333334, ans=0.07 2023-09-28 12:00:05,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:00:08,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-28 12:00:11,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:00:13,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-28 12:00:13,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:00:14,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-28 12:00:18,611 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.13 vs. limit=11.55 2023-09-28 12:00:19,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-28 12:00:19,984 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=10800.0, ans=0.125 2023-09-28 12:00:28,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:00:28,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-28 12:00:28,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:00:30,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:00:30,150 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:00:30,411 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=10866.666666666666, ans=0.125 2023-09-28 12:00:35,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-28 12:00:40,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 12:00:40,305 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:00:41,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:00:41,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:00:43,489 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:00:45,141 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:00:45,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:00:47,387 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.93 vs. limit=15.7 2023-09-28 12:00:48,164 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:00:51,990 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.93 vs. limit=11.6 2023-09-28 12:00:55,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:00:55,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:00:57,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-28 12:00:57,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:00:59,408 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-28 12:00:59,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=10933.333333333334, ans=10.0 2023-09-28 12:01:04,475 INFO [train.py:1039] (2/4) Epoch 1, batch 1650, loss[loss=0.4893, simple_loss=0.4486, pruned_loss=0.268, over 23754.00 frames. ], tot_loss[loss=0.4815, simple_loss=0.447, pruned_loss=0.2652, over 4712826.87 frames. ], batch size: 164, lr: 4.45e-02, grad_scale: 8.0 2023-09-28 12:01:04,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:01:08,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:01:08,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:01:08,329 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-28 12:01:08,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-28 12:01:08,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-28 12:01:08,650 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=11000.0, ans=0.125 2023-09-28 12:01:09,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-28 12:01:14,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:01:16,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:01:16,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:01:16,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-28 12:01:18,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:01:21,212 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-28 12:01:22,923 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:01:22,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:01:22,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:01:24,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 12:01:24,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-28 12:01:25,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-28 12:01:33,976 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 12:01:34,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-28 12:01:42,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-28 12:01:43,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:01:47,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-28 12:01:47,615 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=11133.333333333334, ans=0.5103333333333333 2023-09-28 12:01:48,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:01:50,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:01:50,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:01:52,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:01:53,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:01:55,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:01:59,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:01:59,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:01:59,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:02:01,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:02:01,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:02:03,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:02:03,805 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=11200.0, ans=0.125 2023-09-28 12:02:05,744 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.49 vs. limit=11.7 2023-09-28 12:02:06,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:02:06,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-28 12:02:08,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:02:08,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-28 12:02:08,940 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.85 vs. limit=10.6 2023-09-28 12:02:09,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-28 12:02:09,902 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-28 12:02:11,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:02:12,198 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=11266.666666666666, ans=10.0 2023-09-28 12:02:13,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:02:13,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:02:13,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:02:13,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-28 12:02:18,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:02:20,358 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:02:20,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:02:24,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-28 12:02:28,794 INFO [train.py:1039] (2/4) Epoch 1, batch 1700, loss[loss=0.4464, simple_loss=0.4132, pruned_loss=0.2413, over 23858.00 frames. ], tot_loss[loss=0.4731, simple_loss=0.4414, pruned_loss=0.258, over 4705515.93 frames. ], batch size: 195, lr: 4.44e-02, grad_scale: 8.0 2023-09-28 12:02:29,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:02:29,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:02:29,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-28 12:02:30,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:02:30,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:02:30,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:02:33,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:02:33,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:02:33,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-28 12:02:37,552 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 12:02:45,392 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.253e+02 3.835e+02 6.904e+02 1.046e+03 2.238e+03, threshold=1.381e+03, percent-clipped=16.0 2023-09-28 12:02:45,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:02:49,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:02:56,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-28 12:02:57,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:02:59,087 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:02:59,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:03:01,044 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=11466.666666666666, ans=0.125 2023-09-28 12:03:02,273 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-28 12:03:05,338 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:03:05,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:03:06,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-28 12:03:08,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-28 12:03:10,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-28 12:03:10,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-28 12:03:12,350 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:03:13,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-28 12:03:15,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:03:24,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:03:24,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:03:24,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:03:26,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-28 12:03:26,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-28 12:03:27,309 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=15.12 vs. limit=16.15 2023-09-28 12:03:27,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:03:29,586 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:03:29,587 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-28 12:03:29,769 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=11533.333333333334, ans=0.018611111111111106 2023-09-28 12:03:31,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:03:31,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:03:31,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:03:31,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:03:34,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:03:34,686 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:03:36,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:03:36,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:03:36,314 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:03:41,134 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:03:41,916 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-28 12:03:44,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:03:46,157 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:03:47,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-28 12:03:52,710 INFO [train.py:1039] (2/4) Epoch 1, batch 1750, loss[loss=0.4272, simple_loss=0.4316, pruned_loss=0.2078, over 23809.00 frames. ], tot_loss[loss=0.4634, simple_loss=0.4355, pruned_loss=0.2498, over 4706593.50 frames. ], batch size: 85, lr: 4.44e-02, grad_scale: 8.0 2023-09-28 12:03:56,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:03:58,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:03:58,670 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=11666.666666666666, ans=0.4916666666666667 2023-09-28 12:03:59,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-28 12:04:01,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-28 12:04:01,388 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:04:04,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:04:05,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:04:08,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-28 12:04:11,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:04:12,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-28 12:04:14,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:04:15,430 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=11733.333333333334, ans=0.008318840579710145 2023-09-28 12:04:16,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:04:19,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 12:04:21,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-28 12:04:22,631 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:04:22,692 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-28 12:04:30,106 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.75 vs. limit=11.925 2023-09-28 12:04:33,189 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-28 12:04:34,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:04:34,878 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:04:39,403 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:04:39,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:04:41,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:04:43,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:04:46,210 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:04:47,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:04:47,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-28 12:04:49,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:04:51,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-28 12:04:53,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:04:53,753 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.62 vs. limit=11.95 2023-09-28 12:04:54,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:04:56,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:05:01,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 12:05:01,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-28 12:05:03,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:05:06,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:05:08,376 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=11933.333333333334, ans=0.125 2023-09-28 12:05:11,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:05:13,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:05:15,277 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:05:15,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-28 12:05:15,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:05:17,318 INFO [train.py:1039] (2/4) Epoch 1, batch 1800, loss[loss=0.4282, simple_loss=0.4305, pruned_loss=0.2103, over 23900.00 frames. ], tot_loss[loss=0.4556, simple_loss=0.4314, pruned_loss=0.2428, over 4705747.40 frames. ], batch size: 86, lr: 4.44e-02, grad_scale: 8.0 2023-09-28 12:05:17,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-28 12:05:17,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:05:17,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:05:17,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:05:17,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-28 12:05:20,880 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:05:22,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:05:24,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 12:05:27,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:05:30,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 12:05:32,124 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:05:33,418 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.239e+02 3.495e+02 5.189e+02 7.461e+02 1.869e+03, threshold=1.038e+03, percent-clipped=4.0 2023-09-28 12:05:35,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:05:36,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:05:39,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:05:39,296 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=12066.666666666666, ans=0.125 2023-09-28 12:05:41,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:05:42,821 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:05:42,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-28 12:05:44,245 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:05:46,085 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=12066.666666666666, ans=0.125 2023-09-28 12:05:48,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:05:52,775 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-28 12:05:54,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-28 12:05:54,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-28 12:05:54,899 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=16.98 vs. limit=16.6 2023-09-28 12:05:55,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:05:55,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:05:55,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:05:57,438 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:05:58,306 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=12133.333333333334, ans=0.47533333333333333 2023-09-28 12:06:05,676 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-28 12:06:07,180 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-28 12:06:08,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:06:10,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-28 12:06:10,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-28 12:06:10,631 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=12200.0, ans=0.015833333333333338 2023-09-28 12:06:11,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-28 12:06:14,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:06:14,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 12:06:19,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-28 12:06:19,734 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=12200.0, ans=0.125 2023-09-28 12:06:27,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:06:29,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-28 12:06:29,355 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:06:29,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:06:29,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:06:29,717 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=12266.666666666666, ans=0.008202898550724638 2023-09-28 12:06:30,871 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-28 12:06:33,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:06:34,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:06:34,987 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=12266.666666666666, ans=0.125 2023-09-28 12:06:37,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-28 12:06:37,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:06:39,412 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:06:40,797 INFO [train.py:1039] (2/4) Epoch 1, batch 1850, loss[loss=0.4193, simple_loss=0.4314, pruned_loss=0.201, over 24625.00 frames. ], tot_loss[loss=0.4508, simple_loss=0.4292, pruned_loss=0.2382, over 4707683.64 frames. ], batch size: 68, lr: 4.43e-02, grad_scale: 8.0 2023-09-28 12:06:40,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-28 12:06:40,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:06:42,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:06:43,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 12:06:44,172 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:06:44,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:06:48,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:06:48,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:06:56,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:06:56,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-28 12:07:00,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-28 12:07:04,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-28 12:07:07,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:07:07,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-28 12:07:07,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 12:07:08,080 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=12400.0, ans=0.125 2023-09-28 12:07:15,118 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=12466.666666666666, ans=0.4636666666666667 2023-09-28 12:07:17,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:07:19,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-28 12:07:22,118 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=12466.666666666666, ans=0.008159420289855073 2023-09-28 12:07:23,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:07:23,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:07:23,582 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-28 12:07:29,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-28 12:07:29,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:07:29,911 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 12:07:31,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:07:34,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:07:37,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:07:40,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:07:40,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:07:40,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 12:07:40,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:07:43,241 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:07:43,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:07:45,229 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=12533.333333333334, ans=0.17466666666666666 2023-09-28 12:07:46,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-28 12:07:47,341 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:07:51,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-28 12:07:53,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 12:07:53,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-28 12:07:53,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-28 12:07:55,370 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-28 12:07:55,494 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-28 12:07:57,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 12:07:57,808 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:07:59,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:07:59,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:08:00,560 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-28 12:08:00,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:08:01,910 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:08:03,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-28 12:08:04,754 INFO [train.py:1039] (2/4) Epoch 1, batch 1900, loss[loss=0.4435, simple_loss=0.439, pruned_loss=0.2231, over 24035.00 frames. ], tot_loss[loss=0.4442, simple_loss=0.426, pruned_loss=0.2326, over 4711239.84 frames. ], batch size: 86, lr: 4.43e-02, grad_scale: 8.0 2023-09-28 12:08:04,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 12:08:06,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:08:06,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-28 12:08:08,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:08:08,112 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-28 12:08:08,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:08:09,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:08:16,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:08:16,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:08:17,979 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-28 12:08:18,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-28 12:08:20,940 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.193e+02 3.536e+02 5.623e+02 9.146e+02 3.125e+03, threshold=1.125e+03, percent-clipped=17.0 2023-09-28 12:08:21,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:08:21,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:08:21,221 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-28 12:08:23,261 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-28 12:08:29,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-28 12:08:31,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:08:36,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-28 12:08:37,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-28 12:08:39,909 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=12800.0, ans=0.172 2023-09-28 12:08:48,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-28 12:08:48,524 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=12800.0, ans=0.172 2023-09-28 12:08:51,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-28 12:08:51,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:08:52,137 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.43 vs. limit=12.3 2023-09-28 12:08:52,827 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-28 12:08:52,844 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-28 12:08:52,906 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-28 12:08:54,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-28 12:08:54,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:08:57,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-28 12:09:00,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:09:05,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:09:05,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-28 12:09:08,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:09:08,919 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=12866.666666666666, ans=0.013055555555555563 2023-09-28 12:09:13,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-28 12:09:13,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-28 12:09:19,611 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=12933.333333333334, ans=0.125 2023-09-28 12:09:20,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:09:20,766 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:09:20,803 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:09:20,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:09:24,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 12:09:24,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-28 12:09:24,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:09:27,550 INFO [train.py:1039] (2/4) Epoch 1, batch 1950, loss[loss=0.4105, simple_loss=0.4042, pruned_loss=0.208, over 24437.00 frames. ], tot_loss[loss=0.4367, simple_loss=0.423, pruned_loss=0.226, over 4726749.72 frames. ], batch size: 58, lr: 4.43e-02, grad_scale: 8.0 2023-09-28 12:09:27,620 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:09:27,622 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:09:30,648 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:09:30,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:09:30,737 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-28 12:09:32,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:09:34,145 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:09:34,375 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=13000.0, ans=0.125 2023-09-28 12:09:37,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:09:37,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:09:37,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 12:09:42,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-28 12:09:42,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 12:09:42,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:09:44,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:09:47,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:09:47,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:09:47,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:09:50,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:09:53,671 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:09:53,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 12:09:53,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:09:53,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:09:54,096 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=13066.666666666666, ans=0.125 2023-09-28 12:09:57,531 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.12 vs. limit=17.3 2023-09-28 12:09:58,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:10:00,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:10:00,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:10:00,745 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 12:10:01,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-28 12:10:01,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-28 12:10:03,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 12:10:03,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:10:03,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:10:08,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:10:11,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:10:13,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 12:10:17,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:10:17,869 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=13200.0, ans=0.008 2023-09-28 12:10:19,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-28 12:10:19,214 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-28 12:10:20,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:10:25,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:10:25,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:10:26,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-28 12:10:34,908 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:10:36,997 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:10:38,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:10:40,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:10:42,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:10:42,363 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:10:43,838 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-28 12:10:43,846 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 12:10:45,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:10:47,937 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-28 12:10:49,683 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=13333.333333333334, ans=0.125 2023-09-28 12:10:50,177 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.31 vs. limit=12.5 2023-09-28 12:10:51,432 INFO [train.py:1039] (2/4) Epoch 1, batch 2000, loss[loss=0.5179, simple_loss=0.4696, pruned_loss=0.2831, over 19834.00 frames. ], tot_loss[loss=0.4326, simple_loss=0.4212, pruned_loss=0.2226, over 4732247.91 frames. ], batch size: 388, lr: 4.42e-02, grad_scale: 16.0 2023-09-28 12:10:51,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:10:56,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-28 12:10:56,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:10:57,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:10:57,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:11:00,866 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:11:03,174 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=22.66 vs. limit=17.5 2023-09-28 12:11:05,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-28 12:11:05,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-28 12:11:06,286 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=20.63 vs. limit=17.55 2023-09-28 12:11:06,956 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.094e+02 3.925e+02 5.056e+02 7.202e+02 2.152e+03, threshold=1.011e+03, percent-clipped=10.0 2023-09-28 12:11:08,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:11:10,766 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-28 12:11:12,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 12:11:12,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:11:15,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:11:15,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-28 12:11:16,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:11:17,380 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=13400.0, ans=0.125 2023-09-28 12:11:18,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:11:18,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:11:20,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-28 12:11:20,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 12:11:21,638 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=13400.0, ans=0.43100000000000005 2023-09-28 12:11:22,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-28 12:11:22,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:11:28,149 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:11:29,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-28 12:11:29,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:11:31,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:11:32,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:11:32,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-28 12:11:35,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-28 12:11:35,925 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:11:35,937 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:11:40,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:11:42,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:11:42,878 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 12:11:44,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:11:46,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:11:46,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:11:47,662 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 12:11:47,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:11:48,027 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=13533.333333333334, ans=0.07 2023-09-28 12:11:49,255 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:11:52,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:11:52,650 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=13533.333333333334, ans=0.125 2023-09-28 12:11:53,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-28 12:12:01,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 12:12:03,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:12:05,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:12:05,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:12:09,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:12:11,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:12:11,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:12:11,576 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=13600.0, ans=0.16399999999999998 2023-09-28 12:12:12,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 12:12:12,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:12:14,474 INFO [train.py:1039] (2/4) Epoch 1, batch 2050, loss[loss=0.4075, simple_loss=0.4306, pruned_loss=0.1922, over 24311.00 frames. ], tot_loss[loss=0.4246, simple_loss=0.4169, pruned_loss=0.2167, over 4734357.76 frames. ], batch size: 74, lr: 4.42e-02, grad_scale: 16.0 2023-09-28 12:12:14,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:12:15,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:12:19,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:12:19,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:12:24,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:12:26,242 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:12:26,316 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:12:27,775 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:12:31,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-28 12:12:31,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:12:34,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:12:34,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-28 12:12:34,906 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=13733.333333333334, ans=0.125 2023-09-28 12:12:42,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-28 12:12:42,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:12:43,895 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-28 12:12:44,153 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=13733.333333333334, ans=0.125 2023-09-28 12:12:46,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:12:49,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-28 12:12:50,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-28 12:12:53,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:12:55,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:12:55,192 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-28 12:12:55,535 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=13800.0, ans=0.00916666666666667 2023-09-28 12:12:56,664 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:12:56,859 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:12:58,421 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:12:59,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:13:00,538 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=17.29 vs. limit=12.675 2023-09-28 12:13:03,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:13:03,948 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 12:13:05,880 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 12:13:07,564 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:13:09,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:13:12,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 12:13:20,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:13:20,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-28 12:13:26,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:13:28,113 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:13:29,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:13:32,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-28 12:13:34,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=14000.0, ans=0.035 2023-09-28 12:13:35,616 INFO [train.py:1039] (2/4) Epoch 1, batch 2100, loss[loss=0.4194, simple_loss=0.4067, pruned_loss=0.2161, over 23686.00 frames. ], tot_loss[loss=0.4184, simple_loss=0.4123, pruned_loss=0.2126, over 4712644.04 frames. ], batch size: 149, lr: 4.42e-02, grad_scale: 16.0 2023-09-28 12:13:37,976 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-28 12:13:37,977 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:13:38,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:13:38,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 12:13:38,939 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.29 vs. limit=12.75 2023-09-28 12:13:39,667 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:13:39,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-28 12:13:41,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-28 12:13:43,495 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 12:13:47,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:13:47,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:13:49,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:13:49,907 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=14000.0, ans=0.125 2023-09-28 12:13:51,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:13:51,207 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-28 12:13:52,522 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.170e+02 3.843e+02 5.173e+02 8.078e+02 2.053e+03, threshold=1.035e+03, percent-clipped=17.0 2023-09-28 12:13:52,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:13:52,851 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-28 12:13:52,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-28 12:13:54,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:13:54,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:13:54,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-28 12:13:56,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 12:14:01,460 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-28 12:14:01,462 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 12:14:04,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:14:04,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:14:08,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:14:08,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-28 12:14:09,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:14:09,986 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 12:14:12,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-28 12:14:12,305 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:14:13,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-28 12:14:13,732 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-28 12:14:13,804 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-28 12:14:16,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-28 12:14:19,773 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:14:21,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 12:14:23,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 12:14:24,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:14:26,580 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=14200.0, ans=0.125 2023-09-28 12:14:27,621 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:14:27,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-28 12:14:27,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:14:27,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:14:27,947 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=14200.0, ans=0.0075 2023-09-28 12:14:29,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:14:29,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-28 12:14:31,176 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-28 12:14:32,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-28 12:14:37,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:14:39,062 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:14:39,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-28 12:14:46,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:14:48,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:14:49,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:14:49,778 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:14:49,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-28 12:14:51,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 12:14:51,551 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 12:14:52,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:14:52,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-28 12:14:54,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:14:54,396 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:14:56,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-28 12:14:58,963 INFO [train.py:1039] (2/4) Epoch 1, batch 2150, loss[loss=0.4077, simple_loss=0.4266, pruned_loss=0.1944, over 24326.00 frames. ], tot_loss[loss=0.4125, simple_loss=0.4085, pruned_loss=0.2086, over 4706877.98 frames. ], batch size: 77, lr: 4.41e-02, grad_scale: 16.0 2023-09-28 12:14:59,071 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-28 12:14:59,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:15:02,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:15:02,102 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:15:02,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:15:03,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:15:10,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 12:15:10,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:15:11,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:15:13,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:15:13,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:15:13,808 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=14400.0, ans=0.125 2023-09-28 12:15:15,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:15:20,737 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:15:20,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:15:20,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:15:27,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:15:27,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-28 12:15:31,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:15:32,764 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:15:34,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:15:34,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:15:35,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:15:35,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-28 12:15:37,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:15:37,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:15:37,556 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:15:39,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-28 12:15:40,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-28 12:15:40,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:15:42,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:15:42,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 12:15:43,311 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=14466.666666666666, ans=10.0 2023-09-28 12:15:44,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:15:47,553 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:15:47,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:15:49,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:15:49,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-28 12:15:49,256 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-28 12:15:53,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:15:53,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:15:56,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:15:56,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 12:15:56,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:15:59,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:15:59,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-28 12:16:01,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-28 12:16:01,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:16:01,475 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-28 12:16:02,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:16:04,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:16:04,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-28 12:16:04,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:16:05,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-28 12:16:05,915 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-28 12:16:05,915 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-28 12:16:05,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-28 12:16:08,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:16:10,460 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:16:10,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:16:10,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:16:12,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 12:16:13,159 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=4.39 vs. limit=12.975 2023-09-28 12:16:13,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:16:13,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:16:17,913 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=14600.0, ans=0.125 2023-09-28 12:16:20,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:16:20,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-28 12:16:21,243 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=14666.666666666666, ans=0.07 2023-09-28 12:16:22,765 INFO [train.py:1039] (2/4) Epoch 1, batch 2200, loss[loss=0.4859, simple_loss=0.4372, pruned_loss=0.2673, over 19170.00 frames. ], tot_loss[loss=0.4072, simple_loss=0.4069, pruned_loss=0.204, over 4715190.47 frames. ], batch size: 388, lr: 4.41e-02, grad_scale: 16.0 2023-09-28 12:16:23,340 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=14666.666666666666, ans=0.125 2023-09-28 12:16:24,707 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:16:31,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:16:32,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:16:33,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:16:33,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-28 12:16:35,417 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.23 vs. limit=13.0 2023-09-28 12:16:36,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:16:37,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:16:37,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-28 12:16:39,270 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.312e+02 4.143e+02 6.351e+02 9.037e+02 1.826e+03, threshold=1.270e+03, percent-clipped=17.0 2023-09-28 12:16:41,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-28 12:16:44,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 12:16:50,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-28 12:16:52,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:16:54,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:16:55,455 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=15.88 vs. limit=18.6 2023-09-28 12:16:56,003 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:17:00,625 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:17:00,665 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-28 12:17:06,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-28 12:17:08,139 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:17:08,232 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-28 12:17:11,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:17:13,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:17:16,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:17:17,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:17:20,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-28 12:17:22,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:17:23,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-28 12:17:24,633 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.96 vs. limit=13.075 2023-09-28 12:17:25,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:17:25,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-28 12:17:25,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:17:27,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:17:27,550 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=14933.333333333334, ans=0.125 2023-09-28 12:17:28,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:17:28,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:17:28,834 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:17:30,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-28 12:17:32,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:17:33,938 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 12:17:35,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 12:17:37,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:17:39,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-28 12:17:41,080 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-28 12:17:41,784 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.77 vs. limit=18.7 2023-09-28 12:17:44,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:17:44,938 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-28 12:17:46,353 INFO [train.py:1039] (2/4) Epoch 1, batch 2250, loss[loss=0.33, simple_loss=0.3519, pruned_loss=0.154, over 24415.00 frames. ], tot_loss[loss=0.4052, simple_loss=0.4062, pruned_loss=0.2023, over 4713013.11 frames. ], batch size: 58, lr: 4.40e-02, grad_scale: 16.0 2023-09-28 12:17:46,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-28 12:17:46,506 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-28 12:17:47,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:17:48,054 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-28 12:17:49,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:17:51,291 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-28 12:17:51,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:17:54,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:18:00,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:18:02,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=15066.666666666666, ans=0.14933333333333335 2023-09-28 12:18:03,626 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-28 12:18:05,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:18:07,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 12:18:07,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:18:12,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-28 12:18:12,019 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:18:12,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:18:14,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-28 12:18:14,987 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:18:15,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:18:18,503 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 12:18:23,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:18:24,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 12:18:26,163 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-28 12:18:26,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-28 12:18:27,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:18:31,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:18:32,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:18:34,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:18:36,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:18:36,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:18:39,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:18:39,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:18:45,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:18:47,404 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-28 12:18:52,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 12:18:52,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-28 12:18:53,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:18:59,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 12:19:02,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-28 12:19:02,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-28 12:19:02,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:19:04,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:19:04,689 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=15266.666666666666, ans=0.00755072463768116 2023-09-28 12:19:07,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-28 12:19:08,011 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=15333.333333333334, ans=0.3633333333333333 2023-09-28 12:19:09,224 INFO [train.py:1039] (2/4) Epoch 1, batch 2300, loss[loss=0.3031, simple_loss=0.3307, pruned_loss=0.1378, over 22418.00 frames. ], tot_loss[loss=0.4024, simple_loss=0.4051, pruned_loss=0.2, over 4720901.87 frames. ], batch size: 49, lr: 4.40e-02, grad_scale: 16.0 2023-09-28 12:19:10,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:19:10,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:19:16,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:19:16,511 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:19:20,875 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-28 12:19:24,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:19:27,594 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.211e+02 3.558e+02 5.040e+02 6.600e+02 1.327e+03, threshold=1.008e+03, percent-clipped=3.0 2023-09-28 12:19:30,869 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:19:30,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-28 12:19:32,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:19:32,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:19:32,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-28 12:19:35,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:19:37,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:19:37,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:19:41,811 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:19:43,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:19:48,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:19:53,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:19:53,685 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:19:57,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:20:00,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:20:02,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:20:03,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 12:20:03,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:20:03,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-28 12:20:07,154 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 12:20:07,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:20:07,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=15533.333333333334, ans=0.125 2023-09-28 12:20:08,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:20:08,692 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:20:08,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:20:10,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 12:20:10,251 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-28 12:20:10,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-28 12:20:10,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:20:10,356 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:20:11,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-28 12:20:17,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:20:21,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:20:22,053 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=15600.0, ans=0.125 2023-09-28 12:20:26,530 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:20:26,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:20:28,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-28 12:20:32,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 12:20:32,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:20:33,936 INFO [train.py:1039] (2/4) Epoch 1, batch 2350, loss[loss=0.3604, simple_loss=0.368, pruned_loss=0.1764, over 23687.00 frames. ], tot_loss[loss=0.3996, simple_loss=0.4046, pruned_loss=0.1974, over 4722626.67 frames. ], batch size: 232, lr: 4.40e-02, grad_scale: 16.0 2023-09-28 12:20:34,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 12:20:34,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-28 12:20:34,866 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.25 vs. limit=19.25 2023-09-28 12:20:39,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:20:39,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-28 12:20:45,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-28 12:20:45,789 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=15666.666666666666, ans=0.14333333333333334 2023-09-28 12:20:49,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:20:54,968 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:20:54,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:20:55,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:20:56,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:20:56,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-28 12:20:58,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:21:01,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-28 12:21:04,732 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.65 vs. limit=13.4 2023-09-28 12:21:06,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:21:09,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:21:09,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:21:12,346 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:21:12,516 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-28 12:21:12,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:21:15,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:21:16,983 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:21:17,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:21:21,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:21:22,428 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.52 vs. limit=19.4 2023-09-28 12:21:23,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-28 12:21:23,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:21:26,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:21:26,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:21:28,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-28 12:21:30,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:21:33,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-28 12:21:33,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:21:37,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-28 12:21:37,303 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=15866.666666666666, ans=0.0005555555555555522 2023-09-28 12:21:41,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-28 12:21:41,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:21:41,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-28 12:21:43,264 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-28 12:21:43,295 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-28 12:21:44,075 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=20.40 vs. limit=19.45 2023-09-28 12:21:44,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-28 12:21:47,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:21:53,623 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:21:55,467 INFO [train.py:1039] (2/4) Epoch 1, batch 2400, loss[loss=0.3606, simple_loss=0.3761, pruned_loss=0.1725, over 24305.00 frames. ], tot_loss[loss=0.3974, simple_loss=0.403, pruned_loss=0.196, over 4714967.87 frames. ], batch size: 56, lr: 4.39e-02, grad_scale: 32.0 2023-09-28 12:21:55,856 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=16000.0, ans=0.125 2023-09-28 12:21:59,283 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:21:59,457 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:22:01,041 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-28 12:22:01,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-28 12:22:02,937 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=16000.0, ans=0.125 2023-09-28 12:22:09,021 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 12:22:09,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:22:13,384 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.157e+02 3.788e+02 5.121e+02 7.907e+02 1.984e+03, threshold=1.024e+03, percent-clipped=10.0 2023-09-28 12:22:13,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-28 12:22:13,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:22:15,039 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:22:15,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-28 12:22:21,326 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:22:24,331 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-28 12:22:27,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-28 12:22:32,602 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-28 12:22:37,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:22:38,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:22:43,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:22:45,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-28 12:22:45,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:22:52,472 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:22:53,016 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=16200.0, ans=0.125 2023-09-28 12:22:54,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:22:55,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:22:57,540 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:22:57,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-28 12:22:57,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:22:57,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:22:57,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:22:57,713 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 12:22:57,977 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=16200.0, ans=0.09899494936611666 2023-09-28 12:23:02,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:23:03,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 12:23:03,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-28 12:23:05,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-28 12:23:07,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:23:07,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:23:08,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-28 12:23:09,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-28 12:23:09,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-28 12:23:09,040 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-28 12:23:12,003 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-28 12:23:12,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:23:14,230 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:23:14,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:23:15,587 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-28 12:23:17,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:23:17,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-28 12:23:18,622 INFO [train.py:1039] (2/4) Epoch 1, batch 2450, loss[loss=0.34, simple_loss=0.3635, pruned_loss=0.1582, over 24332.00 frames. ], tot_loss[loss=0.3927, simple_loss=0.3999, pruned_loss=0.1928, over 4710373.79 frames. ], batch size: 56, lr: 4.39e-02, grad_scale: 32.0 2023-09-28 12:23:21,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:23:21,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:23:25,659 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:23:25,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:23:27,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-28 12:23:31,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:23:33,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:23:35,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:23:36,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:23:36,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:23:36,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-28 12:23:37,383 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=13.31 vs. limit=13.2 2023-09-28 12:23:42,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:23:44,552 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 12:23:44,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:23:45,350 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=22.77 vs. limit=19.8 2023-09-28 12:23:49,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-28 12:23:51,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:23:51,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:23:51,467 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=16466.666666666668, ans=0.1353333333333333 2023-09-28 12:23:52,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:23:54,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-28 12:23:56,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:23:57,218 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=16466.666666666668, ans=0.04768333333333333 2023-09-28 12:24:04,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:24:06,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:24:06,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:24:07,731 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:24:07,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:24:09,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:24:09,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-28 12:24:12,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:24:15,042 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:24:18,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:24:18,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:24:22,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:24:24,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-28 12:24:24,728 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:24:26,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:24:26,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-28 12:24:26,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:24:27,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:24:32,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:24:33,071 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=5.57 vs. limit=13.725 2023-09-28 12:24:34,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:24:35,856 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:24:36,293 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=16600.0, ans=0.007260869565217392 2023-09-28 12:24:39,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-28 12:24:40,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:24:42,484 INFO [train.py:1039] (2/4) Epoch 1, batch 2500, loss[loss=0.3945, simple_loss=0.4224, pruned_loss=0.1833, over 24359.00 frames. ], tot_loss[loss=0.39, simple_loss=0.3987, pruned_loss=0.1907, over 4718455.47 frames. ], batch size: 77, lr: 4.38e-02, grad_scale: 32.0 2023-09-28 12:24:47,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:24:49,037 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=16666.666666666668, ans=0.1333333333333333 2023-09-28 12:24:57,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 12:24:58,651 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.092e+02 3.311e+02 4.772e+02 6.840e+02 1.468e+03, threshold=9.543e+02, percent-clipped=7.0 2023-09-28 12:24:58,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:25:00,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:25:00,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-28 12:25:07,144 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=16733.333333333332, ans=0.125 2023-09-28 12:25:08,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 12:25:08,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:25:09,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-28 12:25:09,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 12:25:09,994 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-28 12:25:12,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:25:14,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:25:14,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-28 12:25:14,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:25:14,701 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-28 12:25:16,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:25:20,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:25:22,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:25:24,078 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 12:25:24,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-28 12:25:26,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:25:29,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:25:32,426 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:25:37,537 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:25:39,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:25:42,420 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=16866.666666666668, ans=0.025 2023-09-28 12:25:45,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-28 12:25:45,702 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=16866.666666666668, ans=0.125 2023-09-28 12:25:46,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-28 12:25:47,227 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=16866.666666666668, ans=0.125 2023-09-28 12:25:48,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:25:48,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-28 12:25:50,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:25:50,032 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 12:25:50,204 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-28 12:25:50,205 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-28 12:25:50,214 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-28 12:25:53,305 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=16933.333333333332, ans=0.125 2023-09-28 12:25:54,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:25:56,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-28 12:25:56,058 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-28 12:25:57,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:25:59,633 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-28 12:26:02,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-28 12:26:02,992 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 12:26:06,326 INFO [train.py:1039] (2/4) Epoch 1, batch 2550, loss[loss=0.3257, simple_loss=0.3641, pruned_loss=0.1437, over 24331.00 frames. ], tot_loss[loss=0.3886, simple_loss=0.3982, pruned_loss=0.1896, over 4719890.70 frames. ], batch size: 61, lr: 4.38e-02, grad_scale: 32.0 2023-09-28 12:26:06,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:26:06,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:26:08,190 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:26:09,796 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:26:11,278 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-28 12:26:11,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-28 12:26:11,691 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=17000.0, ans=0.125 2023-09-28 12:26:16,982 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-28 12:26:18,602 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-28 12:26:20,217 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:26:21,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:26:21,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 12:26:23,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 12:26:23,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:26:23,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:26:27,604 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:26:27,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-28 12:26:27,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-28 12:26:27,689 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:26:27,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-28 12:26:41,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:26:42,961 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=17133.333333333332, ans=0.125 2023-09-28 12:26:47,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:26:47,904 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:26:47,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:26:49,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 12:26:52,190 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=17133.333333333332, ans=0.007144927536231884 2023-09-28 12:26:55,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:26:58,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 12:26:58,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 12:26:58,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:26:59,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-28 12:26:59,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-28 12:27:02,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:27:04,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:27:06,221 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=17200.0, ans=0.0 2023-09-28 12:27:09,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:27:09,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-28 12:27:09,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:27:09,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:27:11,678 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-28 12:27:13,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 12:27:14,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:27:21,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:27:23,289 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:27:26,829 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-28 12:27:30,280 INFO [train.py:1039] (2/4) Epoch 1, batch 2600, loss[loss=0.4014, simple_loss=0.4293, pruned_loss=0.1867, over 24454.00 frames. ], tot_loss[loss=0.3874, simple_loss=0.3981, pruned_loss=0.1884, over 4721265.82 frames. ], batch size: 69, lr: 4.37e-02, grad_scale: 16.0 2023-09-28 12:27:31,779 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-28 12:27:31,817 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:27:31,863 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-28 12:27:33,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-28 12:27:33,423 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-28 12:27:35,283 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=17333.333333333332, ans=0.125 2023-09-28 12:27:36,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:27:36,547 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-28 12:27:38,031 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-28 12:27:39,577 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-28 12:27:41,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:27:44,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-28 12:27:45,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-28 12:27:47,478 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.295e+02 3.359e+02 4.665e+02 7.266e+02 2.532e+03, threshold=9.331e+02, percent-clipped=13.0 2023-09-28 12:27:47,662 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-28 12:27:47,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-28 12:27:51,272 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-28 12:27:51,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-28 12:28:01,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:28:01,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:28:01,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:28:01,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-28 12:28:03,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:28:09,681 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-28 12:28:14,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:28:15,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:28:15,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-28 12:28:17,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:28:17,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:28:17,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-28 12:28:21,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-28 12:28:21,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:28:25,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:28:29,242 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-28 12:28:29,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:28:29,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:28:36,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:28:36,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:28:36,819 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-28 12:28:38,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:28:39,885 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:28:41,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:28:46,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-28 12:28:47,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:28:47,819 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 12:28:52,355 INFO [train.py:1039] (2/4) Epoch 1, batch 2650, loss[loss=0.34, simple_loss=0.3807, pruned_loss=0.1496, over 24685.00 frames. ], tot_loss[loss=0.3851, simple_loss=0.3972, pruned_loss=0.1865, over 4717852.60 frames. ], batch size: 65, lr: 4.37e-02, grad_scale: 16.0 2023-09-28 12:28:53,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-28 12:28:53,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:28:54,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 12:28:54,098 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-28 12:28:54,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:28:57,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:29:01,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 12:29:01,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:29:05,047 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:29:06,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-28 12:29:06,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 12:29:06,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:29:09,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-28 12:29:11,437 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-28 12:29:13,255 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=17733.333333333332, ans=0.0 2023-09-28 12:29:14,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:29:16,131 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-28 12:29:16,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:29:16,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-28 12:29:20,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:29:20,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-28 12:29:22,286 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:29:22,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:29:29,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-28 12:29:29,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-28 12:29:32,205 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=17800.0, ans=0.0 2023-09-28 12:29:34,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-28 12:29:37,460 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-28 12:29:37,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:29:39,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:29:39,153 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-28 12:29:41,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:29:41,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:29:41,373 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=17866.666666666668, ans=0.125 2023-09-28 12:29:43,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:29:46,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:29:47,608 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:29:47,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-28 12:29:50,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:29:52,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:29:53,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 12:29:53,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:29:55,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:29:55,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-28 12:29:57,395 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.56 vs. limit=5.6899999999999995 2023-09-28 12:29:59,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:29:59,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:29:59,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:30:01,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-28 12:30:04,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:30:04,844 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:30:06,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:30:08,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:30:08,911 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.98 vs. limit=14.225 2023-09-28 12:30:10,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-28 12:30:10,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:30:13,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:30:13,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-28 12:30:15,473 INFO [train.py:1039] (2/4) Epoch 1, batch 2700, loss[loss=0.4025, simple_loss=0.4054, pruned_loss=0.1998, over 23311.00 frames. ], tot_loss[loss=0.3849, simple_loss=0.3975, pruned_loss=0.1862, over 4703067.55 frames. ], batch size: 93, lr: 4.36e-02, grad_scale: 16.0 2023-09-28 12:30:17,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:30:19,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 12:30:22,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:30:22,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:30:22,384 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:30:23,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:30:23,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:30:23,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:30:25,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-28 12:30:25,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-28 12:30:25,468 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:30:27,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-28 12:30:28,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 12:30:30,104 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:30:32,941 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.245e+02 3.486e+02 4.470e+02 6.707e+02 1.380e+03, threshold=8.939e+02, percent-clipped=9.0 2023-09-28 12:30:33,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:30:36,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-28 12:30:36,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:30:41,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:30:41,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:30:48,937 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-28 12:30:48,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:30:48,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:30:49,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:30:49,313 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=18133.333333333332, ans=0.1186666666666667 2023-09-28 12:30:52,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:30:55,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:30:55,825 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-28 12:30:55,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:31:00,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:31:00,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:31:10,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:31:10,873 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:31:16,867 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:31:16,870 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:31:22,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:31:22,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:31:24,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:31:24,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:31:26,108 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:31:26,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:31:27,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:31:31,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:31:31,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:31:34,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-28 12:31:35,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:31:37,358 INFO [train.py:1039] (2/4) Epoch 1, batch 2750, loss[loss=0.3368, simple_loss=0.3617, pruned_loss=0.156, over 24582.00 frames. ], tot_loss[loss=0.3819, simple_loss=0.3965, pruned_loss=0.1837, over 4718941.57 frames. ], batch size: 60, lr: 4.36e-02, grad_scale: 16.0 2023-09-28 12:31:37,474 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:31:37,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-28 12:31:39,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-28 12:31:39,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:31:42,667 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=18333.333333333332, ans=0.2583333333333334 2023-09-28 12:31:43,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:31:43,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:31:45,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:31:45,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-28 12:31:45,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:31:48,828 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=18333.333333333332, ans=0.1166666666666667 2023-09-28 12:31:50,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:31:50,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 12:31:50,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:31:50,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:31:50,792 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-28 12:31:50,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:31:52,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:31:59,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-28 12:31:59,574 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=18400.0, ans=0.125 2023-09-28 12:32:02,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:32:02,258 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:32:03,794 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:32:03,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-28 12:32:05,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:32:07,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:32:07,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:32:07,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:32:10,987 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=18466.666666666668, ans=0.125 2023-09-28 12:32:12,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 12:32:12,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 12:32:12,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 12:32:13,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:32:15,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 12:32:21,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:32:24,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 12:32:24,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:32:30,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:32:30,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-28 12:32:30,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 12:32:34,349 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=18533.333333333332, ans=0.2513333333333334 2023-09-28 12:32:37,064 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-28 12:32:37,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:32:37,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-28 12:32:43,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:32:45,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-28 12:32:50,090 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-28 12:32:53,065 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:32:53,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-28 12:32:54,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:32:55,098 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=18600.0, ans=0.125 2023-09-28 12:32:56,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:32:57,638 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-28 12:32:57,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:32:59,446 INFO [train.py:1039] (2/4) Epoch 1, batch 2800, loss[loss=0.3885, simple_loss=0.3842, pruned_loss=0.1965, over 23935.00 frames. ], tot_loss[loss=0.3776, simple_loss=0.3929, pruned_loss=0.1811, over 4717257.24 frames. ], batch size: 195, lr: 4.36e-02, grad_scale: 32.0 2023-09-28 12:33:01,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-28 12:33:01,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:33:01,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:33:04,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-28 12:33:04,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:33:04,080 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:33:05,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:33:05,729 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-28 12:33:05,730 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-28 12:33:09,335 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.16 vs. limit=9.666666666666668 2023-09-28 12:33:10,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:33:11,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 12:33:11,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:33:17,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:33:18,626 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.102e+02 3.169e+02 4.499e+02 7.440e+02 2.031e+03, threshold=8.997e+02, percent-clipped=14.0 2023-09-28 12:33:18,839 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-28 12:33:20,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-28 12:33:23,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-28 12:33:24,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:33:24,860 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:33:24,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:33:29,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:33:31,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:33:31,130 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-28 12:33:31,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:33:40,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:33:40,393 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=18800.0, ans=0.125 2023-09-28 12:33:41,557 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:33:44,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:33:44,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:33:46,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:33:50,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:33:50,756 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-28 12:33:52,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:33:52,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:33:52,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:33:53,022 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=18866.666666666668, ans=0.125 2023-09-28 12:33:58,732 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:33:58,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:34:02,170 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=18866.666666666668, ans=0.9386666666666666 2023-09-28 12:34:03,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:34:04,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:34:04,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:34:04,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 12:34:04,978 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 12:34:07,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 12:34:08,608 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=12.82 vs. limit=14.466666666666665 2023-09-28 12:34:09,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:34:09,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-28 12:34:09,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:34:10,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:34:10,818 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:34:12,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-28 12:34:14,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:34:14,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:34:14,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:34:16,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-28 12:34:22,218 INFO [train.py:1039] (2/4) Epoch 1, batch 2850, loss[loss=0.3886, simple_loss=0.3857, pruned_loss=0.1957, over 23621.00 frames. ], tot_loss[loss=0.3746, simple_loss=0.3913, pruned_loss=0.1789, over 4705017.33 frames. ], batch size: 256, lr: 4.35e-02, grad_scale: 32.0 2023-09-28 12:34:22,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:34:22,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 12:34:22,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:34:24,753 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:34:29,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:34:29,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:34:29,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:34:31,243 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 12:34:32,409 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:34:33,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:34:34,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-28 12:34:34,290 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=19000.0, ans=0.0 2023-09-28 12:34:35,537 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-28 12:34:41,111 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-28 12:34:41,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:34:43,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-28 12:34:43,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:34:46,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-28 12:34:47,913 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-28 12:34:49,488 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:34:49,802 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=19066.666666666668, ans=0.0 2023-09-28 12:34:51,223 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=19066.666666666668, ans=0.10933333333333331 2023-09-28 12:35:00,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:35:03,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:35:03,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:35:05,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 12:35:05,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 12:35:06,685 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:35:08,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 12:35:09,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-28 12:35:13,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-28 12:35:14,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:35:14,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:35:16,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:35:17,161 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=19200.0, ans=0.0 2023-09-28 12:35:18,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:35:18,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:35:20,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:35:22,237 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:35:22,877 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.83 vs. limit=21.9 2023-09-28 12:35:25,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:35:25,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:35:25,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:35:26,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:35:33,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:35:35,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-28 12:35:35,192 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-28 12:35:36,832 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 12:35:36,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:35:38,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-28 12:35:38,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:35:38,609 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=19266.666666666668, ans=0.04949747468305833 2023-09-28 12:35:39,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:35:39,934 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:35:41,366 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:35:41,366 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-28 12:35:41,434 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-28 12:35:41,440 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:35:41,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:35:44,381 INFO [train.py:1039] (2/4) Epoch 1, batch 2900, loss[loss=0.3477, simple_loss=0.3916, pruned_loss=0.1519, over 24354.00 frames. ], tot_loss[loss=0.3729, simple_loss=0.3901, pruned_loss=0.1779, over 4708037.93 frames. ], batch size: 77, lr: 4.35e-02, grad_scale: 32.0 2023-09-28 12:35:46,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-28 12:35:48,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:35:48,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:35:50,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-28 12:35:53,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:35:55,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-28 12:35:55,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-28 12:35:57,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:35:57,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-28 12:35:57,442 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=19333.333333333332, ans=0.04949747468305833 2023-09-28 12:36:00,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:36:00,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:36:03,247 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.312e+02 3.297e+02 4.561e+02 6.852e+02 1.887e+03, threshold=9.123e+02, percent-clipped=12.0 2023-09-28 12:36:04,913 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:36:06,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:36:06,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-28 12:36:08,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-28 12:36:08,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-28 12:36:10,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:36:11,413 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=4.66 vs. limit=14.775 2023-09-28 12:36:14,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-28 12:36:16,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-28 12:36:17,052 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.63 vs. limit=14.8 2023-09-28 12:36:19,117 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:36:19,121 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-28 12:36:19,158 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:36:22,770 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:36:22,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-28 12:36:25,190 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=19466.666666666668, ans=0.10533333333333333 2023-09-28 12:36:26,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:36:26,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:36:31,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:36:33,263 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:36:33,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-28 12:36:35,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-28 12:36:35,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:36:35,283 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=19533.333333333332, ans=0.125 2023-09-28 12:36:38,330 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=19533.333333333332, ans=0.006623188405797102 2023-09-28 12:36:39,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 12:36:40,553 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.53 vs. limit=7.906666666666666 2023-09-28 12:36:43,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-28 12:36:44,931 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:36:46,814 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=19533.333333333332, ans=0.10466666666666669 2023-09-28 12:36:49,580 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:36:49,784 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=19600.0, ans=0.0 2023-09-28 12:36:51,299 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=19600.0, ans=0.125 2023-09-28 12:36:51,388 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=19600.0, ans=0.125 2023-09-28 12:36:59,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:36:59,905 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-28 12:37:03,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-28 12:37:04,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:37:04,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-28 12:37:04,864 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:37:06,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-28 12:37:08,502 INFO [train.py:1039] (2/4) Epoch 1, batch 2950, loss[loss=0.3657, simple_loss=0.4045, pruned_loss=0.1635, over 24378.00 frames. ], tot_loss[loss=0.3718, simple_loss=0.3905, pruned_loss=0.1766, over 4721849.18 frames. ], batch size: 77, lr: 4.34e-02, grad_scale: 32.0 2023-09-28 12:37:13,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:37:14,709 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-28 12:37:16,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:37:16,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:37:18,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:37:19,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:37:20,057 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.min_positive, batch_count=19666.666666666668, ans=0.025 2023-09-28 12:37:21,375 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-28 12:37:22,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-28 12:37:24,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 12:37:24,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:37:24,681 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=19733.333333333332, ans=0.006579710144927537 2023-09-28 12:37:26,307 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=19733.333333333332, ans=0.125 2023-09-28 12:37:28,060 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=19733.333333333332, ans=0.125 2023-09-28 12:37:29,205 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:37:31,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:37:34,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:37:34,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:37:37,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:37:38,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:37:39,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:37:41,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:37:41,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:37:44,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-28 12:37:49,202 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-28 12:37:49,242 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-28 12:37:50,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:37:52,231 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-28 12:37:53,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-28 12:37:53,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:37:55,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:37:55,694 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-28 12:37:55,701 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-28 12:37:56,049 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=19866.666666666668, ans=0.125 2023-09-28 12:37:58,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-28 12:37:58,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:37:58,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:37:59,126 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=19866.666666666668, ans=0.125 2023-09-28 12:38:02,092 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=19866.666666666668, ans=0.125 2023-09-28 12:38:03,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:38:04,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:38:04,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:38:04,865 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-28 12:38:04,925 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:38:04,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-28 12:38:12,401 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:38:13,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:38:15,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-28 12:38:15,296 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:38:17,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-28 12:38:19,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:38:22,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:38:22,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:38:23,667 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:38:23,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 12:38:25,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:38:26,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:38:26,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-28 12:38:26,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-28 12:38:27,099 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=19933.333333333332, ans=0.0 2023-09-28 12:38:28,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:38:30,196 INFO [train.py:1039] (2/4) Epoch 1, batch 3000, loss[loss=0.3592, simple_loss=0.39, pruned_loss=0.1642, over 23996.00 frames. ], tot_loss[loss=0.3717, simple_loss=0.3908, pruned_loss=0.1763, over 4715654.80 frames. ], batch size: 86, lr: 4.34e-02, grad_scale: 32.0 2023-09-28 12:38:30,196 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-28 12:38:43,010 INFO [zipformer.py:1853] (2/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([3.9158, 3.7390, 3.4655, 3.9745, 2.9751, 3.3678, 3.6466, 3.9221], device='cuda:2') 2023-09-28 12:38:44,452 INFO [train.py:1071] (2/4) Epoch 1, validation: loss=0.4132, simple_loss=0.3632, pruned_loss=0.2317, over 1125622.00 frames. 2023-09-28 12:38:44,453 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21046MB 2023-09-28 12:38:44,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:38:44,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:38:44,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-28 12:38:48,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:38:49,900 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:38:51,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-28 12:38:54,521 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-28 12:38:55,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-28 12:38:57,640 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:38:59,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:38:59,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-28 12:38:59,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:39:02,621 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.319e+02 3.597e+02 4.607e+02 6.753e+02 1.897e+03, threshold=9.214e+02, percent-clipped=10.0 2023-09-28 12:39:07,453 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 12:39:15,275 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:39:22,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=20133.333333333332, ans=0.0 2023-09-28 12:39:23,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-28 12:39:23,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:39:27,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 12:39:27,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:39:28,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:39:31,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:39:31,542 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-28 12:39:33,218 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-28 12:39:35,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:39:35,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 12:39:37,162 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:39:38,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:39:39,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:39:39,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:39:43,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 12:39:43,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:39:43,147 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:39:46,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:39:46,385 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-28 12:39:47,155 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.72 vs. limit=10.0 2023-09-28 12:39:47,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:39:49,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:39:49,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:39:53,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:39:55,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:39:56,116 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-28 12:39:57,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-28 12:39:57,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:39:57,595 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-28 12:39:59,026 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 12:39:59,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-28 12:40:02,241 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-28 12:40:03,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 12:40:03,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-28 12:40:05,336 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-28 12:40:05,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 12:40:07,447 INFO [train.py:1039] (2/4) Epoch 1, batch 3050, loss[loss=0.3872, simple_loss=0.3902, pruned_loss=0.1921, over 23314.00 frames. ], tot_loss[loss=0.3695, simple_loss=0.3893, pruned_loss=0.1749, over 4723468.41 frames. ], batch size: 285, lr: 4.33e-02, grad_scale: 32.0 2023-09-28 12:40:07,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:40:09,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:40:09,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-28 12:40:09,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:40:10,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:40:11,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-28 12:40:13,658 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:40:15,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:40:15,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:40:17,321 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=20333.333333333332, ans=0.1 2023-09-28 12:40:20,670 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:40:22,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-28 12:40:31,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-28 12:40:31,373 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-28 12:40:32,079 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=27.63 vs. limit=22.5 2023-09-28 12:40:32,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:40:37,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-28 12:40:39,144 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:40:39,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:40:41,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:40:44,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:40:45,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-28 12:40:45,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:40:46,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:40:46,035 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:40:46,321 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=20466.666666666668, ans=0.0 2023-09-28 12:40:47,534 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:40:49,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:40:50,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:40:52,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-28 12:40:52,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:40:52,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 12:40:55,777 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=20533.333333333332, ans=0.1 2023-09-28 12:40:57,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:40:57,679 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 12:40:59,159 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:40:59,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:40:59,439 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=20533.333333333332, ans=0.07 2023-09-28 12:41:05,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:41:05,971 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:41:10,083 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten.whitening_limit, batch_count=20533.333333333332, ans=22.5 2023-09-28 12:41:11,271 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=20533.333333333332, ans=0.2 2023-09-28 12:41:12,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:41:14,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:41:14,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:41:16,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:41:17,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 12:41:17,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:41:19,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-28 12:41:19,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:41:19,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:41:21,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-28 12:41:24,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:41:28,732 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:41:30,059 INFO [train.py:1039] (2/4) Epoch 1, batch 3100, loss[loss=0.3461, simple_loss=0.3847, pruned_loss=0.1538, over 24335.00 frames. ], tot_loss[loss=0.3679, simple_loss=0.3877, pruned_loss=0.1741, over 4721420.61 frames. ], batch size: 77, lr: 4.33e-02, grad_scale: 32.0 2023-09-28 12:41:32,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:41:35,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 12:41:37,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-28 12:41:40,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-28 12:41:40,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-28 12:41:42,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:41:47,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:41:47,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:41:47,518 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=20733.333333333332, ans=0.125 2023-09-28 12:41:48,535 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.389e+02 3.317e+02 4.517e+02 6.154e+02 1.203e+03, threshold=9.035e+02, percent-clipped=5.0 2023-09-28 12:41:50,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-28 12:41:54,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:41:58,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-28 12:41:59,048 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=20733.333333333332, ans=0.0 2023-09-28 12:42:00,522 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=20733.333333333332, ans=0.0 2023-09-28 12:42:03,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 12:42:03,733 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=20800.0, ans=0.125 2023-09-28 12:42:04,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:42:04,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:42:04,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:42:04,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-28 12:42:07,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:42:07,198 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-28 12:42:07,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:42:08,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:42:10,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-28 12:42:12,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:42:15,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:42:17,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-28 12:42:17,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-28 12:42:18,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:42:20,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:42:24,024 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:42:24,053 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:42:24,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:42:25,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-28 12:42:25,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:42:28,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:42:28,631 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:42:28,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:42:28,656 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 12:42:31,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:42:33,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-28 12:42:36,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:42:36,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-28 12:42:36,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:42:37,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:42:38,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-28 12:42:40,341 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=20933.333333333332, ans=0.125 2023-09-28 12:42:45,283 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=20933.333333333332, ans=0.125 2023-09-28 12:42:51,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-28 12:42:53,224 INFO [train.py:1039] (2/4) Epoch 1, batch 3150, loss[loss=0.3618, simple_loss=0.378, pruned_loss=0.1728, over 23493.00 frames. ], tot_loss[loss=0.3651, simple_loss=0.3851, pruned_loss=0.1725, over 4716952.21 frames. ], batch size: 134, lr: 4.32e-02, grad_scale: 32.0 2023-09-28 12:42:54,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:42:56,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:42:57,096 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:42:57,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:42:57,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-28 12:43:00,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:43:00,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-28 12:43:01,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-28 12:43:03,543 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=21000.0, ans=0.125 2023-09-28 12:43:04,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:43:06,234 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-28 12:43:09,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-28 12:43:10,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:43:12,229 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-28 12:43:13,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-28 12:43:13,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-28 12:43:15,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-28 12:43:15,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-28 12:43:16,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:43:16,011 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:43:16,167 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:43:17,690 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-28 12:43:18,584 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=21066.666666666668, ans=0.1 2023-09-28 12:43:21,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:43:21,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:43:22,186 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=21066.666666666668, ans=0.125 2023-09-28 12:43:23,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:43:24,162 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.32 vs. limit=6.0 2023-09-28 12:43:24,817 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-28 12:43:28,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-28 12:43:28,205 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:43:31,083 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-28 12:43:31,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:43:33,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-28 12:43:34,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-28 12:43:36,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:43:36,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 12:43:36,365 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 12:43:37,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:43:37,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:43:39,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-28 12:43:39,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-28 12:43:39,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-28 12:43:40,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 12:43:40,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:43:42,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:43:42,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:43:42,596 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-28 12:43:44,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:43:45,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-28 12:43:45,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:43:46,654 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.11 vs. limit=15.0 2023-09-28 12:43:47,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-28 12:43:50,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-28 12:43:53,107 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:43:53,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:43:54,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-28 12:43:56,223 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 12:43:56,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:43:58,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:43:58,273 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=21266.666666666668, ans=0.125 2023-09-28 12:44:01,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:44:01,142 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:44:01,466 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=21266.666666666668, ans=0.1 2023-09-28 12:44:03,042 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=21266.666666666668, ans=0.2 2023-09-28 12:44:06,380 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:44:06,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:44:09,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-28 12:44:16,257 INFO [train.py:1039] (2/4) Epoch 1, batch 3200, loss[loss=0.3362, simple_loss=0.3639, pruned_loss=0.1543, over 23298.00 frames. ], tot_loss[loss=0.3631, simple_loss=0.3832, pruned_loss=0.1715, over 4711403.40 frames. ], batch size: 119, lr: 4.32e-02, grad_scale: 32.0 2023-09-28 12:44:16,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:44:16,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-28 12:44:20,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:44:23,073 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:44:23,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-28 12:44:24,027 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=21333.333333333332, ans=0.125 2023-09-28 12:44:26,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:44:30,336 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-28 12:44:33,534 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:44:35,009 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.369e+02 3.557e+02 4.560e+02 5.822e+02 1.709e+03, threshold=9.121e+02, percent-clipped=8.0 2023-09-28 12:44:42,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:44:47,849 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=15.48 vs. limit=22.5 2023-09-28 12:44:50,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-28 12:44:51,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:44:55,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-28 12:44:57,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 12:45:00,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:45:00,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 12:45:02,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:45:05,428 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-28 12:45:08,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-28 12:45:10,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-28 12:45:12,797 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.60 vs. limit=22.5 2023-09-28 12:45:15,113 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-28 12:45:16,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-28 12:45:22,825 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:45:22,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:45:22,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:45:24,405 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-28 12:45:24,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 12:45:26,230 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=21600.0, ans=0.1 2023-09-28 12:45:29,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:45:32,496 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-28 12:45:32,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-28 12:45:34,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-28 12:45:36,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-28 12:45:38,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:45:38,736 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=21666.666666666668, ans=0.006159420289855073 2023-09-28 12:45:39,877 INFO [train.py:1039] (2/4) Epoch 1, batch 3250, loss[loss=0.3239, simple_loss=0.3646, pruned_loss=0.1416, over 24496.00 frames. ], tot_loss[loss=0.3609, simple_loss=0.3824, pruned_loss=0.1697, over 4726609.51 frames. ], batch size: 63, lr: 4.31e-02, grad_scale: 32.0 2023-09-28 12:45:40,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-28 12:45:40,152 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-28 12:45:41,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:45:41,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:45:41,691 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-28 12:45:41,996 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=21666.666666666668, ans=0.0 2023-09-28 12:45:45,147 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=21666.666666666668, ans=0.125 2023-09-28 12:45:46,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 12:45:49,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:45:59,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:45:59,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-28 12:45:59,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:46:00,701 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:46:00,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:46:00,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:46:02,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 12:46:04,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:46:04,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:46:06,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:46:06,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:46:06,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:46:06,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:46:06,345 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=21733.333333333332, ans=0.125 2023-09-28 12:46:11,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:46:13,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:46:14,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:46:14,957 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:46:16,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:46:16,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:46:17,861 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:46:23,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-28 12:46:24,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:46:24,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:46:26,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:46:26,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-28 12:46:32,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 12:46:37,504 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=21866.666666666668, ans=0.125 2023-09-28 12:46:40,972 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:46:41,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:46:41,015 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-28 12:46:41,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:46:41,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 12:46:42,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:46:44,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-28 12:46:45,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-28 12:46:45,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:46:47,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:46:49,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:46:49,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-28 12:46:49,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:46:52,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:46:52,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:46:54,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-28 12:46:54,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:46:57,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:46:57,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-28 12:46:58,395 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.89 vs. limit=15.0 2023-09-28 12:47:00,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:47:00,706 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-28 12:47:02,061 INFO [train.py:1039] (2/4) Epoch 1, batch 3300, loss[loss=0.5204, simple_loss=0.4792, pruned_loss=0.2808, over 19451.00 frames. ], tot_loss[loss=0.3612, simple_loss=0.3828, pruned_loss=0.1698, over 4710823.28 frames. ], batch size: 389, lr: 4.31e-02, grad_scale: 32.0 2023-09-28 12:47:02,256 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-28 12:47:03,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-28 12:47:03,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:47:06,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:47:09,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:47:09,198 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:47:12,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 12:47:12,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 12:47:14,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:47:16,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:47:20,468 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.229e+02 3.284e+02 5.093e+02 6.809e+02 1.583e+03, threshold=1.019e+03, percent-clipped=11.0 2023-09-28 12:47:22,332 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=22066.666666666668, ans=0.0 2023-09-28 12:47:24,115 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-28 12:47:25,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:47:25,604 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:47:27,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:47:27,282 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-28 12:47:27,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:47:28,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 12:47:30,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 12:47:30,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:47:30,408 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-28 12:47:35,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:47:35,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-28 12:47:37,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:47:37,015 WARNING [train.py:1197] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-28 12:47:38,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-28 12:47:38,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:47:40,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:47:41,649 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-28 12:47:43,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-28 12:47:45,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:47:46,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-28 12:47:46,980 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=22133.333333333332, ans=0.5 2023-09-28 12:47:48,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:47:51,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-28 12:47:51,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:47:52,660 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.39 vs. limit=15.0 2023-09-28 12:47:54,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:47:54,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:47:54,949 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:47:54,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-28 12:47:57,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:47:58,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:47:58,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:48:00,206 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-28 12:48:01,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-28 12:48:04,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-28 12:48:04,134 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:48:04,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:48:07,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:48:07,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:48:07,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 12:48:08,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:48:08,729 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-28 12:48:10,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:48:11,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 12:48:15,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-28 12:48:15,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:48:16,916 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:48:19,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 12:48:20,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:48:21,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:48:23,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:48:23,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:48:24,662 INFO [train.py:1039] (2/4) Epoch 1, batch 3350, loss[loss=0.3767, simple_loss=0.4007, pruned_loss=0.1764, over 23760.00 frames. ], tot_loss[loss=0.3632, simple_loss=0.3849, pruned_loss=0.1708, over 4709879.95 frames. ], batch size: 85, lr: 4.30e-02, grad_scale: 16.0 2023-09-28 12:48:25,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:48:26,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:48:28,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:48:30,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:48:33,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:48:35,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:48:36,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:48:39,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-28 12:48:39,292 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-28 12:48:40,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:48:43,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-28 12:48:43,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-28 12:48:46,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:48:46,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:48:48,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:48:48,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-28 12:48:48,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:48:48,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:48:52,212 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:48:55,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:48:55,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:48:55,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:48:56,990 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=22466.666666666668, ans=0.125 2023-09-28 12:48:59,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:49:03,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:49:03,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:49:03,619 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=22466.666666666668, ans=0.0 2023-09-28 12:49:03,682 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=22466.666666666668, ans=0.1 2023-09-28 12:49:08,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:49:08,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:49:09,143 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.49 vs. limit=6.0 2023-09-28 12:49:11,557 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:49:11,572 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:49:13,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:49:16,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-28 12:49:16,976 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 12:49:17,043 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-28 12:49:17,103 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:49:18,600 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-28 12:49:20,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:49:21,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:49:24,920 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=22533.333333333332, ans=0.1 2023-09-28 12:49:29,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:49:31,257 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-28 12:49:32,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 12:49:32,850 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:49:35,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:49:41,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:49:41,460 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=22600.0, ans=0.1 2023-09-28 12:49:44,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-28 12:49:44,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 12:49:44,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:49:48,229 INFO [train.py:1039] (2/4) Epoch 1, batch 3400, loss[loss=0.3396, simple_loss=0.3627, pruned_loss=0.1582, over 24438.00 frames. ], tot_loss[loss=0.3619, simple_loss=0.3845, pruned_loss=0.1697, over 4716849.21 frames. ], batch size: 58, lr: 4.29e-02, grad_scale: 16.0 2023-09-28 12:49:48,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:49:48,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-28 12:49:48,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:49:50,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-28 12:49:50,838 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=22666.666666666668, ans=0.0 2023-09-28 12:49:52,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:49:52,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:49:53,560 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-28 12:49:53,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:49:54,979 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-28 12:49:59,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-28 12:49:59,528 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-28 12:49:59,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:50:01,513 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=22666.666666666668, ans=0.125 2023-09-28 12:50:03,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:50:03,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 12:50:04,842 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:50:06,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-28 12:50:07,749 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.237e+02 3.122e+02 3.897e+02 5.653e+02 2.230e+03, threshold=7.795e+02, percent-clipped=8.0 2023-09-28 12:50:11,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:50:12,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-28 12:50:17,846 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:50:18,185 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=22733.333333333332, ans=0.005927536231884059 2023-09-28 12:50:19,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:50:19,463 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:50:21,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-28 12:50:23,966 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=22800.0, ans=0.125 2023-09-28 12:50:28,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:50:34,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-28 12:50:40,939 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:50:41,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:50:41,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-28 12:50:41,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:50:42,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:50:42,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:50:43,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:50:48,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:50:52,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 12:50:52,037 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:50:58,830 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:51:00,890 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-28 12:51:05,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 12:51:09,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-28 12:51:10,441 INFO [train.py:1039] (2/4) Epoch 1, batch 3450, loss[loss=0.351, simple_loss=0.3657, pruned_loss=0.1681, over 23717.00 frames. ], tot_loss[loss=0.3605, simple_loss=0.3833, pruned_loss=0.1689, over 4714592.20 frames. ], batch size: 232, lr: 4.29e-02, grad_scale: 16.0 2023-09-28 12:51:12,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-28 12:51:14,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:51:16,954 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:51:16,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-28 12:51:17,727 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.65 vs. limit=22.5 2023-09-28 12:51:18,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:51:22,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-28 12:51:26,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:51:28,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:51:29,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-28 12:51:29,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:51:32,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:51:38,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-28 12:51:44,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-28 12:51:44,677 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 12:51:44,738 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:51:46,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:51:49,068 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.00 vs. limit=22.5 2023-09-28 12:51:50,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-28 12:51:52,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:51:56,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:51:56,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:51:58,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-28 12:51:59,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:52:03,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-28 12:52:03,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:52:04,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:52:08,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:52:10,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-28 12:52:11,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:52:17,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:52:19,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:52:23,370 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:52:27,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:52:27,930 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:52:29,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:52:29,475 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:52:32,288 INFO [train.py:1039] (2/4) Epoch 1, batch 3500, loss[loss=0.3374, simple_loss=0.385, pruned_loss=0.1449, over 24498.00 frames. ], tot_loss[loss=0.3586, simple_loss=0.3809, pruned_loss=0.1681, over 4705741.39 frames. ], batch size: 66, lr: 4.28e-02, grad_scale: 16.0 2023-09-28 12:52:33,453 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten.whitening_limit, batch_count=23333.333333333332, ans=22.5 2023-09-28 12:52:34,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:52:37,998 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:52:39,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-28 12:52:41,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 12:52:45,870 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-28 12:52:48,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:52:48,973 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-28 12:52:51,806 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.155e+02 3.379e+02 4.182e+02 5.188e+02 1.059e+03, threshold=8.364e+02, percent-clipped=3.0 2023-09-28 12:52:55,622 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:52:55,765 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:52:57,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:52:57,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:52:57,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-28 12:52:59,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:52:59,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:52:59,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-28 12:53:00,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:53:02,418 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-28 12:53:03,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:53:04,357 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=23466.666666666668, ans=0.005768115942028985 2023-09-28 12:53:07,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:53:08,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-28 12:53:08,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:53:11,965 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:53:12,929 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=23466.666666666668, ans=15.0 2023-09-28 12:53:14,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:53:15,591 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:53:17,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:53:17,283 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:53:20,529 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-28 12:53:20,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-28 12:53:20,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-28 12:53:20,912 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=23533.333333333332, ans=0.0 2023-09-28 12:53:22,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:53:22,502 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=23533.333333333332, ans=0.125 2023-09-28 12:53:23,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:53:25,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:53:25,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:53:28,497 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=23533.333333333332, ans=0.0 2023-09-28 12:53:29,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 12:53:30,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:53:33,357 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.37 vs. limit=10.0 2023-09-28 12:53:35,709 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:53:37,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-28 12:53:37,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-28 12:53:37,253 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:53:37,961 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.32 vs. limit=15.0 2023-09-28 12:53:39,823 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=8.26 vs. limit=10.0 2023-09-28 12:53:40,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:53:40,365 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:53:41,851 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:53:46,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-28 12:53:46,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:53:48,664 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:53:48,966 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=23600.0, ans=0.125 2023-09-28 12:53:50,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-28 12:53:51,956 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-28 12:53:53,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:53:55,348 INFO [train.py:1039] (2/4) Epoch 1, batch 3550, loss[loss=0.3607, simple_loss=0.3926, pruned_loss=0.1644, over 23823.00 frames. ], tot_loss[loss=0.356, simple_loss=0.3784, pruned_loss=0.1668, over 4681412.31 frames. ], batch size: 85, lr: 4.28e-02, grad_scale: 16.0 2023-09-28 12:53:55,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:53:55,530 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:53:57,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:53:58,001 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.55 vs. limit=22.5 2023-09-28 12:54:00,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:54:10,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:54:12,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 12:54:13,803 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=23733.333333333332, ans=0.1 2023-09-28 12:54:15,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:54:16,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:54:18,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:54:18,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:54:18,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 12:54:21,442 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=23733.333333333332, ans=0.04949747468305833 2023-09-28 12:54:23,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:54:23,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-28 12:54:23,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:54:23,453 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-28 12:54:24,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 12:54:31,754 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=23800.0, ans=0.125 2023-09-28 12:54:33,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:54:33,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:54:34,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:54:34,625 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:54:36,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:54:36,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-28 12:54:36,152 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:54:38,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:54:38,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 12:54:42,334 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=15.19 vs. limit=15.0 2023-09-28 12:54:42,404 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.08 vs. limit=6.0 2023-09-28 12:54:43,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:54:44,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:54:46,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:54:47,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-28 12:54:49,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:54:50,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-28 12:54:50,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:54:53,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:54:53,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:54:56,079 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.66 vs. limit=22.5 2023-09-28 12:54:58,759 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-28 12:55:00,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:55:01,430 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=15.77 vs. limit=22.5 2023-09-28 12:55:07,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:55:07,365 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-28 12:55:07,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=23933.333333333332, ans=0.125 2023-09-28 12:55:08,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:55:09,251 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=23933.333333333332, ans=0.5 2023-09-28 12:55:11,289 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.87 vs. limit=22.5 2023-09-28 12:55:14,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:55:14,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-28 12:55:17,948 INFO [train.py:1039] (2/4) Epoch 1, batch 3600, loss[loss=0.3411, simple_loss=0.3675, pruned_loss=0.1574, over 23794.00 frames. ], tot_loss[loss=0.3531, simple_loss=0.3774, pruned_loss=0.1644, over 4693973.15 frames. ], batch size: 212, lr: 4.27e-02, grad_scale: 32.0 2023-09-28 12:55:21,390 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-28 12:55:21,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:55:22,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:55:25,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:55:26,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:55:27,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:55:30,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:55:32,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:55:32,667 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=24066.666666666668, ans=0.1 2023-09-28 12:55:33,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-28 12:55:36,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:55:37,402 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.159e+02 2.926e+02 4.483e+02 7.377e+02 1.636e+03, threshold=8.966e+02, percent-clipped=15.0 2023-09-28 12:55:37,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:55:37,524 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-28 12:55:37,921 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=24066.666666666668, ans=0.0 2023-09-28 12:55:41,228 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 12:55:42,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:55:44,713 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=24066.666666666668, ans=0.1 2023-09-28 12:55:45,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:55:49,882 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:55:50,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 12:55:51,428 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:55:51,458 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-28 12:55:51,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:55:54,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:55:54,848 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=24133.333333333332, ans=0.0 2023-09-28 12:55:56,054 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:55:56,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:55:57,813 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:55:59,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:55:59,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-28 12:56:06,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:56:07,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 12:56:09,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-28 12:56:12,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:56:18,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:56:21,384 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:56:23,205 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=24266.666666666668, ans=0.1 2023-09-28 12:56:27,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-28 12:56:27,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 12:56:27,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-28 12:56:29,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-28 12:56:31,902 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-28 12:56:34,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:56:35,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:56:36,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-28 12:56:36,655 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:56:36,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:56:36,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:56:38,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-28 12:56:39,577 INFO [train.py:1039] (2/4) Epoch 1, batch 3650, loss[loss=0.3008, simple_loss=0.3418, pruned_loss=0.1299, over 24317.00 frames. ], tot_loss[loss=0.3539, simple_loss=0.3788, pruned_loss=0.1645, over 4701958.60 frames. ], batch size: 56, lr: 4.27e-02, grad_scale: 32.0 2023-09-28 12:56:39,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-28 12:56:43,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:56:43,495 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-28 12:56:50,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-28 12:56:53,631 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:56:57,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-28 12:56:59,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-28 12:57:00,976 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_na.min_abs, batch_count=24400.0, ans=0.02 2023-09-28 12:57:03,703 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:57:03,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:57:03,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 12:57:04,064 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=24400.0, ans=0.125 2023-09-28 12:57:06,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-28 12:57:06,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:57:08,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-28 12:57:09,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-28 12:57:09,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:57:09,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-28 12:57:11,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 12:57:12,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:57:12,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:57:14,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:57:18,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-28 12:57:19,629 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-28 12:57:19,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:57:22,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-28 12:57:24,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:57:24,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-28 12:57:25,653 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=24466.666666666668, ans=0.0055507246376811595 2023-09-28 12:57:33,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:57:35,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:57:35,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-28 12:57:35,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:57:35,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:57:37,153 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=24533.333333333332, ans=0.125 2023-09-28 12:57:38,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:57:40,262 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:57:40,919 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=12.39 vs. limit=15.0 2023-09-28 12:57:41,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:57:41,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:57:43,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 12:57:46,261 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:57:46,363 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:57:52,469 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-28 12:57:56,241 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:57:56,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:57:58,224 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-28 12:57:58,306 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:58:00,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:58:02,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:58:03,237 INFO [train.py:1039] (2/4) Epoch 1, batch 3700, loss[loss=0.3736, simple_loss=0.3873, pruned_loss=0.18, over 23964.00 frames. ], tot_loss[loss=0.3559, simple_loss=0.3799, pruned_loss=0.166, over 4684331.84 frames. ], batch size: 196, lr: 4.26e-02, grad_scale: 32.0 2023-09-28 12:58:04,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-28 12:58:04,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:58:07,118 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 12:58:10,229 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:58:10,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:58:11,947 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:58:11,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-28 12:58:13,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:58:14,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 12:58:14,916 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 12:58:18,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 12:58:22,279 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.123e+02 3.422e+02 4.027e+02 5.760e+02 1.496e+03, threshold=8.053e+02, percent-clipped=7.0 2023-09-28 12:58:22,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:58:22,735 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=24733.333333333332, ans=0.125 2023-09-28 12:58:23,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:58:25,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:58:25,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:58:25,396 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 12:58:28,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:58:28,562 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-28 12:58:39,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:58:40,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 12:58:41,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 12:58:41,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-28 12:58:41,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:58:44,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:58:46,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-28 12:58:47,597 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:58:49,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:58:50,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:58:52,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 12:58:53,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 12:58:58,344 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:58:58,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-28 12:58:58,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:58:58,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-28 12:59:03,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:59:03,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-28 12:59:06,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:59:08,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-28 12:59:09,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:59:09,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-28 12:59:09,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:59:11,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:59:12,578 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=20.14 vs. limit=15.0 2023-09-28 12:59:13,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:59:15,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-28 12:59:16,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-28 12:59:16,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:59:18,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:59:19,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:59:19,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:59:21,777 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=24933.333333333332, ans=0.04949747468305833 2023-09-28 12:59:22,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:59:24,347 INFO [train.py:1039] (2/4) Epoch 1, batch 3750, loss[loss=0.2963, simple_loss=0.3395, pruned_loss=0.1265, over 24626.00 frames. ], tot_loss[loss=0.3566, simple_loss=0.3808, pruned_loss=0.1662, over 4702216.53 frames. ], batch size: 60, lr: 4.26e-02, grad_scale: 32.0 2023-09-28 12:59:24,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:59:25,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:59:27,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-28 12:59:29,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 12:59:32,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-28 12:59:32,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-28 12:59:33,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:59:35,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:59:35,471 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 12:59:37,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:59:39,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:59:41,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:59:45,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:59:46,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 12:59:49,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:59:51,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:59:53,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-28 12:59:53,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:59:54,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:59:54,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:59:59,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-28 13:00:03,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-28 13:00:03,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:00:03,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-28 13:00:05,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:00:07,984 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.53 vs. limit=15.0 2023-09-28 13:00:12,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:00:12,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-28 13:00:15,264 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=25200.0, ans=0.125 2023-09-28 13:00:17,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-28 13:00:21,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:00:25,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:00:25,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:00:27,936 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1.whitening_limit, batch_count=25200.0, ans=10.0 2023-09-28 13:00:30,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 13:00:32,023 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=25266.666666666668, ans=0.2 2023-09-28 13:00:32,159 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=25266.666666666668, ans=0.0 2023-09-28 13:00:34,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 13:00:34,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-28 13:00:36,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:00:38,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:00:39,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-28 13:00:46,951 INFO [train.py:1039] (2/4) Epoch 1, batch 3800, loss[loss=0.3324, simple_loss=0.351, pruned_loss=0.1569, over 23791.00 frames. ], tot_loss[loss=0.3546, simple_loss=0.3797, pruned_loss=0.1648, over 4698994.74 frames. ], batch size: 195, lr: 4.25e-02, grad_scale: 32.0 2023-09-28 13:00:49,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:00:51,129 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=25333.333333333332, ans=0.0053623188405797105 2023-09-28 13:00:52,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:00:54,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 13:00:55,676 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-28 13:00:55,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:00:57,394 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:00:57,600 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=25333.333333333332, ans=0.1 2023-09-28 13:00:59,150 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=25333.333333333332, ans=0.2 2023-09-28 13:00:59,184 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=25333.333333333332, ans=0.125 2023-09-28 13:01:00,381 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-28 13:01:01,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 13:01:01,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:01:02,041 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:01:02,347 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 13:01:02,705 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.10 vs. limit=22.5 2023-09-28 13:01:03,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:01:05,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:01:06,430 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.982e+02 3.306e+02 4.581e+02 6.587e+02 1.016e+03, threshold=9.163e+02, percent-clipped=14.0 2023-09-28 13:01:06,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:01:06,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-28 13:01:11,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-28 13:01:12,539 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:01:14,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:01:14,375 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=25400.0, ans=0.125 2023-09-28 13:01:17,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:01:17,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:01:19,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-28 13:01:19,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:01:20,977 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.88 vs. limit=15.0 2023-09-28 13:01:23,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:01:24,468 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=30.78 vs. limit=22.5 2023-09-28 13:01:25,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:01:29,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 13:01:31,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-28 13:01:33,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:01:39,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:01:43,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:01:45,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-28 13:01:49,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-28 13:01:50,480 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:01:52,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:01:54,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:01:54,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-28 13:01:57,966 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.36 vs. limit=6.0 2023-09-28 13:02:00,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-28 13:02:00,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-28 13:02:00,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:02:01,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:02:06,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:02:07,924 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 13:02:09,336 INFO [train.py:1039] (2/4) Epoch 1, batch 3850, loss[loss=0.3162, simple_loss=0.3462, pruned_loss=0.1431, over 24368.00 frames. ], tot_loss[loss=0.3524, simple_loss=0.3787, pruned_loss=0.1631, over 4719273.22 frames. ], batch size: 56, lr: 4.24e-02, grad_scale: 32.0 2023-09-28 13:02:12,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:02:14,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-28 13:02:14,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:02:14,518 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:02:18,306 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 13:02:18,550 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=25666.666666666668, ans=0.125 2023-09-28 13:02:19,994 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:02:23,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-28 13:02:23,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-28 13:02:33,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:02:33,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:02:36,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:02:36,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:02:39,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:02:41,460 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:02:43,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:02:43,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:02:44,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:02:46,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:02:47,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:02:47,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:02:49,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-28 13:02:49,311 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-28 13:02:49,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:02:50,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:02:54,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:02:55,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:02:55,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-28 13:02:59,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-28 13:02:59,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:03:01,377 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-28 13:03:03,812 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.17 vs. limit=15.0 2023-09-28 13:03:04,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-28 13:03:08,810 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=25866.666666666668, ans=0.2 2023-09-28 13:03:10,498 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.21 vs. limit=10.0 2023-09-28 13:03:11,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:03:11,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:03:14,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=25933.333333333332, ans=0.125 2023-09-28 13:03:16,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:03:16,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-28 13:03:17,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-28 13:03:20,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:03:20,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:03:22,728 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=25933.333333333332, ans=0.125 2023-09-28 13:03:23,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 13:03:23,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 13:03:25,466 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:03:25,594 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:03:25,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:03:27,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-28 13:03:27,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:03:29,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-28 13:03:29,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:03:29,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:03:30,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:03:31,044 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=26000.0, ans=0.125 2023-09-28 13:03:32,692 INFO [train.py:1039] (2/4) Epoch 1, batch 3900, loss[loss=0.3185, simple_loss=0.3585, pruned_loss=0.1392, over 24465.00 frames. ], tot_loss[loss=0.3505, simple_loss=0.3765, pruned_loss=0.1623, over 4705310.27 frames. ], batch size: 63, lr: 4.24e-02, grad_scale: 32.0 2023-09-28 13:03:32,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:03:32,896 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:03:32,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:03:32,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:03:33,737 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.43 vs. limit=15.0 2023-09-28 13:03:34,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:03:34,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-28 13:03:34,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:03:38,700 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:03:38,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 13:03:38,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-28 13:03:41,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:03:44,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 13:03:44,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:03:46,393 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-28 13:03:47,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-28 13:03:47,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:03:49,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-28 13:03:50,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:03:51,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-28 13:03:52,282 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.968e+02 2.956e+02 3.706e+02 4.742e+02 9.282e+02, threshold=7.412e+02, percent-clipped=1.0 2023-09-28 13:03:52,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-28 13:03:57,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:03:57,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:03:57,335 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:03:57,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:04:02,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:04:08,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:04:09,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:04:09,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:04:09,886 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=26133.333333333332, ans=0.025 2023-09-28 13:04:11,824 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:04:16,676 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=26133.333333333332, ans=0.2 2023-09-28 13:04:19,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:04:19,611 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=26133.333333333332, ans=0.00518840579710145 2023-09-28 13:04:20,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:04:28,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 13:04:28,979 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:04:29,509 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.26 vs. limit=15.0 2023-09-28 13:04:40,386 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:04:42,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:04:42,092 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-28 13:04:44,322 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-28 13:04:44,353 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:04:46,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-28 13:04:48,148 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=26266.666666666668, ans=0.07 2023-09-28 13:04:49,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:04:49,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-28 13:04:55,908 INFO [train.py:1039] (2/4) Epoch 1, batch 3950, loss[loss=0.3556, simple_loss=0.3815, pruned_loss=0.1649, over 23322.00 frames. ], tot_loss[loss=0.3482, simple_loss=0.3752, pruned_loss=0.1606, over 4716817.59 frames. ], batch size: 119, lr: 4.23e-02, grad_scale: 32.0 2023-09-28 13:04:57,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:05:00,355 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-28 13:05:00,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:05:02,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:05:03,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:05:09,866 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-28 13:05:09,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:05:10,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-28 13:05:12,059 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-28 13:05:12,109 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:05:16,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:05:16,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:05:16,501 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:05:18,992 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-28 13:05:19,338 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=26400.0, ans=0.1 2023-09-28 13:05:20,716 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=26400.0, ans=0.0 2023-09-28 13:05:21,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:05:23,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:05:23,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:05:23,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:05:25,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-28 13:05:25,723 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=26400.0, ans=0.05 2023-09-28 13:05:28,737 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=26466.666666666668, ans=0.2 2023-09-28 13:05:30,333 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=26466.666666666668, ans=0.0 2023-09-28 13:05:37,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:05:37,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:05:42,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-28 13:05:47,777 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-28 13:05:47,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-28 13:05:47,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:05:48,555 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.13 vs. limit=15.0 2023-09-28 13:05:49,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:05:49,887 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=26533.333333333332, ans=0.125 2023-09-28 13:05:55,504 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=26533.333333333332, ans=0.125 2023-09-28 13:05:58,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-28 13:06:00,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-28 13:06:00,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:06:00,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:06:01,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-28 13:06:03,687 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=26600.0, ans=0.125 2023-09-28 13:06:06,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:06:06,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:06:08,653 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=19.28 vs. limit=22.5 2023-09-28 13:06:11,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-28 13:06:13,705 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.91 vs. limit=22.5 2023-09-28 13:06:20,838 INFO [train.py:1039] (2/4) Epoch 1, batch 4000, loss[loss=0.3486, simple_loss=0.3691, pruned_loss=0.1641, over 23426.00 frames. ], tot_loss[loss=0.3496, simple_loss=0.3763, pruned_loss=0.1615, over 4714780.69 frames. ], batch size: 285, lr: 4.23e-02, grad_scale: 32.0 2023-09-28 13:06:24,705 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=26666.666666666668, ans=0.125 2023-09-28 13:06:26,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:06:31,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:06:38,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:06:38,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:06:40,150 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:06:40,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-28 13:06:40,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-28 13:06:41,651 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.100e+02 3.230e+02 4.293e+02 5.644e+02 1.126e+03, threshold=8.585e+02, percent-clipped=11.0 2023-09-28 13:06:41,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-28 13:06:41,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:06:41,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-28 13:06:43,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:06:47,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:06:47,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:06:47,781 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:06:47,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:06:47,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-28 13:06:49,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:06:50,916 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-28 13:06:51,109 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=26733.333333333332, ans=0.125 2023-09-28 13:06:52,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:06:52,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:06:55,627 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-28 13:06:57,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 13:06:57,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:07:05,223 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-28 13:07:05,290 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:07:07,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:07:09,126 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-28 13:07:09,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:07:10,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-28 13:07:10,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:07:10,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:07:12,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:07:14,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:07:15,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-28 13:07:15,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:07:17,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-28 13:07:18,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:07:20,041 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-28 13:07:24,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:07:25,144 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=26933.333333333332, ans=0.0 2023-09-28 13:07:28,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 13:07:29,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:07:31,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:07:33,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:07:33,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:07:33,518 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=26933.333333333332, ans=0.125 2023-09-28 13:07:33,559 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=26933.333333333332, ans=0.125 2023-09-28 13:07:38,169 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:07:40,136 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.33 vs. limit=15.0 2023-09-28 13:07:42,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-28 13:07:42,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-28 13:07:42,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=27000.0, ans=0.0 2023-09-28 13:07:43,737 INFO [train.py:1039] (2/4) Epoch 1, batch 4050, loss[loss=0.3336, simple_loss=0.3778, pruned_loss=0.1447, over 24513.00 frames. ], tot_loss[loss=0.35, simple_loss=0.3772, pruned_loss=0.1614, over 4725751.21 frames. ], batch size: 66, lr: 4.22e-02, grad_scale: 32.0 2023-09-28 13:07:45,445 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:07:45,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:07:46,988 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:07:48,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-28 13:07:49,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:07:50,381 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=27000.0, ans=0.2 2023-09-28 13:07:52,483 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=13.77 vs. limit=15.0 2023-09-28 13:07:53,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:07:53,432 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=27000.0, ans=0.005 2023-09-28 13:07:54,885 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:07:56,317 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 13:07:57,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:07:58,117 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=27066.666666666668, ans=0.0 2023-09-28 13:07:59,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:08:02,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:08:04,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-28 13:08:07,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 13:08:09,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-28 13:08:09,171 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-28 13:08:12,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:08:17,768 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=27133.333333333332, ans=0.125 2023-09-28 13:08:21,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-28 13:08:22,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:08:23,005 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer_na.min_abs, batch_count=27133.333333333332, ans=0.02 2023-09-28 13:08:26,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:08:26,238 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=27133.333333333332, ans=0.125 2023-09-28 13:08:27,770 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=27133.333333333332, ans=0.125 2023-09-28 13:08:29,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:08:29,112 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:08:29,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:08:32,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:08:35,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-28 13:08:35,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 13:08:37,589 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:08:39,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-28 13:08:43,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:08:52,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-28 13:08:52,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:08:52,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:08:54,998 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=27266.666666666668, ans=0.125 2023-09-28 13:08:56,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-28 13:08:56,191 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-28 13:08:56,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:08:57,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:08:58,170 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=27266.666666666668, ans=0.0 2023-09-28 13:08:59,351 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:08:59,396 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:09:01,255 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=27266.666666666668, ans=0.125 2023-09-28 13:09:05,402 INFO [train.py:1039] (2/4) Epoch 1, batch 4100, loss[loss=0.2916, simple_loss=0.3368, pruned_loss=0.1232, over 24655.00 frames. ], tot_loss[loss=0.35, simple_loss=0.3778, pruned_loss=0.1611, over 4727516.87 frames. ], batch size: 60, lr: 4.22e-02, grad_scale: 32.0 2023-09-28 13:09:08,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-28 13:09:10,228 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-28 13:09:11,995 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=27333.333333333332, ans=0.1 2023-09-28 13:09:14,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-28 13:09:14,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-28 13:09:14,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:09:16,693 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:09:16,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:09:18,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:09:18,244 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-28 13:09:21,669 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=27400.0, ans=0.125 2023-09-28 13:09:22,903 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:09:24,294 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.841e+02 2.870e+02 3.486e+02 4.089e+02 6.314e+02, threshold=6.972e+02, percent-clipped=0.0 2023-09-28 13:09:24,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:09:24,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:09:26,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:09:32,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 13:09:33,610 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:09:33,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:09:33,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-28 13:09:35,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:09:35,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:09:35,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:09:35,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:09:36,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-28 13:09:37,433 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=27466.666666666668, ans=0.125 2023-09-28 13:09:38,587 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:09:38,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-28 13:09:40,340 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:09:43,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:09:43,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-28 13:09:43,668 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=27466.666666666668, ans=0.004898550724637681 2023-09-28 13:09:46,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:09:46,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:09:47,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:09:49,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-28 13:09:49,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:09:50,965 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:09:55,992 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-28 13:09:56,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:09:57,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-28 13:09:57,782 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=27533.333333333332, ans=0.125 2023-09-28 13:10:01,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:10:08,480 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:10:09,247 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.16 vs. limit=15.0 2023-09-28 13:10:11,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:10:12,966 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:10:17,062 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.61 vs. limit=22.5 2023-09-28 13:10:17,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:10:17,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:10:18,159 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=27600.0, ans=0.1 2023-09-28 13:10:20,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:10:22,966 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=27600.0, ans=0.125 2023-09-28 13:10:24,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:10:27,476 INFO [train.py:1039] (2/4) Epoch 1, batch 4150, loss[loss=0.3483, simple_loss=0.3521, pruned_loss=0.1722, over 23459.00 frames. ], tot_loss[loss=0.3499, simple_loss=0.3777, pruned_loss=0.161, over 4735539.07 frames. ], batch size: 285, lr: 4.21e-02, grad_scale: 32.0 2023-09-28 13:10:29,084 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-28 13:10:30,532 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 13:10:32,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:10:32,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:10:34,174 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.80 vs. limit=6.0 2023-09-28 13:10:35,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-28 13:10:35,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:10:35,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-28 13:10:37,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-28 13:10:37,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-28 13:10:38,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:10:44,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:10:44,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:10:47,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:10:47,540 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:10:47,760 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=27733.333333333332, ans=0.125 2023-09-28 13:10:49,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-28 13:10:50,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 13:10:51,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:10:52,815 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.50 vs. limit=15.0 2023-09-28 13:10:53,303 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-28 13:10:57,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:11:03,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:11:03,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-28 13:11:06,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-28 13:11:06,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:11:07,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-28 13:11:07,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:11:07,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:11:11,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:11:13,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:11:16,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-28 13:11:19,998 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-28 13:11:21,688 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 13:11:22,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:11:22,970 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-28 13:11:24,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:11:25,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-28 13:11:27,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:11:28,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:11:29,217 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=27866.666666666668, ans=0.125 2023-09-28 13:11:30,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:11:30,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-28 13:11:30,614 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:11:30,617 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-28 13:11:32,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 13:11:35,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-28 13:11:35,380 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:11:37,258 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 13:11:37,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 13:11:38,759 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-28 13:11:40,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:11:40,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 13:11:40,246 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:11:43,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:11:43,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-28 13:11:43,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-28 13:11:49,438 INFO [train.py:1039] (2/4) Epoch 1, batch 4200, loss[loss=0.3301, simple_loss=0.3601, pruned_loss=0.1501, over 23257.00 frames. ], tot_loss[loss=0.3478, simple_loss=0.3759, pruned_loss=0.1599, over 4722072.98 frames. ], batch size: 105, lr: 4.20e-02, grad_scale: 32.0 2023-09-28 13:11:49,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:11:51,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-28 13:11:52,863 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 13:11:54,479 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:11:54,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:11:56,053 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:11:56,056 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:11:59,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-28 13:12:00,819 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=28000.0, ans=0.0 2023-09-28 13:12:01,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-28 13:12:02,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:12:05,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:12:08,512 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.106e+02 3.443e+02 4.074e+02 5.504e+02 1.074e+03, threshold=8.148e+02, percent-clipped=10.0 2023-09-28 13:12:08,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:12:13,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-28 13:12:14,980 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:12:15,028 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:12:16,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-28 13:12:16,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:12:18,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:12:18,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:12:18,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:12:22,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:12:23,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-28 13:12:23,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:12:28,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-28 13:12:28,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:12:30,805 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=13.99 vs. limit=15.0 2023-09-28 13:12:33,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:12:33,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:12:35,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:12:36,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-28 13:12:36,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:12:38,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:12:39,963 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=28200.0, ans=0.125 2023-09-28 13:12:44,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-28 13:12:46,198 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:12:46,616 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=28200.0, ans=0.004739130434782609 2023-09-28 13:12:51,550 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=28200.0, ans=0.1 2023-09-28 13:12:51,810 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=28200.0, ans=0.125 2023-09-28 13:12:52,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-28 13:12:55,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-28 13:12:58,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:13:03,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 13:13:03,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:13:06,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-28 13:13:08,424 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=28266.666666666668, ans=0.1 2023-09-28 13:13:11,087 INFO [train.py:1039] (2/4) Epoch 1, batch 4250, loss[loss=0.3432, simple_loss=0.3616, pruned_loss=0.1625, over 23427.00 frames. ], tot_loss[loss=0.3453, simple_loss=0.3741, pruned_loss=0.1582, over 4714746.66 frames. ], batch size: 285, lr: 4.20e-02, grad_scale: 32.0 2023-09-28 13:13:11,218 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-28 13:13:14,624 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=28333.333333333332, ans=0.125 2023-09-28 13:13:15,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:13:17,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-28 13:13:20,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:13:25,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-28 13:13:25,743 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-28 13:13:27,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:13:29,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:13:33,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:13:37,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:13:38,542 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:13:40,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:13:41,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:13:41,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:13:43,379 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:13:43,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:13:46,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:13:48,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:13:49,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-28 13:13:51,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-28 13:13:51,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:13:52,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:13:53,462 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:13:53,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:13:53,610 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:13:55,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:13:58,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-28 13:14:00,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:14:04,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:14:05,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:14:07,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-28 13:14:07,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:14:07,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-28 13:14:07,833 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=28533.333333333332, ans=0.125 2023-09-28 13:14:10,502 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-28 13:14:12,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-28 13:14:13,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:14:13,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:14:16,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-28 13:14:18,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 13:14:19,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-28 13:14:23,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:14:26,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:14:28,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:14:28,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:14:31,282 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:14:32,635 INFO [train.py:1039] (2/4) Epoch 1, batch 4300, loss[loss=0.3367, simple_loss=0.3583, pruned_loss=0.1576, over 23808.00 frames. ], tot_loss[loss=0.3436, simple_loss=0.373, pruned_loss=0.1571, over 4716742.79 frames. ], batch size: 164, lr: 4.19e-02, grad_scale: 32.0 2023-09-28 13:14:32,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:14:32,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:14:34,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-28 13:14:36,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:14:42,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:14:43,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:14:46,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:14:52,609 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.815e+02 2.945e+02 3.341e+02 4.373e+02 7.931e+02, threshold=6.681e+02, percent-clipped=0.0 2023-09-28 13:14:53,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:14:53,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-28 13:14:54,535 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:14:54,835 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=28733.333333333332, ans=0.004623188405797102 2023-09-28 13:14:59,532 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:14:59,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:14:59,610 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-28 13:15:02,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 13:15:04,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 13:15:07,411 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-28 13:15:08,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:15:08,918 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-28 13:15:11,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 13:15:12,231 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:15:12,712 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=6.98 vs. limit=10.0 2023-09-28 13:15:15,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:15:15,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:15:16,586 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:15:18,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:15:18,502 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=28800.0, ans=0.125 2023-09-28 13:15:19,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:15:19,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-28 13:15:19,854 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-28 13:15:20,168 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=28800.0, ans=0.05 2023-09-28 13:15:22,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:15:25,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:15:25,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 13:15:27,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:15:27,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:15:27,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-28 13:15:27,399 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-28 13:15:27,700 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=28866.666666666668, ans=0.125 2023-09-28 13:15:29,582 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-28 13:15:29,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:15:29,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-28 13:15:29,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-28 13:15:34,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:15:36,034 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-28 13:15:37,516 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:15:39,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:15:39,230 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:15:40,840 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-28 13:15:42,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 13:15:42,898 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:15:44,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:15:44,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:15:44,551 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:15:47,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:15:51,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:15:52,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:15:52,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:15:55,546 INFO [train.py:1039] (2/4) Epoch 1, batch 4350, loss[loss=0.3482, simple_loss=0.3885, pruned_loss=0.1539, over 24550.00 frames. ], tot_loss[loss=0.3449, simple_loss=0.3742, pruned_loss=0.1578, over 4712067.13 frames. ], batch size: 71, lr: 4.19e-02, grad_scale: 32.0 2023-09-28 13:15:57,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-28 13:15:58,791 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-28 13:16:02,762 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:16:05,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:16:07,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:16:07,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:16:09,290 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=29000.0, ans=0.125 2023-09-28 13:16:12,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:16:17,744 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:16:19,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:16:19,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:16:19,728 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=29066.666666666668, ans=0.0 2023-09-28 13:16:24,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:16:27,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:16:28,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-28 13:16:30,728 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=29133.333333333332, ans=0.125 2023-09-28 13:16:33,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-28 13:16:33,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:16:33,932 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=29133.333333333332, ans=0.125 2023-09-28 13:16:37,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:16:41,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:16:44,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-28 13:16:48,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:16:50,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 13:16:55,304 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-28 13:16:56,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:16:58,323 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-28 13:16:59,832 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-28 13:16:59,932 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-28 13:16:59,941 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:17:01,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:17:02,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:17:02,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:17:02,366 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=29266.666666666668, ans=0.125 2023-09-28 13:17:03,642 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:17:03,697 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:17:05,442 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-28 13:17:05,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:17:05,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:17:05,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:17:06,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-28 13:17:07,118 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-28 13:17:07,125 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-28 13:17:07,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-28 13:17:10,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:17:10,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:17:12,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:17:12,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:17:15,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-28 13:17:18,355 INFO [train.py:1039] (2/4) Epoch 1, batch 4400, loss[loss=0.3105, simple_loss=0.3362, pruned_loss=0.1424, over 15886.00 frames. ], tot_loss[loss=0.3445, simple_loss=0.3741, pruned_loss=0.1575, over 4715195.95 frames. ], batch size: 34, lr: 4.18e-02, grad_scale: 32.0 2023-09-28 13:17:18,452 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-28 13:17:18,463 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:17:24,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:17:24,986 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:17:26,746 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:17:28,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-28 13:17:30,221 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-28 13:17:30,293 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-28 13:17:30,333 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-28 13:17:31,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 13:17:31,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:17:34,782 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-28 13:17:36,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:17:37,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:17:38,339 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.230e+02 3.318e+02 4.079e+02 5.261e+02 1.011e+03, threshold=8.157e+02, percent-clipped=12.0 2023-09-28 13:17:38,440 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-28 13:17:38,756 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=29400.0, ans=0.2 2023-09-28 13:17:41,584 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:17:41,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-28 13:17:41,681 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-28 13:17:41,870 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 13:17:44,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-28 13:17:44,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-28 13:17:45,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-28 13:17:45,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:17:47,147 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:17:48,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:17:50,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:17:50,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=29466.666666666668, ans=0.125 2023-09-28 13:17:51,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-28 13:17:51,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-28 13:17:53,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:17:54,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:17:54,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:17:56,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:17:56,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:17:56,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-28 13:17:58,680 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-28 13:17:58,952 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer_na.min_abs, batch_count=29466.666666666668, ans=0.02 2023-09-28 13:18:03,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:18:09,896 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:18:12,150 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-28 13:18:15,219 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:18:18,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:18:20,574 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:18:21,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-28 13:18:21,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:18:22,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:18:22,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:18:22,245 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=29533.333333333332, ans=0.0 2023-09-28 13:18:23,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-28 13:18:28,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-28 13:18:29,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-28 13:18:31,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-28 13:18:31,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:18:31,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-28 13:18:31,969 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:18:35,099 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:18:38,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-28 13:18:41,496 INFO [train.py:1039] (2/4) Epoch 1, batch 4450, loss[loss=0.3733, simple_loss=0.384, pruned_loss=0.1813, over 23961.00 frames. ], tot_loss[loss=0.3451, simple_loss=0.3745, pruned_loss=0.1579, over 4698887.72 frames. ], batch size: 196, lr: 4.17e-02, grad_scale: 32.0 2023-09-28 13:18:41,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:18:45,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:18:45,312 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:18:52,272 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:18:52,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:18:56,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:18:58,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:19:00,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:19:01,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:19:01,880 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=29733.333333333332, ans=0.125 2023-09-28 13:19:03,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-28 13:19:03,119 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:19:03,323 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=29733.333333333332, ans=0.2 2023-09-28 13:19:05,131 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:19:05,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:19:05,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:19:08,134 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 13:19:10,605 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=29733.333333333332, ans=0.1 2023-09-28 13:19:14,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:19:14,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:19:16,266 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:19:17,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:19:19,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:19:22,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 13:19:24,989 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-28 13:19:25,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-28 13:19:25,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:19:25,463 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=29800.0, ans=0.004391304347826087 2023-09-28 13:19:26,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:19:27,044 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=29800.0, ans=0.0 2023-09-28 13:19:28,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-28 13:19:31,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-28 13:19:37,048 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:19:37,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-28 13:19:37,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:19:37,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:19:37,193 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:19:37,215 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:19:37,653 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=29866.666666666668, ans=0.125 2023-09-28 13:19:40,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:19:44,033 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-28 13:19:44,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-28 13:19:46,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 13:19:49,210 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:19:50,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:19:52,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:19:53,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 13:19:56,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-28 13:19:59,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-28 13:20:01,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:20:03,927 INFO [train.py:1039] (2/4) Epoch 1, batch 4500, loss[loss=0.346, simple_loss=0.3884, pruned_loss=0.1518, over 24347.00 frames. ], tot_loss[loss=0.3444, simple_loss=0.3748, pruned_loss=0.157, over 4704048.93 frames. ], batch size: 74, lr: 4.17e-02, grad_scale: 32.0 2023-09-28 13:20:07,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:20:08,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-28 13:20:08,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-28 13:20:10,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:20:15,941 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:20:17,357 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:20:17,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 13:20:18,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:20:18,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:20:19,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:20:23,486 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.079e+02 2.950e+02 3.479e+02 4.293e+02 6.506e+02, threshold=6.959e+02, percent-clipped=0.0 2023-09-28 13:20:33,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:20:33,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:20:37,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:20:39,123 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:20:39,264 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:20:45,296 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 13:20:49,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:20:53,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 13:20:54,184 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=30200.0, ans=0.1 2023-09-28 13:20:57,051 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:20:57,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-28 13:20:57,922 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:20:59,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:20:59,925 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.55 vs. limit=15.0 2023-09-28 13:21:02,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:21:02,103 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:21:05,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:21:05,539 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-28 13:21:05,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 13:21:05,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:21:09,244 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=30266.666666666668, ans=0.004289855072463768 2023-09-28 13:21:10,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:21:10,452 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:21:14,219 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:21:15,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-28 13:21:15,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:21:17,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-28 13:21:20,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-28 13:21:20,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-28 13:21:22,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-28 13:21:25,459 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.31 vs. limit=8.0 2023-09-28 13:21:26,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-28 13:21:26,380 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=30333.333333333332, ans=0.2 2023-09-28 13:21:27,380 INFO [train.py:1039] (2/4) Epoch 1, batch 4550, loss[loss=0.3612, simple_loss=0.3711, pruned_loss=0.1756, over 23961.00 frames. ], tot_loss[loss=0.3417, simple_loss=0.3721, pruned_loss=0.1556, over 4701451.05 frames. ], batch size: 196, lr: 4.16e-02, grad_scale: 32.0 2023-09-28 13:21:27,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:21:29,458 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=30333.333333333332, ans=0.125 2023-09-28 13:21:32,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:21:32,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:21:37,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:21:40,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:21:42,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:21:44,329 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:21:44,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:21:44,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:21:46,664 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:21:48,026 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:21:48,529 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=30400.0, ans=0.1 2023-09-28 13:21:51,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:21:53,667 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-28 13:21:55,140 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-28 13:21:55,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:21:56,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-28 13:22:01,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-28 13:22:02,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:22:07,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-28 13:22:09,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 13:22:13,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:22:13,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:22:13,985 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-28 13:22:16,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-28 13:22:19,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:22:19,399 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=30533.333333333332, ans=0.0 2023-09-28 13:22:19,455 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=30533.333333333332, ans=0.2 2023-09-28 13:22:20,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:22:20,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:22:22,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:22:24,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-28 13:22:24,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-28 13:22:24,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:22:24,855 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=30533.333333333332, ans=0.125 2023-09-28 13:22:26,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-28 13:22:28,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-28 13:22:28,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:22:28,375 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:22:29,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:22:31,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:22:31,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:22:31,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 13:22:32,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-28 13:22:35,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:22:35,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 13:22:35,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-28 13:22:35,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:22:37,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-28 13:22:40,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:22:40,941 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:22:41,656 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=22.92 vs. limit=22.5 2023-09-28 13:22:42,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:22:44,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:22:44,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-28 13:22:45,723 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:22:48,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-28 13:22:51,241 INFO [train.py:1039] (2/4) Epoch 1, batch 4600, loss[loss=0.3709, simple_loss=0.3855, pruned_loss=0.1782, over 23880.00 frames. ], tot_loss[loss=0.3395, simple_loss=0.3696, pruned_loss=0.1547, over 4680989.31 frames. ], batch size: 195, lr: 4.15e-02, grad_scale: 32.0 2023-09-28 13:22:51,610 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 13:22:52,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:22:54,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:22:56,809 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:22:58,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:22:58,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:22:59,909 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-28 13:23:02,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:23:04,634 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.45 vs. limit=22.5 2023-09-28 13:23:05,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:23:06,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:23:08,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:23:08,527 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=30733.333333333332, ans=0.125 2023-09-28 13:23:11,495 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.016e+02 3.238e+02 3.793e+02 5.269e+02 1.285e+03, threshold=7.587e+02, percent-clipped=5.0 2023-09-28 13:23:15,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-28 13:23:17,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:23:20,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:23:20,373 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=30733.333333333332, ans=0.0 2023-09-28 13:23:25,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:23:25,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:23:31,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-28 13:23:31,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 13:23:32,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:23:37,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:23:39,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:23:40,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:23:42,938 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=30866.666666666668, ans=0.0 2023-09-28 13:23:44,169 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-28 13:23:44,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-28 13:23:49,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:23:49,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:23:51,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:23:51,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 13:23:52,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:23:52,629 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=30866.666666666668, ans=0.125 2023-09-28 13:23:53,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-28 13:23:53,911 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:23:55,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:23:56,899 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:23:57,008 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:23:57,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:23:57,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-28 13:23:57,478 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=30933.333333333332, ans=0.0 2023-09-28 13:23:59,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-28 13:23:59,353 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=30933.333333333332, ans=0.1 2023-09-28 13:24:00,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-28 13:24:00,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:24:02,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:24:02,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:24:04,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:24:05,161 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=23.54 vs. limit=22.5 2023-09-28 13:24:14,391 INFO [train.py:1039] (2/4) Epoch 1, batch 4650, loss[loss=0.3032, simple_loss=0.359, pruned_loss=0.1237, over 24628.00 frames. ], tot_loss[loss=0.3389, simple_loss=0.3694, pruned_loss=0.1542, over 4691507.78 frames. ], batch size: 68, lr: 4.15e-02, grad_scale: 32.0 2023-09-28 13:24:14,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:24:18,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:24:18,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:24:19,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:24:19,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:24:19,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:24:21,060 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:24:24,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-28 13:24:27,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:24:28,868 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-28 13:24:28,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:24:30,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-28 13:24:30,480 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:24:32,397 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-28 13:24:32,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-28 13:24:32,447 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:24:33,876 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:24:36,124 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=23.60 vs. limit=22.5 2023-09-28 13:24:38,834 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:24:39,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:24:39,033 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-28 13:24:44,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:24:44,316 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=31066.666666666668, ans=0.0 2023-09-28 13:24:45,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-28 13:24:46,088 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=31133.333333333332, ans=0.00410144927536232 2023-09-28 13:24:48,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:24:48,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:24:50,213 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-28 13:24:51,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:24:55,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:24:58,683 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:25:03,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:25:06,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:25:07,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:25:09,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:25:11,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-28 13:25:12,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-28 13:25:14,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 13:25:14,822 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-28 13:25:15,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:25:15,784 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.50 vs. limit=15.0 2023-09-28 13:25:19,061 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=31266.666666666668, ans=0.1 2023-09-28 13:25:20,556 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=31266.666666666668, ans=0.125 2023-09-28 13:25:23,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-28 13:25:23,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:25:23,346 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-28 13:25:23,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:25:24,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:25:24,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:25:26,281 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:25:30,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:25:30,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:25:30,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:25:33,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:25:34,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:25:34,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 13:25:34,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-28 13:25:35,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-28 13:25:36,394 INFO [train.py:1039] (2/4) Epoch 1, batch 4700, loss[loss=0.3779, simple_loss=0.3994, pruned_loss=0.1782, over 23293.00 frames. ], tot_loss[loss=0.3416, simple_loss=0.3715, pruned_loss=0.1558, over 4697811.42 frames. ], batch size: 93, lr: 4.14e-02, grad_scale: 32.0 2023-09-28 13:25:36,585 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-28 13:25:44,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:25:45,020 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:25:47,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:25:48,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:25:50,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 13:25:55,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-28 13:25:56,852 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.950e+02 3.070e+02 3.636e+02 4.699e+02 2.301e+03, threshold=7.272e+02, percent-clipped=9.0 2023-09-28 13:25:56,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-28 13:26:00,066 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:26:02,150 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:26:02,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:26:05,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:26:12,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 13:26:14,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 13:26:17,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:26:22,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-28 13:26:24,586 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:26:26,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:26:30,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-28 13:26:31,700 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:26:36,882 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:26:36,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-28 13:26:39,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:26:39,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:26:41,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:26:41,673 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:26:41,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-28 13:26:43,247 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-28 13:26:43,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:26:46,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:26:46,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:26:46,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-28 13:26:48,120 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=31600.0, ans=0.95 2023-09-28 13:26:49,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:26:52,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-28 13:26:55,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:26:55,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:26:59,832 INFO [train.py:1039] (2/4) Epoch 1, batch 4750, loss[loss=0.3174, simple_loss=0.3635, pruned_loss=0.1357, over 24456.00 frames. ], tot_loss[loss=0.3436, simple_loss=0.3727, pruned_loss=0.1573, over 4694277.63 frames. ], batch size: 66, lr: 4.14e-02, grad_scale: 32.0 2023-09-28 13:27:03,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:27:03,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:27:03,557 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=31666.666666666668, ans=0.125 2023-09-28 13:27:04,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-28 13:27:04,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:27:08,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-28 13:27:10,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:27:10,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:27:11,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:27:16,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-28 13:27:19,499 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:27:19,694 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=31733.333333333332, ans=0.125 2023-09-28 13:27:22,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-28 13:27:23,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:27:26,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:27:26,948 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:27:26,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:27:27,090 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-28 13:27:27,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-28 13:27:33,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-28 13:27:36,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:27:39,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:27:40,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 13:27:40,775 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-28 13:27:40,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:27:44,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-28 13:27:46,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:27:49,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-28 13:27:49,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-28 13:27:49,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:27:49,379 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:27:50,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:27:52,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 13:27:52,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-28 13:27:53,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-28 13:27:54,159 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=31866.666666666668, ans=0.2 2023-09-28 13:27:56,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:27:57,322 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=31866.666666666668, ans=0.125 2023-09-28 13:28:00,005 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:28:00,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-28 13:28:00,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:28:02,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:28:04,259 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:28:04,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:28:05,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 13:28:09,020 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=31933.333333333332, ans=0.1 2023-09-28 13:28:10,281 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:28:10,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-28 13:28:11,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-28 13:28:11,886 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-28 13:28:15,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-28 13:28:15,457 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:28:16,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-28 13:28:21,661 INFO [train.py:1039] (2/4) Epoch 1, batch 4800, loss[loss=0.3391, simple_loss=0.388, pruned_loss=0.1451, over 24469.00 frames. ], tot_loss[loss=0.3424, simple_loss=0.3725, pruned_loss=0.1562, over 4711610.79 frames. ], batch size: 69, lr: 4.13e-02, grad_scale: 32.0 2023-09-28 13:28:22,699 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.70 vs. limit=15.0 2023-09-28 13:28:23,282 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:28:23,353 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:28:29,440 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:28:29,777 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=32000.0, ans=0.2 2023-09-28 13:28:30,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:28:30,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:28:31,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-28 13:28:32,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:28:32,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:28:36,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:28:41,494 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.966e+02 2.727e+02 3.187e+02 3.824e+02 7.207e+02, threshold=6.374e+02, percent-clipped=0.0 2023-09-28 13:28:41,661 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:28:43,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:28:43,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:28:44,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:28:44,782 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 13:28:44,802 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:28:46,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:28:49,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:28:54,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:28:54,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:28:55,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-28 13:28:57,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 13:28:59,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:29:00,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-28 13:29:02,013 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-28 13:29:02,136 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:29:02,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:29:03,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:29:03,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:29:03,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-28 13:29:06,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:29:06,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:29:10,024 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:29:16,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:29:18,553 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:29:23,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-28 13:29:23,874 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:29:23,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:29:23,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 13:29:25,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:29:28,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:29:29,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:29:29,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:29:30,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:29:30,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:29:32,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:29:36,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:29:36,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:29:36,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:29:37,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-28 13:29:39,321 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.69 vs. limit=15.0 2023-09-28 13:29:40,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-28 13:29:40,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:29:40,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:29:40,247 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:29:40,248 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:29:43,143 INFO [train.py:1039] (2/4) Epoch 1, batch 4850, loss[loss=0.3521, simple_loss=0.3668, pruned_loss=0.1687, over 23797.00 frames. ], tot_loss[loss=0.3414, simple_loss=0.3721, pruned_loss=0.1554, over 4722482.87 frames. ], batch size: 179, lr: 4.12e-02, grad_scale: 32.0 2023-09-28 13:29:43,321 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:29:53,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-28 13:29:55,211 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:29:58,544 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=32400.0, ans=0.0 2023-09-28 13:30:01,093 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:30:02,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 13:30:02,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:30:05,185 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=32400.0, ans=0.2 2023-09-28 13:30:06,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:30:06,940 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.73 vs. limit=12.0 2023-09-28 13:30:07,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:30:07,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:30:07,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-28 13:30:12,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:30:15,099 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:30:15,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 13:30:15,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 13:30:15,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-28 13:30:17,479 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=16.42 vs. limit=15.0 2023-09-28 13:30:18,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:30:18,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:30:25,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:30:25,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-28 13:30:26,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-28 13:30:27,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 13:30:31,967 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.38 vs. limit=15.0 2023-09-28 13:30:35,004 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=32533.333333333332, ans=0.125 2023-09-28 13:30:36,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:30:36,344 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-28 13:30:37,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:30:37,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:30:39,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-28 13:30:39,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-28 13:30:39,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:30:43,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-28 13:30:43,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:30:43,477 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=32533.333333333332, ans=0.125 2023-09-28 13:30:44,558 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:30:45,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-28 13:30:54,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:30:59,988 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:31:00,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:31:05,067 INFO [train.py:1039] (2/4) Epoch 1, batch 4900, loss[loss=0.3028, simple_loss=0.3615, pruned_loss=0.1221, over 24636.00 frames. ], tot_loss[loss=0.3371, simple_loss=0.3694, pruned_loss=0.1524, over 4740707.83 frames. ], batch size: 68, lr: 4.12e-02, grad_scale: 32.0 2023-09-28 13:31:05,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-28 13:31:05,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:31:12,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:31:12,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:31:12,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:31:17,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-28 13:31:22,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-28 13:31:23,895 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.271e+02 3.049e+02 3.510e+02 4.577e+02 9.864e+02, threshold=7.020e+02, percent-clipped=4.0 2023-09-28 13:31:25,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-28 13:31:27,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-28 13:31:27,847 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-28 13:31:29,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:31:29,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:31:29,288 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:31:29,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:31:31,350 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-28 13:31:33,355 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=44.74 vs. limit=15.0 2023-09-28 13:31:35,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-28 13:31:37,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 13:31:39,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-28 13:31:39,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-28 13:31:39,621 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=32800.0, ans=0.125 2023-09-28 13:31:41,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:31:42,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:31:44,237 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:31:44,260 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-28 13:31:47,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:31:48,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:31:50,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-28 13:31:50,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-28 13:31:53,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-28 13:31:55,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-28 13:31:55,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:31:56,389 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.49 vs. limit=22.5 2023-09-28 13:31:57,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:31:57,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:31:57,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 13:31:57,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:31:58,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-28 13:31:59,041 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=32866.666666666664, ans=0.125 2023-09-28 13:32:02,367 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:32:03,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-28 13:32:06,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:32:09,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-28 13:32:09,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:32:10,674 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-28 13:32:10,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-28 13:32:18,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:32:20,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:32:21,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-28 13:32:22,148 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=32933.333333333336, ans=0.0 2023-09-28 13:32:23,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 13:32:23,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:32:23,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:32:27,203 INFO [train.py:1039] (2/4) Epoch 1, batch 4950, loss[loss=0.3571, simple_loss=0.3881, pruned_loss=0.1631, over 23939.00 frames. ], tot_loss[loss=0.3364, simple_loss=0.3684, pruned_loss=0.1522, over 4742515.19 frames. ], batch size: 86, lr: 4.11e-02, grad_scale: 32.0 2023-09-28 13:32:28,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:32:28,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:32:28,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:32:28,895 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-28 13:32:29,217 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=33000.0, ans=0.125 2023-09-28 13:32:30,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 13:32:30,693 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 13:32:33,440 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:32:35,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 13:32:37,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-28 13:32:38,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-28 13:32:38,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-28 13:32:40,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-28 13:32:40,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:32:40,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:32:40,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-28 13:32:40,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:32:41,930 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=10.49 vs. limit=15.0 2023-09-28 13:32:42,496 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:32:43,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:32:45,421 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:32:46,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:32:48,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:32:48,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:32:53,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 13:32:55,648 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=33066.666666666664, ans=0.5 2023-09-28 13:32:58,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:32:59,198 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=33133.333333333336, ans=0.003666666666666666 2023-09-28 13:33:00,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:33:01,074 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.31 vs. limit=15.0 2023-09-28 13:33:02,220 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:33:02,296 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:33:03,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:33:05,393 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-28 13:33:05,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-28 13:33:09,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:33:10,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:33:12,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:33:12,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:33:12,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:33:14,444 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-28 13:33:16,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:33:19,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-28 13:33:20,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:33:22,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:33:23,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:33:23,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-28 13:33:25,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:33:25,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:33:26,388 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=14.86 vs. limit=22.5 2023-09-28 13:33:30,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:33:32,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:33:32,516 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:33:32,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:33:34,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:33:34,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:33:37,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:33:37,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:33:38,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:33:38,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-28 13:33:42,523 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:33:42,805 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=33266.666666666664, ans=0.0036376811594202906 2023-09-28 13:33:49,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-28 13:33:50,626 INFO [train.py:1039] (2/4) Epoch 1, batch 5000, loss[loss=0.3642, simple_loss=0.3544, pruned_loss=0.187, over 19119.00 frames. ], tot_loss[loss=0.3369, simple_loss=0.368, pruned_loss=0.1529, over 4729791.45 frames. ], batch size: 389, lr: 4.10e-02, grad_scale: 16.0 2023-09-28 13:33:50,694 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-28 13:33:54,223 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=33333.333333333336, ans=0.125 2023-09-28 13:33:57,261 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:33:57,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-28 13:33:58,819 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-28 13:34:02,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-28 13:34:04,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:34:04,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-28 13:34:05,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:34:06,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 13:34:07,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-28 13:34:07,543 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:34:07,646 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:34:09,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-28 13:34:09,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:34:09,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:34:10,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-28 13:34:12,240 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.427e+02 3.057e+02 3.609e+02 4.472e+02 7.216e+02, threshold=7.218e+02, percent-clipped=2.0 2023-09-28 13:34:12,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-28 13:34:12,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:34:12,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-28 13:34:12,551 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 13:34:13,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:34:15,867 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 13:34:15,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-28 13:34:15,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-28 13:34:17,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-28 13:34:17,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:34:18,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:34:19,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-28 13:34:19,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-28 13:34:19,340 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=33400.0, ans=0.0 2023-09-28 13:34:20,774 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:34:22,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:34:24,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-28 13:34:24,828 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=33466.666666666664, ans=0.0035942028985507255 2023-09-28 13:34:25,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-28 13:34:26,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:34:27,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:34:32,095 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-28 13:34:36,243 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:34:37,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:34:37,812 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:34:42,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-28 13:34:42,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:34:43,758 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:34:43,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:34:45,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-28 13:34:45,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:34:49,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:34:50,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:34:55,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-28 13:35:00,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:35:02,329 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=33600.0, ans=0.0 2023-09-28 13:35:03,055 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.41 vs. limit=15.0 2023-09-28 13:35:11,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:35:12,672 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:35:12,683 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:35:12,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:35:12,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 13:35:12,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-28 13:35:12,992 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=33666.666666666664, ans=0.125 2023-09-28 13:35:14,083 INFO [train.py:1039] (2/4) Epoch 1, batch 5050, loss[loss=0.4547, simple_loss=0.4339, pruned_loss=0.2378, over 19220.00 frames. ], tot_loss[loss=0.3374, simple_loss=0.3683, pruned_loss=0.1532, over 4719577.74 frames. ], batch size: 388, lr: 4.10e-02, grad_scale: 16.0 2023-09-28 13:35:14,220 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:35:19,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:35:19,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-28 13:35:19,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:35:21,077 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=33666.666666666664, ans=0.5 2023-09-28 13:35:22,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:35:26,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:35:26,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-28 13:35:27,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:35:27,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:35:30,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 13:35:31,156 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 13:35:32,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:35:32,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-28 13:35:32,864 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=33733.333333333336, ans=0.125 2023-09-28 13:35:37,980 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys.whitening_limit, batch_count=33733.333333333336, ans=6.0 2023-09-28 13:35:42,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-28 13:35:44,380 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-28 13:35:46,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:35:46,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-28 13:35:48,000 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:35:48,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:35:49,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:35:49,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:35:49,656 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-28 13:35:51,217 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-28 13:35:52,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:35:54,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:35:57,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:35:57,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-28 13:36:00,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:36:04,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-28 13:36:05,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:36:05,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:36:07,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:36:07,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:36:09,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:36:10,591 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.77 vs. limit=10.0 2023-09-28 13:36:12,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:36:12,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:36:12,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:36:14,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:36:14,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-28 13:36:15,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:36:18,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:36:19,636 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=33933.333333333336, ans=0.0 2023-09-28 13:36:23,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:36:23,071 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-28 13:36:23,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-28 13:36:23,814 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.62 vs. limit=6.0 2023-09-28 13:36:24,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:36:24,766 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:36:26,120 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-28 13:36:29,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:36:30,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-28 13:36:30,607 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:36:32,870 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.30 vs. limit=15.0 2023-09-28 13:36:33,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:36:34,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:36:34,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-28 13:36:36,981 INFO [train.py:1039] (2/4) Epoch 1, batch 5100, loss[loss=0.3301, simple_loss=0.3783, pruned_loss=0.1409, over 24412.00 frames. ], tot_loss[loss=0.3375, simple_loss=0.3685, pruned_loss=0.1532, over 4709989.88 frames. ], batch size: 77, lr: 4.09e-02, grad_scale: 16.0 2023-09-28 13:36:37,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-28 13:36:37,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:36:37,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:36:39,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:36:42,567 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-28 13:36:46,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:36:47,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-28 13:36:49,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-28 13:36:49,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:36:49,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:36:50,524 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=17.95 vs. limit=22.5 2023-09-28 13:36:52,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:36:54,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-28 13:36:55,027 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-28 13:36:59,954 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.897e+02 3.081e+02 3.841e+02 4.720e+02 7.459e+02, threshold=7.682e+02, percent-clipped=1.0 2023-09-28 13:37:00,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:37:01,570 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:37:06,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:37:09,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-28 13:37:09,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:37:13,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:37:13,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-28 13:37:14,920 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=34133.333333333336, ans=0.0 2023-09-28 13:37:17,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:37:17,766 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:37:17,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-28 13:37:20,079 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-28 13:37:21,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:37:21,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-28 13:37:21,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-28 13:37:26,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:37:26,528 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=34200.0, ans=0.1 2023-09-28 13:37:33,226 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:37:36,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-28 13:37:36,324 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=34200.0, ans=0.0 2023-09-28 13:37:38,235 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-28 13:37:38,259 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-28 13:37:40,372 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=5.97 vs. limit=12.0 2023-09-28 13:37:41,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-28 13:37:41,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:37:42,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-28 13:37:46,909 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-28 13:37:49,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 13:37:51,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:37:53,032 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-28 13:37:56,703 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-28 13:37:56,783 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-28 13:37:56,986 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=34266.666666666664, ans=0.125 2023-09-28 13:38:01,385 INFO [train.py:1039] (2/4) Epoch 1, batch 5150, loss[loss=0.322, simple_loss=0.37, pruned_loss=0.137, over 24551.00 frames. ], tot_loss[loss=0.3385, simple_loss=0.3698, pruned_loss=0.1536, over 4715479.67 frames. ], batch size: 71, lr: 4.09e-02, grad_scale: 16.0 2023-09-28 13:38:03,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:38:03,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:38:03,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:38:04,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:38:04,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 13:38:06,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:38:06,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-28 13:38:06,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-28 13:38:08,159 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-28 13:38:08,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:38:08,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-28 13:38:10,995 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:38:11,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 13:38:13,568 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:38:14,991 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:38:19,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 13:38:19,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-28 13:38:19,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:38:19,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:38:23,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-28 13:38:23,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:38:23,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:38:25,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:38:25,122 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:38:25,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-28 13:38:25,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=34400.0, ans=0.125 2023-09-28 13:38:26,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:38:26,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:38:28,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 13:38:32,112 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-28 13:38:33,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:38:40,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-28 13:38:40,798 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=34466.666666666664, ans=0.1 2023-09-28 13:38:43,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-28 13:38:46,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:38:54,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:38:55,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:39:00,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:39:00,351 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:39:02,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-28 13:39:07,458 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:39:08,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:39:08,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:39:12,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:39:12,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:39:13,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-28 13:39:20,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:39:20,957 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.79 vs. limit=15.0 2023-09-28 13:39:21,791 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 13:39:24,066 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:39:24,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:39:24,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-28 13:39:25,921 INFO [train.py:1039] (2/4) Epoch 1, batch 5200, loss[loss=0.3468, simple_loss=0.3654, pruned_loss=0.1641, over 23604.00 frames. ], tot_loss[loss=0.339, simple_loss=0.3701, pruned_loss=0.1539, over 4712145.12 frames. ], batch size: 149, lr: 4.08e-02, grad_scale: 32.0 2023-09-28 13:39:26,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-28 13:39:26,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:39:26,322 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-28 13:39:27,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:39:30,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:39:32,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:39:35,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:39:39,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-28 13:39:41,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:39:42,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:39:44,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:39:44,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:39:44,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:39:45,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-28 13:39:47,263 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.249e+02 2.894e+02 3.453e+02 4.172e+02 7.980e+02, threshold=6.907e+02, percent-clipped=1.0 2023-09-28 13:39:47,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 13:39:49,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:39:51,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-28 13:39:54,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-28 13:39:55,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:39:55,661 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-28 13:39:56,326 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=26.38 vs. limit=22.5 2023-09-28 13:39:57,548 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-28 13:39:59,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-28 13:40:00,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:40:00,609 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-28 13:40:00,619 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:40:02,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:40:02,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:40:03,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-28 13:40:03,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:40:07,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:40:12,432 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-28 13:40:12,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-28 13:40:12,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-28 13:40:18,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-28 13:40:18,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 13:40:26,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-28 13:40:26,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:40:28,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-28 13:40:28,483 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:40:28,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-28 13:40:29,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:40:30,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 13:40:32,033 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=34933.333333333336, ans=0.1 2023-09-28 13:40:33,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:40:33,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:40:37,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:40:39,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:40:39,003 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:40:44,378 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=34933.333333333336, ans=0.5 2023-09-28 13:40:47,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:40:47,677 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-28 13:40:47,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:40:49,067 INFO [train.py:1039] (2/4) Epoch 1, batch 5250, loss[loss=0.3244, simple_loss=0.3664, pruned_loss=0.1412, over 23604.00 frames. ], tot_loss[loss=0.3365, simple_loss=0.3683, pruned_loss=0.1524, over 4720533.72 frames. ], batch size: 94, lr: 4.07e-02, grad_scale: 32.0 2023-09-28 13:40:49,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:40:49,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:40:49,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-28 13:40:50,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:40:53,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:40:56,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:40:56,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:40:58,390 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:41:03,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:41:05,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:41:05,567 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=35066.666666666664, ans=0.1 2023-09-28 13:41:10,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:41:11,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 13:41:13,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-28 13:41:13,213 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:41:14,740 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:41:52,835 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=35266.666666666664, ans=0.1 2023-09-28 13:42:03,681 INFO [train.py:1039] (2/4) Epoch 1, batch 5300, loss[loss=0.3378, simple_loss=0.3745, pruned_loss=0.1506, over 24439.00 frames. ], tot_loss[loss=0.3351, simple_loss=0.3659, pruned_loss=0.1522, over 4708982.10 frames. ], batch size: 63, lr: 4.07e-02, grad_scale: 32.0 2023-09-28 13:42:18,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:42:18,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-28 13:42:18,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 13:42:18,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:42:19,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:42:19,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:42:19,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:42:19,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:42:19,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:42:19,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:42:19,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 13:42:20,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:42:20,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 13:42:20,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-28 13:42:20,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 13:42:20,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-28 13:42:20,755 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 13:42:20,879 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 13:42:21,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:42:21,920 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:42:21,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:42:22,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:42:22,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:42:22,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:42:22,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:42:22,785 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:42:22,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:42:22,963 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:42:22,970 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:42:22,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:42:23,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:42:23,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 13:42:23,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:42:24,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:42:24,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 13:42:24,475 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 13:42:25,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 13:42:25,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:42:25,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 13:42:25,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-28 13:42:25,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 13:42:26,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:42:26,334 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:42:26,484 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 13:42:26,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 13:42:26,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 13:42:26,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:42:26,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 13:42:26,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 13:42:27,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-28 13:42:27,334 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-28 13:42:40,294 INFO [train.py:1039] (2/4) Epoch 2, batch 0, loss[loss=0.3178, simple_loss=0.3494, pruned_loss=0.1431, over 24482.00 frames. ], tot_loss[loss=0.3178, simple_loss=0.3494, pruned_loss=0.1431, over 24482.00 frames. ], batch size: 58, lr: 3.99e-02, grad_scale: 32.0 2023-09-28 13:42:40,294 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-28 13:42:56,267 INFO [train.py:1071] (2/4) Epoch 2, validation: loss=0.367, simple_loss=0.3421, pruned_loss=0.196, over 1125622.00 frames. 2023-09-28 13:42:56,268 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21046MB 2023-09-28 13:42:56,726 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer_ff3.min_abs, batch_count=35413.333333333336, ans=0.2 2023-09-28 13:42:57,795 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.107e+02 3.100e+02 3.616e+02 4.753e+02 9.571e+02, threshold=7.232e+02, percent-clipped=1.0 2023-09-28 13:42:59,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-28 13:42:59,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:43:02,991 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:43:07,017 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:43:07,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:43:07,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:43:08,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-28 13:43:10,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-28 13:43:13,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:43:14,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:43:17,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:43:17,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:43:19,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:43:19,347 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:43:22,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-28 13:43:23,259 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=35480.0, ans=0.125 2023-09-28 13:43:24,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:43:33,352 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:43:33,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:43:35,789 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-28 13:43:39,811 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=35546.666666666664, ans=10.0 2023-09-28 13:43:41,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:43:41,664 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:43:44,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:43:46,982 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.96 vs. limit=10.0 2023-09-28 13:43:49,145 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:43:53,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:43:55,413 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=35613.333333333336, ans=0.125 2023-09-28 13:44:00,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-28 13:44:02,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-28 13:44:02,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:44:02,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:44:03,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:44:05,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:44:05,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-28 13:44:08,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:44:10,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:44:10,678 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=16.13 vs. limit=22.5 2023-09-28 13:44:15,700 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:44:19,229 INFO [train.py:1039] (2/4) Epoch 2, batch 50, loss[loss=0.2894, simple_loss=0.3463, pruned_loss=0.1163, over 24492.00 frames. ], tot_loss[loss=0.3419, simple_loss=0.3732, pruned_loss=0.1554, over 1052860.87 frames. ], batch size: 63, lr: 3.98e-02, grad_scale: 32.0 2023-09-28 13:44:19,290 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-28 13:44:19,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:44:22,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:44:24,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:44:24,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-28 13:44:25,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 13:44:25,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:44:30,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:44:32,010 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:44:32,821 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.47 vs. limit=15.0 2023-09-28 13:44:35,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:44:35,614 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=35813.333333333336, ans=0.0 2023-09-28 13:44:36,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-28 13:44:36,989 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:44:40,405 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=35813.333333333336, ans=0.003084057971014492 2023-09-28 13:44:43,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-28 13:44:44,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-28 13:44:46,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-28 13:44:49,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:44:52,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:44:52,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:44:52,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:44:54,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-28 13:44:54,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 13:44:54,554 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:45:02,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:45:05,147 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:45:05,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 13:45:05,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-28 13:45:07,472 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:45:08,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:45:08,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-28 13:45:10,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:45:11,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-28 13:45:12,330 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=35946.666666666664, ans=0.125 2023-09-28 13:45:17,012 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=35946.666666666664, ans=0.0030550724637681166 2023-09-28 13:45:18,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:45:18,342 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:45:22,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:45:22,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:45:22,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-28 13:45:26,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-28 13:45:27,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-28 13:45:27,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:45:29,135 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-28 13:45:30,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:45:30,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:45:30,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-28 13:45:30,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-28 13:45:32,252 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-28 13:45:32,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:45:33,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:45:35,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-28 13:45:35,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-28 13:45:35,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:45:37,039 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:45:38,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-28 13:45:38,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:45:42,081 INFO [train.py:1039] (2/4) Epoch 2, batch 100, loss[loss=0.359, simple_loss=0.3796, pruned_loss=0.1692, over 23852.00 frames. ], tot_loss[loss=0.3373, simple_loss=0.3703, pruned_loss=0.1521, over 1880265.93 frames. ], batch size: 195, lr: 3.97e-02, grad_scale: 32.0 2023-09-28 13:45:43,570 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.251e+02 2.783e+02 3.462e+02 4.523e+02 1.049e+03, threshold=6.924e+02, percent-clipped=4.0 2023-09-28 13:45:43,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:45:45,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:45:48,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:45:50,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-28 13:45:50,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:45:52,627 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=12.19 vs. limit=15.0 2023-09-28 13:45:56,041 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:45:56,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:45:56,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:45:56,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:45:56,276 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=36080.0, ans=0.125 2023-09-28 13:45:57,474 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:45:59,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-28 13:46:00,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-28 13:46:00,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:46:02,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:46:02,757 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:46:05,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-28 13:46:07,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:46:08,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:46:09,331 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=36146.666666666664, ans=0.1 2023-09-28 13:46:10,471 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-28 13:46:10,654 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=36146.666666666664, ans=0.125 2023-09-28 13:46:12,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 13:46:15,734 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-28 13:46:15,773 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-28 13:46:15,953 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:46:15,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:46:20,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-28 13:46:20,716 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=36213.333333333336, ans=0.025 2023-09-28 13:46:22,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:46:22,389 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=36213.333333333336, ans=0.125 2023-09-28 13:46:23,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:46:31,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:46:32,828 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-28 13:46:34,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-28 13:46:39,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-28 13:46:41,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:46:42,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:46:45,827 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:46:48,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:46:49,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:46:52,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:46:53,006 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=36346.666666666664, ans=0.125 2023-09-28 13:46:54,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:46:54,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:46:55,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:46:55,795 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:46:57,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-28 13:46:57,316 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-28 13:46:57,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:46:58,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:47:01,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:47:01,734 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:47:01,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 13:47:01,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 13:47:01,870 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-28 13:47:01,879 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:47:03,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:47:05,404 INFO [train.py:1039] (2/4) Epoch 2, batch 150, loss[loss=0.314, simple_loss=0.3394, pruned_loss=0.1443, over 23511.00 frames. ], tot_loss[loss=0.3353, simple_loss=0.3679, pruned_loss=0.1513, over 2513775.63 frames. ], batch size: 134, lr: 3.97e-02, grad_scale: 32.0 2023-09-28 13:47:05,502 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:47:05,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:47:06,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:47:08,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:47:14,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:47:14,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:47:14,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:47:17,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:47:17,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:47:20,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-28 13:47:20,681 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:47:26,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-28 13:47:26,211 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=36480.0, ans=0.2 2023-09-28 13:47:27,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-28 13:47:27,229 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-28 13:47:27,564 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=36480.0, ans=0.1 2023-09-28 13:47:30,339 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:47:30,346 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:47:31,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:47:33,243 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:47:33,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:47:33,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:47:33,391 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:47:34,954 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-28 13:47:36,707 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=36546.666666666664, ans=0.0 2023-09-28 13:47:37,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:47:44,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:47:47,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:47:50,706 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-28 13:47:54,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:47:56,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:47:56,247 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:47:59,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:48:01,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:48:01,547 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=36613.333333333336, ans=0.002910144927536231 2023-09-28 13:48:02,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-28 13:48:04,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:48:06,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-28 13:48:10,089 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=36613.333333333336, ans=0.125 2023-09-28 13:48:11,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:48:11,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:48:11,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:48:11,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:48:14,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:48:16,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 13:48:19,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:48:21,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:48:21,314 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:48:23,758 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=36680.0, ans=0.125 2023-09-28 13:48:24,929 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-28 13:48:24,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-28 13:48:24,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:48:25,030 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-28 13:48:30,006 INFO [train.py:1039] (2/4) Epoch 2, batch 200, loss[loss=0.348, simple_loss=0.368, pruned_loss=0.164, over 23252.00 frames. ], tot_loss[loss=0.333, simple_loss=0.3675, pruned_loss=0.1493, over 3013464.48 frames. ], batch size: 119, lr: 3.96e-02, grad_scale: 32.0 2023-09-28 13:48:30,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:48:31,372 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.151e+02 2.761e+02 3.224e+02 4.160e+02 8.294e+02, threshold=6.447e+02, percent-clipped=1.0 2023-09-28 13:48:33,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:48:34,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:48:38,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-28 13:48:39,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:48:40,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:48:42,087 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-28 13:48:44,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-28 13:48:45,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:48:45,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:48:47,545 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=36813.333333333336, ans=0.125 2023-09-28 13:48:50,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:48:52,124 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:48:52,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:49:10,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:49:10,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:49:12,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:49:13,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:49:13,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 13:49:13,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 13:49:13,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:49:15,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:49:15,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:49:17,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:49:18,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-28 13:49:19,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 13:49:19,078 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:49:23,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:49:25,417 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=22.68 vs. limit=22.5 2023-09-28 13:49:27,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:49:30,887 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.13 vs. limit=15.0 2023-09-28 13:49:34,053 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=15.03 vs. limit=22.5 2023-09-28 13:49:34,998 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:49:35,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:49:44,089 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:49:45,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-28 13:49:47,231 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:49:47,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-28 13:49:47,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:49:47,403 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:49:50,246 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=23.70 vs. limit=22.5 2023-09-28 13:49:51,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-28 13:49:52,427 INFO [train.py:1039] (2/4) Epoch 2, batch 250, loss[loss=0.3358, simple_loss=0.3502, pruned_loss=0.1607, over 23668.00 frames. ], tot_loss[loss=0.3304, simple_loss=0.3648, pruned_loss=0.1479, over 3395832.69 frames. ], batch size: 232, lr: 3.95e-02, grad_scale: 32.0 2023-09-28 13:49:52,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:49:52,549 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-28 13:49:54,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:49:55,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:49:57,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:49:57,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:50:00,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:50:01,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:50:02,820 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=37080.0, ans=0.125 2023-09-28 13:50:03,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:50:08,366 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.48 vs. limit=15.0 2023-09-28 13:50:09,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:50:21,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:50:24,377 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:50:24,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:50:31,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-28 13:50:33,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-28 13:50:34,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:50:34,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:50:36,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 13:50:36,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:50:36,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:50:37,888 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:50:40,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-28 13:50:40,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:50:43,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:50:43,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-28 13:50:43,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:50:45,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:50:46,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:50:46,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:50:48,926 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.18 vs. limit=15.0 2023-09-28 13:50:49,533 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:50:51,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:50:52,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:50:52,700 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=37280.0, ans=0.125 2023-09-28 13:50:57,523 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-28 13:51:01,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:51:01,519 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.max_positive, batch_count=37346.666666666664, ans=0.95 2023-09-28 13:51:01,679 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=37346.666666666664, ans=0.125 2023-09-28 13:51:02,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:51:08,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:51:11,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:51:14,098 INFO [train.py:1039] (2/4) Epoch 2, batch 300, loss[loss=0.317, simple_loss=0.3391, pruned_loss=0.1475, over 23828.00 frames. ], tot_loss[loss=0.3279, simple_loss=0.3621, pruned_loss=0.1469, over 3666759.48 frames. ], batch size: 212, lr: 3.95e-02, grad_scale: 32.0 2023-09-28 13:51:14,236 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-28 13:51:15,696 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.217e+02 3.009e+02 3.543e+02 4.126e+02 1.008e+03, threshold=7.086e+02, percent-clipped=8.0 2023-09-28 13:51:15,839 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:51:15,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:51:18,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-28 13:51:18,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-28 13:51:19,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:51:19,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-28 13:51:19,766 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=1.424e-02 2023-09-28 13:51:23,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:51:24,714 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:51:29,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:51:29,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-28 13:51:31,310 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:51:31,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 13:51:31,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-28 13:51:32,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:51:39,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:51:41,574 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=37480.0, ans=0.0 2023-09-28 13:51:42,766 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:51:42,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-28 13:51:44,745 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=37480.0, ans=0.0 2023-09-28 13:51:45,950 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-28 13:51:47,302 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:51:49,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:51:51,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:51:51,146 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-28 13:51:51,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:51:55,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:51:55,973 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=37546.666666666664, ans=0.125 2023-09-28 13:51:56,383 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=22.56 vs. limit=22.5 2023-09-28 13:51:57,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:51:57,873 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:52:01,140 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-28 13:52:01,147 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-28 13:52:01,388 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=37546.666666666664, ans=0.1 2023-09-28 13:52:04,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:52:07,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:52:10,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-28 13:52:10,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:52:16,272 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:52:17,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:52:17,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-28 13:52:22,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:52:22,370 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 13:52:25,406 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:52:25,639 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 13:52:27,018 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-28 13:52:27,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-28 13:52:29,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 13:52:29,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:52:30,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-28 13:52:32,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:52:33,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:52:33,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:52:34,527 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=22.13 vs. limit=22.5 2023-09-28 13:52:35,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:52:36,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:52:37,453 INFO [train.py:1039] (2/4) Epoch 2, batch 350, loss[loss=0.3482, simple_loss=0.3724, pruned_loss=0.162, over 23887.00 frames. ], tot_loss[loss=0.3258, simple_loss=0.36, pruned_loss=0.1458, over 3894723.37 frames. ], batch size: 195, lr: 3.94e-02, grad_scale: 32.0 2023-09-28 13:52:41,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:52:41,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 13:52:43,858 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.87 vs. limit=12.0 2023-09-28 13:52:44,523 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:52:51,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:52:54,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:52:54,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:52:56,398 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-28 13:52:57,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:52:57,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-28 13:52:59,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:53:01,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-28 13:53:03,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:53:06,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-28 13:53:06,775 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=37813.333333333336, ans=0.05 2023-09-28 13:53:08,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:53:10,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:53:12,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:53:13,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:53:13,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:53:15,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:53:15,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:53:15,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:53:16,997 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=37880.0, ans=0.1 2023-09-28 13:53:17,040 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=37880.0, ans=0.1 2023-09-28 13:53:18,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:53:18,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:53:25,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:53:25,394 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-28 13:53:26,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:53:26,849 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:53:30,181 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=37946.666666666664, ans=0.125 2023-09-28 13:53:31,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-28 13:53:31,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:53:38,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:53:38,137 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:53:38,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:53:39,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-28 13:53:42,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:53:42,879 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-28 13:53:44,448 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-28 13:53:45,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:53:45,407 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=38013.333333333336, ans=0.1 2023-09-28 13:53:45,993 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.00 vs. limit=22.5 2023-09-28 13:53:48,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:53:48,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-28 13:53:49,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:53:51,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:53:55,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:53:55,411 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=38013.333333333336, ans=0.1 2023-09-28 13:53:56,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:53:56,657 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:53:58,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:54:01,044 INFO [train.py:1039] (2/4) Epoch 2, batch 400, loss[loss=0.3575, simple_loss=0.3803, pruned_loss=0.1674, over 23389.00 frames. ], tot_loss[loss=0.3258, simple_loss=0.3595, pruned_loss=0.1461, over 4065328.09 frames. ], batch size: 119, lr: 3.94e-02, grad_scale: 32.0 2023-09-28 13:54:01,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:54:02,561 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.015e+02 2.905e+02 3.509e+02 4.327e+02 7.986e+02, threshold=7.018e+02, percent-clipped=1.0 2023-09-28 13:54:05,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-28 13:54:07,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-28 13:54:07,841 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:54:07,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:54:09,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:54:09,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:54:12,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:54:14,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:54:17,133 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-28 13:54:18,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-28 13:54:18,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:54:20,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-28 13:54:22,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:54:24,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:54:24,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:54:26,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-28 13:54:26,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:54:26,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:54:26,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:54:26,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:54:30,351 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-28 13:54:30,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-28 13:54:32,128 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=38146.666666666664, ans=0.0 2023-09-28 13:54:35,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:54:36,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:54:38,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-28 13:54:40,364 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-28 13:54:42,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:54:42,397 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=38213.333333333336, ans=0.0 2023-09-28 13:54:45,198 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:54:51,543 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-28 13:54:53,796 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-28 13:54:55,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-28 13:54:59,706 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.75 vs. limit=15.0 2023-09-28 13:55:00,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:55:01,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:55:01,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-28 13:55:05,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:55:07,544 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=38346.666666666664, ans=0.0 2023-09-28 13:55:08,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 13:55:10,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:55:13,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:55:13,592 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.min_positive, batch_count=38346.666666666664, ans=0.025 2023-09-28 13:55:15,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-28 13:55:17,073 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-28 13:55:18,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-28 13:55:18,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:55:20,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:55:23,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-28 13:55:24,947 INFO [train.py:1039] (2/4) Epoch 2, batch 450, loss[loss=0.3442, simple_loss=0.3721, pruned_loss=0.1582, over 23755.00 frames. ], tot_loss[loss=0.3276, simple_loss=0.361, pruned_loss=0.1471, over 4206964.81 frames. ], batch size: 164, lr: 3.93e-02, grad_scale: 32.0 2023-09-28 13:55:25,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 13:55:26,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:55:26,656 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-28 13:55:30,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-28 13:55:30,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:55:31,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:55:31,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-28 13:55:31,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-28 13:55:31,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:55:33,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:55:37,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:55:47,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:55:47,236 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:55:51,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-28 13:55:51,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-28 13:55:55,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-28 13:55:57,349 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=38546.666666666664, ans=0.07 2023-09-28 13:55:58,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:55:58,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:56:03,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:56:04,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:56:08,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-28 13:56:08,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-28 13:56:09,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-28 13:56:10,006 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:56:12,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:56:12,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:56:15,226 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-28 13:56:15,241 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-28 13:56:15,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:56:18,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:56:19,231 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=38613.333333333336, ans=0.1 2023-09-28 13:56:19,904 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=6.28 vs. limit=10.0 2023-09-28 13:56:20,363 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-28 13:56:23,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-28 13:56:23,617 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-28 13:56:25,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-28 13:56:25,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-28 13:56:27,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:56:30,794 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-28 13:56:30,861 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 13:56:32,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-28 13:56:32,744 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=38680.0, ans=0.125 2023-09-28 13:56:36,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:56:37,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-28 13:56:38,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-28 13:56:39,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:56:45,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:56:45,970 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.92 vs. limit=15.0 2023-09-28 13:56:46,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:56:48,166 INFO [train.py:1039] (2/4) Epoch 2, batch 500, loss[loss=0.3886, simple_loss=0.3955, pruned_loss=0.1909, over 23449.00 frames. ], tot_loss[loss=0.327, simple_loss=0.3613, pruned_loss=0.1464, over 4332885.17 frames. ], batch size: 285, lr: 3.92e-02, grad_scale: 32.0 2023-09-28 13:56:48,308 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:56:48,342 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-28 13:56:50,427 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.109e+02 2.855e+02 3.493e+02 4.304e+02 8.305e+02, threshold=6.986e+02, percent-clipped=1.0 2023-09-28 13:56:53,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:56:53,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:56:55,136 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:56:55,151 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-28 13:56:56,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-28 13:56:56,845 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:57:00,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 13:57:05,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 13:57:06,708 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:57:06,963 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=38813.333333333336, ans=0.1 2023-09-28 13:57:08,330 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:57:08,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:57:09,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:57:12,114 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=38813.333333333336, ans=0.125 2023-09-28 13:57:13,909 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.37 vs. limit=10.0 2023-09-28 13:57:18,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:57:18,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-28 13:57:19,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-28 13:57:19,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:57:21,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-28 13:57:21,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:57:25,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:57:25,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:57:26,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:57:26,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:57:28,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-28 13:57:33,678 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-28 13:57:34,043 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 13:57:38,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:57:38,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:57:39,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:57:41,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:57:41,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-28 13:57:43,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-28 13:57:46,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:57:47,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:57:49,385 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=38946.666666666664, ans=0.125 2023-09-28 13:57:50,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:57:53,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:57:57,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:58:01,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-28 13:58:01,774 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:58:01,803 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:58:06,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-28 13:58:08,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-28 13:58:09,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:58:11,444 INFO [train.py:1039] (2/4) Epoch 2, batch 550, loss[loss=0.3384, simple_loss=0.3871, pruned_loss=0.1449, over 24101.00 frames. ], tot_loss[loss=0.3285, simple_loss=0.3634, pruned_loss=0.1468, over 4422183.50 frames. ], batch size: 80, lr: 3.92e-02, grad_scale: 32.0 2023-09-28 13:58:14,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-28 13:58:16,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-28 13:58:16,278 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:58:16,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-28 13:58:17,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:58:17,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:58:19,141 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:58:19,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:58:20,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:58:20,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:58:22,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:58:22,736 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=39080.0, ans=0.0 2023-09-28 13:58:23,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-28 13:58:23,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:58:29,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:58:29,417 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=39146.666666666664, ans=0.125 2023-09-28 13:58:29,459 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=39146.666666666664, ans=0.125 2023-09-28 13:58:30,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:58:30,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:58:31,408 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.06 vs. limit=15.0 2023-09-28 13:58:32,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:58:35,567 WARNING [train.py:1197] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-28 13:58:37,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-28 13:58:38,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:58:43,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:58:43,167 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:58:44,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-28 13:58:47,877 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:58:47,885 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-28 13:58:48,273 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=39213.333333333336, ans=0.125 2023-09-28 13:58:49,437 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:58:50,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 13:58:52,612 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:58:52,813 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=39213.333333333336, ans=0.125 2023-09-28 13:58:52,824 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=39213.333333333336, ans=0.2 2023-09-28 13:58:54,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 13:58:54,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-28 13:58:55,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:58:56,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-28 13:58:59,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-28 13:58:59,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:58:59,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:59:01,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:59:01,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:59:03,713 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=39280.0, ans=0.1 2023-09-28 13:59:03,734 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=39280.0, ans=0.0 2023-09-28 13:59:04,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:59:06,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:59:11,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:59:14,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:59:14,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 13:59:15,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:59:17,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:59:18,966 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:59:20,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:59:22,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-28 13:59:22,080 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-28 13:59:22,894 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.27 vs. limit=15.0 2023-09-28 13:59:29,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-28 13:59:32,874 INFO [train.py:1039] (2/4) Epoch 2, batch 600, loss[loss=0.2798, simple_loss=0.3242, pruned_loss=0.1177, over 16085.00 frames. ], tot_loss[loss=0.3299, simple_loss=0.3647, pruned_loss=0.1475, over 4466233.71 frames. ], batch size: 35, lr: 3.91e-02, grad_scale: 32.0 2023-09-28 13:59:33,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-28 13:59:34,968 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.039e+02 2.924e+02 3.724e+02 4.722e+02 8.175e+02, threshold=7.448e+02, percent-clipped=4.0 2023-09-28 13:59:36,618 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:59:36,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 13:59:36,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:59:44,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:59:45,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 13:59:47,927 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-28 13:59:49,534 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-28 13:59:52,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:59:54,211 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:59:55,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-28 13:59:55,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:00:00,803 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=39480.0, ans=0.125 2023-09-28 14:00:02,183 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=39480.0, ans=0.1 2023-09-28 14:00:04,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-28 14:00:07,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:00:07,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:00:08,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:00:13,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:00:13,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:00:13,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:00:20,946 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:00:21,086 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=39546.666666666664, ans=0.2 2023-09-28 14:00:27,232 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:00:27,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:00:27,260 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:00:34,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-28 14:00:39,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-28 14:00:39,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:00:39,968 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=39680.0, ans=0.1 2023-09-28 14:00:44,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-28 14:00:46,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:00:46,899 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=9.77 vs. limit=15.0 2023-09-28 14:00:50,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-28 14:00:50,678 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:00:50,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:00:57,159 INFO [train.py:1039] (2/4) Epoch 2, batch 650, loss[loss=0.3183, simple_loss=0.3645, pruned_loss=0.1361, over 24465.00 frames. ], tot_loss[loss=0.328, simple_loss=0.3635, pruned_loss=0.1462, over 4524749.41 frames. ], batch size: 69, lr: 3.90e-02, grad_scale: 32.0 2023-09-28 14:00:57,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 14:01:00,077 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-28 14:01:01,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:01:03,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:01:04,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:01:07,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-28 14:01:08,315 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=39746.666666666664, ans=0.0 2023-09-28 14:01:09,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:01:14,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 14:01:14,175 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:01:17,833 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:01:19,529 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=39813.333333333336, ans=0.0 2023-09-28 14:01:22,857 WARNING [train.py:1197] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-28 14:01:23,050 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=39813.333333333336, ans=0.0 2023-09-28 14:01:25,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:01:25,088 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:01:30,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:01:31,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 14:01:32,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:01:33,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:01:34,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 14:01:36,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:01:38,057 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:01:39,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 14:01:40,951 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-28 14:01:40,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:01:40,985 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:01:42,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:01:44,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:01:44,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:01:45,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:01:47,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-28 14:01:48,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:01:49,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:01:49,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-28 14:01:49,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:01:51,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 14:01:52,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-28 14:01:54,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-28 14:01:54,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:01:54,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:01:56,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:01:56,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:01:58,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:02:02,963 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=40013.333333333336, ans=0.125 2023-09-28 14:02:05,511 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:02:05,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:02:08,466 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:02:10,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:02:10,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 14:02:10,509 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=40013.333333333336, ans=0.1 2023-09-28 14:02:10,519 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=40013.333333333336, ans=0.0 2023-09-28 14:02:11,665 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:02:19,116 INFO [train.py:1039] (2/4) Epoch 2, batch 700, loss[loss=0.3012, simple_loss=0.3431, pruned_loss=0.1297, over 24591.00 frames. ], tot_loss[loss=0.3277, simple_loss=0.363, pruned_loss=0.1462, over 4559695.99 frames. ], batch size: 60, lr: 3.90e-02, grad_scale: 32.0 2023-09-28 14:02:19,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 14:02:19,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:02:20,670 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.035e+02 2.820e+02 3.434e+02 4.210e+02 9.710e+02, threshold=6.868e+02, percent-clipped=2.0 2023-09-28 14:02:20,798 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:02:20,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:02:24,714 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-28 14:02:26,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-28 14:02:29,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-28 14:02:29,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:02:32,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:02:35,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-28 14:02:38,971 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:02:42,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:02:43,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:02:45,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-28 14:02:46,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:02:48,577 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=40146.666666666664, ans=0.125 2023-09-28 14:02:49,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:02:51,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 14:02:51,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:02:54,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-28 14:02:54,759 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=40213.333333333336, ans=0.0021275362318840564 2023-09-28 14:02:57,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-28 14:03:01,913 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-28 14:03:01,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:03:03,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:03:08,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:03:08,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-28 14:03:14,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:03:14,482 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=40280.0, ans=0.07 2023-09-28 14:03:15,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 14:03:15,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-28 14:03:18,799 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=40280.0, ans=0.0021130434782608704 2023-09-28 14:03:21,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:03:21,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:03:24,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:03:26,840 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=40346.666666666664, ans=0.125 2023-09-28 14:03:28,203 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=40346.666666666664, ans=0.1 2023-09-28 14:03:29,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:03:29,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-28 14:03:35,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-28 14:03:35,532 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-28 14:03:38,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:03:40,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:03:40,231 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:03:40,870 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.10 vs. limit=22.5 2023-09-28 14:03:41,791 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:03:41,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-28 14:03:43,880 INFO [train.py:1039] (2/4) Epoch 2, batch 750, loss[loss=0.362, simple_loss=0.3715, pruned_loss=0.1763, over 22838.00 frames. ], tot_loss[loss=0.326, simple_loss=0.362, pruned_loss=0.145, over 4593646.91 frames. ], batch size: 322, lr: 3.89e-02, grad_scale: 32.0 2023-09-28 14:03:45,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-28 14:03:47,165 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-28 14:03:47,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-28 14:03:48,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-28 14:03:48,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-28 14:03:48,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:03:49,086 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=40413.333333333336, ans=0.025 2023-09-28 14:03:51,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-28 14:03:53,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:03:53,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-28 14:03:54,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:03:56,358 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:03:56,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-28 14:03:57,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:03:59,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:04:00,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 14:04:03,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:04:05,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:04:05,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:04:07,384 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-28 14:04:09,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:04:11,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:04:14,043 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:04:14,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-28 14:04:16,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-28 14:04:16,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:04:19,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-28 14:04:19,645 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-28 14:04:19,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-28 14:04:21,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-28 14:04:21,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 14:04:22,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:04:28,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-28 14:04:28,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:04:28,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 14:04:32,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:04:34,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:04:35,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-28 14:04:35,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:04:36,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-28 14:04:36,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:04:40,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:04:42,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-28 14:04:44,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:04:48,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:04:50,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:04:51,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:04:54,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 14:04:57,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-28 14:04:57,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:04:59,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:05:02,399 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:05:02,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:05:05,503 INFO [train.py:1039] (2/4) Epoch 2, batch 800, loss[loss=0.3402, simple_loss=0.3828, pruned_loss=0.1487, over 24049.00 frames. ], tot_loss[loss=0.3264, simple_loss=0.3625, pruned_loss=0.1451, over 4617189.67 frames. ], batch size: 80, lr: 3.88e-02, grad_scale: 32.0 2023-09-28 14:05:05,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:05:05,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-28 14:05:07,072 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.053e+02 2.783e+02 3.464e+02 4.160e+02 6.985e+02, threshold=6.929e+02, percent-clipped=3.0 2023-09-28 14:05:14,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:05:14,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:05:17,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:05:17,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:05:18,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:05:20,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:05:22,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:05:26,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:05:27,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:05:30,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-28 14:05:31,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:05:31,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:05:33,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:05:33,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:05:34,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-28 14:05:34,685 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:05:34,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-28 14:05:37,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:05:39,570 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:05:41,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:05:41,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:05:45,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:05:45,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:05:49,322 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=40880.0, ans=0.1 2023-09-28 14:05:52,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:05:52,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:05:52,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-28 14:05:53,858 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-28 14:05:53,903 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-28 14:05:53,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 14:05:53,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:05:57,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:05:57,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:06:03,297 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-28 14:06:03,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-28 14:06:06,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-28 14:06:07,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 14:06:11,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:06:15,568 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:06:15,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-28 14:06:17,274 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:06:19,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-28 14:06:21,380 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=23.00 vs. limit=22.5 2023-09-28 14:06:22,988 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=41013.333333333336, ans=0.07 2023-09-28 14:06:25,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:06:27,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:06:27,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-28 14:06:27,743 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:06:29,438 INFO [train.py:1039] (2/4) Epoch 2, batch 850, loss[loss=0.3227, simple_loss=0.3517, pruned_loss=0.1469, over 23730.00 frames. ], tot_loss[loss=0.3284, simple_loss=0.364, pruned_loss=0.1464, over 4636490.67 frames. ], batch size: 232, lr: 3.88e-02, grad_scale: 32.0 2023-09-28 14:06:29,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:06:29,666 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:06:31,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-28 14:06:31,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:06:31,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:06:33,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:06:33,906 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:06:37,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 14:06:37,207 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:06:38,697 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-28 14:06:38,938 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=41080.0, ans=0.0019391304347826082 2023-09-28 14:06:40,148 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-28 14:06:40,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-28 14:06:41,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:06:41,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:06:44,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:06:44,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:06:46,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 14:06:50,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:06:50,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:06:50,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-28 14:06:51,260 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=41146.666666666664, ans=0.1 2023-09-28 14:06:54,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-28 14:06:59,969 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:07:00,220 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:07:01,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-28 14:07:05,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-28 14:07:05,411 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-28 14:07:08,423 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-28 14:07:08,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:07:08,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:07:08,475 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 14:07:08,858 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=41213.333333333336, ans=0.125 2023-09-28 14:07:09,440 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=14.12 vs. limit=22.5 2023-09-28 14:07:11,434 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:07:12,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:07:12,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-28 14:07:17,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:07:19,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:07:20,850 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:07:20,889 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-28 14:07:21,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:07:22,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-28 14:07:22,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-28 14:07:28,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:07:28,785 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:07:28,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:07:30,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:07:32,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:07:34,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:07:37,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-28 14:07:39,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-28 14:07:39,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:07:41,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-28 14:07:41,744 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=41346.666666666664, ans=0.125 2023-09-28 14:07:49,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-28 14:07:49,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:07:51,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-28 14:07:51,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:07:51,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:07:53,224 INFO [train.py:1039] (2/4) Epoch 2, batch 900, loss[loss=0.3777, simple_loss=0.389, pruned_loss=0.1832, over 22781.00 frames. ], tot_loss[loss=0.3294, simple_loss=0.3653, pruned_loss=0.1467, over 4658447.50 frames. ], batch size: 322, lr: 3.87e-02, grad_scale: 32.0 2023-09-28 14:07:54,703 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.205e+02 2.862e+02 3.366e+02 4.167e+02 7.237e+02, threshold=6.733e+02, percent-clipped=1.0 2023-09-28 14:07:54,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-28 14:07:59,746 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:08:02,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:08:02,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-28 14:08:06,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:08:07,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-28 14:08:09,362 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-28 14:08:09,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:08:09,510 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:08:10,989 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 14:08:11,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:08:16,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=41480.0, ans=0.0 2023-09-28 14:08:24,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:08:24,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:08:24,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 14:08:28,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:08:31,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-28 14:08:33,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:08:33,800 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.40 vs. limit=22.5 2023-09-28 14:08:39,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-28 14:08:40,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-28 14:08:42,579 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-28 14:08:42,699 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-28 14:08:47,645 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-28 14:08:47,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:08:49,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 14:08:52,182 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=41613.333333333336, ans=0.125 2023-09-28 14:08:58,040 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:08:58,056 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:08:59,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-28 14:08:59,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:09:04,716 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-28 14:09:06,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:09:06,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:09:07,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:09:07,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:09:09,723 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-28 14:09:11,121 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-28 14:09:14,174 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-28 14:09:14,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-28 14:09:16,241 INFO [train.py:1039] (2/4) Epoch 2, batch 950, loss[loss=0.3244, simple_loss=0.3412, pruned_loss=0.1538, over 23423.00 frames. ], tot_loss[loss=0.3293, simple_loss=0.3646, pruned_loss=0.147, over 4668886.82 frames. ], batch size: 285, lr: 3.87e-02, grad_scale: 32.0 2023-09-28 14:09:17,850 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:09:21,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-28 14:09:26,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:09:30,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:09:30,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:09:31,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 14:09:33,437 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-28 14:09:37,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:09:38,592 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:09:40,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:09:40,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:09:40,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-28 14:09:41,674 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-28 14:09:43,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:09:43,503 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=41813.333333333336, ans=0.0 2023-09-28 14:09:44,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-28 14:09:46,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:09:50,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:09:50,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:09:51,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:09:51,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-28 14:09:53,271 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 14:09:55,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:09:57,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:10:02,181 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:10:02,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:10:03,936 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=41880.0, ans=0.0017652173913043478 2023-09-28 14:10:06,536 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-28 14:10:09,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 14:10:09,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 14:10:09,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:10:09,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:10:09,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:10:13,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-28 14:10:13,707 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=41946.666666666664, ans=0.125 2023-09-28 14:10:13,780 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=41946.666666666664, ans=0.125 2023-09-28 14:10:16,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:10:18,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:10:19,155 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=7.81 vs. limit=15.0 2023-09-28 14:10:19,604 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:10:19,631 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-28 14:10:19,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:10:19,657 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:10:21,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-28 14:10:26,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 14:10:29,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:10:31,761 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=42013.333333333336, ans=0.1 2023-09-28 14:10:35,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:10:37,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-28 14:10:37,077 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-28 14:10:39,876 INFO [train.py:1039] (2/4) Epoch 2, batch 1000, loss[loss=0.3225, simple_loss=0.3734, pruned_loss=0.1358, over 24529.00 frames. ], tot_loss[loss=0.3281, simple_loss=0.3636, pruned_loss=0.1463, over 4680420.11 frames. ], batch size: 71, lr: 3.86e-02, grad_scale: 16.0 2023-09-28 14:10:41,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:10:42,878 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.243e+02 2.891e+02 3.339e+02 3.802e+02 9.955e+02, threshold=6.678e+02, percent-clipped=4.0 2023-09-28 14:10:44,562 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-28 14:10:44,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:10:51,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:10:52,600 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-28 14:10:52,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-28 14:10:57,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:10:57,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:10:57,716 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=42146.666666666664, ans=0.0 2023-09-28 14:10:59,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:11:02,152 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=42146.666666666664, ans=0.125 2023-09-28 14:11:03,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-28 14:11:06,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-28 14:11:09,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-28 14:11:09,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:11:11,410 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-28 14:11:13,068 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-28 14:11:13,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-28 14:11:14,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:11:16,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:11:21,787 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.29 vs. limit=15.0 2023-09-28 14:11:24,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:11:24,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:11:26,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:11:27,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:11:27,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-28 14:11:27,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:11:29,824 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:11:29,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:11:31,388 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-28 14:11:34,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-28 14:11:36,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-28 14:11:38,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-28 14:11:39,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:11:46,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:11:46,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:11:46,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:11:48,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:11:49,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-28 14:11:51,265 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:11:51,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-28 14:11:51,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-28 14:11:52,964 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:11:52,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:11:54,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:11:58,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 14:11:59,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:12:03,236 INFO [train.py:1039] (2/4) Epoch 2, batch 1050, loss[loss=0.3421, simple_loss=0.3657, pruned_loss=0.1593, over 23803.00 frames. ], tot_loss[loss=0.3251, simple_loss=0.3611, pruned_loss=0.1445, over 4686204.89 frames. ], batch size: 195, lr: 3.85e-02, grad_scale: 16.0 2023-09-28 14:12:04,156 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.69 vs. limit=6.0 2023-09-28 14:12:04,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:12:06,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:12:06,861 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=42413.333333333336, ans=0.1 2023-09-28 14:12:08,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 14:12:09,700 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:12:11,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:12:13,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:12:15,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-28 14:12:16,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:12:18,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:12:18,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-28 14:12:20,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:12:20,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-28 14:12:21,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:12:21,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-28 14:12:24,940 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:12:24,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-28 14:12:24,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-28 14:12:31,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:12:33,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-28 14:12:35,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:12:38,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-28 14:12:38,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-28 14:12:39,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:12:42,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-28 14:12:44,062 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=42546.666666666664, ans=0.125 2023-09-28 14:12:45,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-28 14:12:45,643 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=42546.666666666664, ans=0.0 2023-09-28 14:12:46,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:12:49,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 14:12:53,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-28 14:12:53,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:12:55,063 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:12:58,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:13:03,042 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-28 14:13:03,441 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=42613.333333333336, ans=0.125 2023-09-28 14:13:04,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-28 14:13:04,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-28 14:13:06,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:13:06,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:13:08,060 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-28 14:13:08,387 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:13:12,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:13:13,949 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=42680.0, ans=0.0 2023-09-28 14:13:16,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:13:16,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:13:17,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:13:17,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:13:20,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:13:20,771 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-28 14:13:22,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:13:22,488 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-28 14:13:22,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-28 14:13:24,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:13:27,015 INFO [train.py:1039] (2/4) Epoch 2, batch 1100, loss[loss=0.2812, simple_loss=0.3265, pruned_loss=0.1179, over 24263.00 frames. ], tot_loss[loss=0.324, simple_loss=0.3595, pruned_loss=0.1443, over 4682898.22 frames. ], batch size: 56, lr: 3.85e-02, grad_scale: 16.0 2023-09-28 14:13:28,136 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.97 vs. limit=10.0 2023-09-28 14:13:29,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:13:30,547 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.933e+02 2.833e+02 3.199e+02 3.709e+02 7.263e+02, threshold=6.397e+02, percent-clipped=1.0 2023-09-28 14:13:33,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:13:39,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 14:13:41,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:13:41,346 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:13:41,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-28 14:13:42,523 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.10 vs. limit=15.0 2023-09-28 14:13:42,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:13:47,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-28 14:13:50,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:13:51,125 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=42813.333333333336, ans=0.125 2023-09-28 14:13:53,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:13:53,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-28 14:13:55,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 14:13:56,701 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:13:56,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:13:57,005 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=42813.333333333336, ans=0.125 2023-09-28 14:13:59,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:14:00,016 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-28 14:14:03,240 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:14:06,491 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:14:08,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-28 14:14:09,814 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-28 14:14:09,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:14:13,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:14:14,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-28 14:14:14,670 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:14:16,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-28 14:14:17,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:14:17,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:14:17,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:14:19,073 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:14:19,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-28 14:14:27,167 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:14:27,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-28 14:14:27,530 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=42946.666666666664, ans=0.1 2023-09-28 14:14:28,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 14:14:34,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 14:14:38,502 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-28 14:14:38,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-28 14:14:38,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:14:42,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:14:42,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:14:44,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-28 14:14:44,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:14:45,561 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:14:47,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-28 14:14:47,077 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:14:47,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-28 14:14:48,585 INFO [train.py:1039] (2/4) Epoch 2, batch 1150, loss[loss=0.3173, simple_loss=0.3598, pruned_loss=0.1374, over 23495.00 frames. ], tot_loss[loss=0.3245, simple_loss=0.3606, pruned_loss=0.1442, over 4701973.22 frames. ], batch size: 93, lr: 3.84e-02, grad_scale: 16.0 2023-09-28 14:14:48,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:14:48,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:14:50,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-28 14:14:55,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:14:59,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:15:00,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:15:01,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:15:01,414 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-28 14:15:02,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:15:03,225 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=43080.0, ans=0.125 2023-09-28 14:15:04,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-28 14:15:05,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:15:05,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 14:15:10,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-28 14:15:12,259 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:15:17,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:15:18,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:15:18,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-28 14:15:18,723 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-28 14:15:18,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:15:21,025 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.81 vs. limit=6.0 2023-09-28 14:15:21,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-28 14:15:23,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:15:25,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:15:35,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:15:42,151 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:15:43,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-28 14:15:44,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:15:45,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:15:53,303 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-28 14:15:54,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:16:01,814 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-28 14:16:06,973 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:16:08,487 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:16:08,543 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-28 14:16:09,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:16:11,795 INFO [train.py:1039] (2/4) Epoch 2, batch 1200, loss[loss=0.3438, simple_loss=0.3603, pruned_loss=0.1637, over 23817.00 frames. ], tot_loss[loss=0.3253, simple_loss=0.3612, pruned_loss=0.1447, over 4702295.26 frames. ], batch size: 212, lr: 3.83e-02, grad_scale: 32.0 2023-09-28 14:16:13,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:16:14,487 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.70 vs. limit=12.0 2023-09-28 14:16:15,000 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.963e+02 2.991e+02 3.527e+02 4.351e+02 6.174e+02, threshold=7.053e+02, percent-clipped=0.0 2023-09-28 14:16:18,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:16:18,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-28 14:16:19,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:16:19,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:16:21,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:16:21,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:16:25,243 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 14:16:25,675 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=43413.333333333336, ans=0.2 2023-09-28 14:16:26,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:16:26,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:16:28,530 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-28 14:16:32,860 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-28 14:16:36,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:16:38,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:16:40,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:16:44,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:16:44,851 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-28 14:16:44,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:16:54,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-28 14:16:54,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:16:54,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-28 14:16:56,202 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:16:59,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-28 14:17:04,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-28 14:17:04,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:17:06,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:17:07,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:17:07,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-28 14:17:08,398 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=13.22 vs. limit=15.0 2023-09-28 14:17:09,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:17:09,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:17:12,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:17:12,273 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-28 14:17:14,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 14:17:14,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:17:14,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 14:17:17,923 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:17:17,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:17:21,324 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-28 14:17:24,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:17:27,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-28 14:17:31,162 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-28 14:17:34,022 INFO [train.py:1039] (2/4) Epoch 2, batch 1250, loss[loss=0.3268, simple_loss=0.3562, pruned_loss=0.1487, over 23784.00 frames. ], tot_loss[loss=0.3253, simple_loss=0.3615, pruned_loss=0.1445, over 4720232.78 frames. ], batch size: 164, lr: 3.83e-02, grad_scale: 32.0 2023-09-28 14:17:34,119 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:17:35,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:17:37,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:17:39,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:17:42,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-28 14:17:47,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:17:47,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:17:47,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-28 14:17:49,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:17:51,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 14:17:54,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 14:17:55,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:17:56,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 14:17:56,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:17:58,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-28 14:18:02,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 14:18:02,736 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-28 14:18:02,744 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:18:04,367 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:18:05,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:18:09,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:18:10,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-28 14:18:18,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-28 14:18:19,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:18:21,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:18:21,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-28 14:18:23,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:18:23,318 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-28 14:18:23,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:18:23,352 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:18:29,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:18:33,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:18:33,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:18:36,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-28 14:18:36,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-28 14:18:36,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-28 14:18:39,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:18:41,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-28 14:18:42,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:18:44,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-28 14:18:44,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:18:46,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-28 14:18:46,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-28 14:18:46,115 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:18:46,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-28 14:18:47,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:18:50,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-28 14:18:53,566 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:18:55,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:18:56,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 14:18:58,472 INFO [train.py:1039] (2/4) Epoch 2, batch 1300, loss[loss=0.309, simple_loss=0.3554, pruned_loss=0.1313, over 24467.00 frames. ], tot_loss[loss=0.3256, simple_loss=0.3621, pruned_loss=0.1446, over 4716555.67 frames. ], batch size: 66, lr: 3.82e-02, grad_scale: 32.0 2023-09-28 14:18:58,722 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-28 14:19:01,542 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.024e+02 2.943e+02 3.508e+02 4.700e+02 1.321e+03, threshold=7.016e+02, percent-clipped=7.0 2023-09-28 14:19:01,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:19:01,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-28 14:19:07,129 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:19:08,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-28 14:19:08,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:19:10,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:19:11,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:19:12,202 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=44080.0, ans=0.2 2023-09-28 14:19:13,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-28 14:19:18,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:19:18,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-28 14:19:18,364 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=44146.666666666664, ans=0.125 2023-09-28 14:19:21,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-28 14:19:25,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 14:19:26,018 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=44146.666666666664, ans=0.2 2023-09-28 14:19:30,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:19:32,435 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:19:32,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:19:34,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:19:35,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 14:19:35,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-28 14:19:35,958 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=44213.333333333336, ans=0.0 2023-09-28 14:19:37,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-28 14:19:39,446 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=44213.333333333336, ans=0.125 2023-09-28 14:19:42,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:19:43,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 14:19:44,003 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-28 14:19:45,447 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 14:19:47,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:19:47,407 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=44280.0, ans=0.1 2023-09-28 14:19:49,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:19:50,273 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=44280.0, ans=0.2 2023-09-28 14:19:51,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-28 14:19:51,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:19:51,461 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-28 14:19:54,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:19:58,861 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:19:58,865 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:20:03,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-28 14:20:04,728 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-28 14:20:04,871 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-28 14:20:09,926 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:20:13,405 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-28 14:20:15,069 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:20:20,930 INFO [train.py:1039] (2/4) Epoch 2, batch 1350, loss[loss=0.3199, simple_loss=0.3381, pruned_loss=0.1509, over 23754.00 frames. ], tot_loss[loss=0.3242, simple_loss=0.3609, pruned_loss=0.1438, over 4716732.84 frames. ], batch size: 212, lr: 3.82e-02, grad_scale: 32.0 2023-09-28 14:20:21,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-28 14:20:25,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:20:27,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:20:31,131 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:20:31,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:20:32,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:20:33,095 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=44413.333333333336, ans=0.95 2023-09-28 14:20:34,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:20:38,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:20:41,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-28 14:20:43,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-28 14:20:43,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:20:45,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-28 14:20:48,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:20:48,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:20:48,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-28 14:20:51,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-28 14:20:53,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-28 14:20:54,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:20:54,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-28 14:21:06,816 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=44546.666666666664, ans=0.125 2023-09-28 14:21:07,201 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=23.81 vs. limit=15.0 2023-09-28 14:21:07,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:21:14,393 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=44613.333333333336, ans=0.125 2023-09-28 14:21:16,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:21:16,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:21:17,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-28 14:21:22,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:21:24,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-28 14:21:24,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-28 14:21:24,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:21:27,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:21:30,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-28 14:21:31,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:21:38,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-28 14:21:39,072 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.74 vs. limit=6.0 2023-09-28 14:21:40,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-28 14:21:43,238 INFO [train.py:1039] (2/4) Epoch 2, batch 1400, loss[loss=0.3253, simple_loss=0.368, pruned_loss=0.1413, over 24000.00 frames. ], tot_loss[loss=0.3237, simple_loss=0.3597, pruned_loss=0.1438, over 4702580.42 frames. ], batch size: 80, lr: 3.81e-02, grad_scale: 32.0 2023-09-28 14:21:45,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-28 14:21:46,398 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.080e+02 2.861e+02 3.179e+02 3.709e+02 7.568e+02, threshold=6.358e+02, percent-clipped=1.0 2023-09-28 14:21:46,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:21:51,783 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:21:51,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:21:58,048 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-28 14:21:58,245 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-28 14:22:07,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:22:11,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:22:13,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:22:13,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-28 14:22:16,840 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=44880.0, ans=0.125 2023-09-28 14:22:17,141 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=10.98 vs. limit=15.0 2023-09-28 14:22:18,171 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:22:18,315 WARNING [train.py:1197] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 14:22:29,909 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:22:29,992 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:22:34,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-28 14:22:36,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:22:36,499 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=44946.666666666664, ans=0.125 2023-09-28 14:22:37,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:22:37,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:22:39,232 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:22:40,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:22:40,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:22:40,864 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:22:43,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-28 14:22:43,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:22:46,209 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=44946.666666666664, ans=0.0 2023-09-28 14:22:47,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:22:49,880 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=45013.333333333336, ans=0.0 2023-09-28 14:22:51,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:23:00,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-28 14:23:02,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 14:23:02,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:23:04,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 14:23:05,542 INFO [train.py:1039] (2/4) Epoch 2, batch 1450, loss[loss=0.2921, simple_loss=0.3464, pruned_loss=0.1189, over 24628.00 frames. ], tot_loss[loss=0.3215, simple_loss=0.3586, pruned_loss=0.1423, over 4712616.49 frames. ], batch size: 68, lr: 3.80e-02, grad_scale: 32.0 2023-09-28 14:23:05,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:23:05,874 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:23:10,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-28 14:23:10,437 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:23:10,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:23:10,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-28 14:23:12,208 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=45080.0, ans=0.2 2023-09-28 14:23:16,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:23:18,776 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 14:23:20,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:23:20,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-28 14:23:21,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 14:23:24,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-28 14:23:24,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:23:25,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:23:25,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-28 14:23:27,199 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:23:28,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-28 14:23:28,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 14:23:28,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:23:30,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:23:31,965 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:23:32,920 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=45146.666666666664, ans=0.0010550724637681166 2023-09-28 14:23:35,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:23:37,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:23:37,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:23:41,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:23:41,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:23:44,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:23:44,251 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:23:44,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:23:45,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:23:48,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-28 14:23:52,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:23:55,780 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-28 14:23:57,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:23:59,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-28 14:24:00,920 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:24:02,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-28 14:24:04,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:24:06,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-28 14:24:07,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-28 14:24:09,552 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:24:13,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:24:13,313 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:24:14,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-28 14:24:17,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-28 14:24:17,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-28 14:24:19,400 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:24:20,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 14:24:22,966 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=45346.666666666664, ans=0.125 2023-09-28 14:24:30,452 INFO [train.py:1039] (2/4) Epoch 2, batch 1500, loss[loss=0.3011, simple_loss=0.3632, pruned_loss=0.1194, over 24646.00 frames. ], tot_loss[loss=0.3226, simple_loss=0.359, pruned_loss=0.143, over 4701889.01 frames. ], batch size: 73, lr: 3.80e-02, grad_scale: 32.0 2023-09-28 14:24:30,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-28 14:24:30,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:24:30,648 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:24:33,280 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.960e+02 2.694e+02 3.191e+02 3.911e+02 7.189e+02, threshold=6.382e+02, percent-clipped=1.0 2023-09-28 14:24:33,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:24:33,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:24:35,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:24:37,033 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-28 14:24:38,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 14:24:40,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-28 14:24:40,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:24:40,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:24:41,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:24:41,953 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=45413.333333333336, ans=0.125 2023-09-28 14:24:43,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:24:49,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:24:49,923 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-28 14:24:50,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-28 14:24:51,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:24:51,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:24:56,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-28 14:24:56,291 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=45480.0, ans=0.125 2023-09-28 14:24:58,182 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.75 vs. limit=22.5 2023-09-28 14:25:02,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-28 14:25:02,871 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=45546.666666666664, ans=0.0 2023-09-28 14:25:04,157 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:25:06,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-28 14:25:06,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=45546.666666666664, ans=0.125 2023-09-28 14:25:08,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-28 14:25:11,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:25:12,435 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.74 vs. limit=6.0 2023-09-28 14:25:12,979 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:25:13,010 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:25:15,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-28 14:25:16,040 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:25:16,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:25:17,555 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-28 14:25:17,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:25:22,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:25:22,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-28 14:25:30,417 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 14:25:31,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 14:25:35,126 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-28 14:25:35,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:25:35,225 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-28 14:25:38,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:25:40,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:25:40,824 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=45680.0, ans=0.0 2023-09-28 14:25:41,982 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-28 14:25:42,107 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-28 14:25:45,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-28 14:25:46,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:25:49,257 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=45680.0, ans=0.0 2023-09-28 14:25:50,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:25:51,697 INFO [train.py:1039] (2/4) Epoch 2, batch 1550, loss[loss=0.2792, simple_loss=0.3336, pruned_loss=0.1125, over 24340.00 frames. ], tot_loss[loss=0.3222, simple_loss=0.359, pruned_loss=0.1427, over 4703985.96 frames. ], batch size: 61, lr: 3.79e-02, grad_scale: 32.0 2023-09-28 14:25:51,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:25:51,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:25:52,050 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=45746.666666666664, ans=0.125 2023-09-28 14:25:53,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:25:53,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:25:54,934 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-28 14:25:56,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-28 14:25:56,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:25:57,988 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-28 14:25:59,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-28 14:26:01,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:26:02,662 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:26:04,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:26:04,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:26:06,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:26:06,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:26:09,478 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-28 14:26:09,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:26:09,804 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=45813.333333333336, ans=0.07 2023-09-28 14:26:10,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:26:10,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 14:26:11,262 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=45813.333333333336, ans=0.0 2023-09-28 14:26:14,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-28 14:26:14,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-28 14:26:16,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:26:18,220 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-28 14:26:18,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-28 14:26:18,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-28 14:26:18,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:26:19,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:26:24,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:26:25,250 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=45880.0, ans=10.0 2023-09-28 14:26:26,593 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=45880.0, ans=0.0008956521739130439 2023-09-28 14:26:27,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-28 14:26:27,857 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-28 14:26:28,221 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=45880.0, ans=0.125 2023-09-28 14:26:35,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:26:38,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:26:38,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-28 14:26:38,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:26:38,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-28 14:26:45,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 14:26:45,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:26:49,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:26:51,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:26:52,271 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=45946.666666666664, ans=0.2 2023-09-28 14:26:53,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:26:53,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-28 14:26:53,686 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=45946.666666666664, ans=0.125 2023-09-28 14:26:54,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 14:26:55,209 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=45946.666666666664, ans=0.0008811594202898562 2023-09-28 14:26:56,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:26:57,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:26:58,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-28 14:26:58,518 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-28 14:27:00,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:27:01,088 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=13.02 vs. limit=15.0 2023-09-28 14:27:03,398 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.min_positive, batch_count=46013.333333333336, ans=0.025 2023-09-28 14:27:06,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-28 14:27:12,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:27:14,216 INFO [train.py:1039] (2/4) Epoch 2, batch 1600, loss[loss=0.2483, simple_loss=0.3075, pruned_loss=0.09458, over 24302.00 frames. ], tot_loss[loss=0.3227, simple_loss=0.3596, pruned_loss=0.1429, over 4700271.01 frames. ], batch size: 56, lr: 3.78e-02, grad_scale: 32.0 2023-09-28 14:27:14,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=46080.0, ans=0.125 2023-09-28 14:27:15,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:27:15,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-28 14:27:17,235 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.915e+02 2.937e+02 3.574e+02 4.472e+02 6.126e+02, threshold=7.147e+02, percent-clipped=0.0 2023-09-28 14:27:17,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 14:27:18,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:27:18,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 14:27:18,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:27:19,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:27:22,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:27:24,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-28 14:27:26,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-28 14:27:27,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-28 14:27:28,575 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.53 vs. limit=15.0 2023-09-28 14:27:29,555 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:27:29,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-28 14:27:31,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:27:35,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:27:38,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:27:39,892 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=46146.666666666664, ans=0.125 2023-09-28 14:27:42,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-28 14:27:45,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:27:45,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-28 14:27:45,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer_ff3.min_abs, batch_count=46213.333333333336, ans=0.2 2023-09-28 14:27:46,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:27:49,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-28 14:27:54,108 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=46213.333333333336, ans=0.0 2023-09-28 14:27:55,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-28 14:28:02,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:28:05,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-28 14:28:07,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:28:07,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:28:07,205 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:28:10,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-28 14:28:15,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 14:28:17,016 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:28:18,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:28:18,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:28:18,651 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:28:20,218 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:28:20,609 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=46346.666666666664, ans=0.025 2023-09-28 14:28:21,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:28:23,289 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 14:28:30,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:28:32,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:28:33,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-28 14:28:33,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:28:35,215 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-28 14:28:37,320 INFO [train.py:1039] (2/4) Epoch 2, batch 1650, loss[loss=0.3249, simple_loss=0.3414, pruned_loss=0.1542, over 23453.00 frames. ], tot_loss[loss=0.3222, simple_loss=0.3595, pruned_loss=0.1424, over 4712370.91 frames. ], batch size: 285, lr: 3.78e-02, grad_scale: 32.0 2023-09-28 14:28:40,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:28:42,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:28:43,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:28:43,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-28 14:28:43,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-28 14:28:43,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-28 14:28:45,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-28 14:28:46,494 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=46413.333333333336, ans=0.0007797101449275347 2023-09-28 14:28:47,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:28:47,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:28:47,991 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=46413.333333333336, ans=0.125 2023-09-28 14:28:49,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:28:49,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-28 14:28:51,321 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=46413.333333333336, ans=0.125 2023-09-28 14:28:52,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:28:54,029 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-28 14:28:57,153 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:28:57,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:28:57,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:28:57,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:28:57,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-28 14:28:57,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-28 14:29:04,819 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 14:29:05,592 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=22.27 vs. limit=22.5 2023-09-28 14:29:08,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-28 14:29:16,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-28 14:29:17,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:29:21,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-28 14:29:24,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:29:26,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:29:26,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:29:27,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:29:29,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:29:29,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:29:32,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:29:32,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:29:32,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:29:33,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:29:33,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:29:36,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 14:29:38,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:29:38,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-28 14:29:41,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:29:41,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-28 14:29:43,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-28 14:29:43,280 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-28 14:29:43,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:29:44,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:29:44,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:29:46,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:29:46,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-28 14:29:51,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:29:53,102 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=46680.0, ans=0.125 2023-09-28 14:29:54,322 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:29:54,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:29:56,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-28 14:30:00,459 INFO [train.py:1039] (2/4) Epoch 2, batch 1700, loss[loss=0.2799, simple_loss=0.3309, pruned_loss=0.1144, over 24478.00 frames. ], tot_loss[loss=0.3212, simple_loss=0.3585, pruned_loss=0.1419, over 4704693.69 frames. ], batch size: 66, lr: 3.77e-02, grad_scale: 16.0 2023-09-28 14:30:00,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:30:00,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:30:00,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-28 14:30:02,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:30:02,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:30:02,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:30:05,053 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.163e+02 2.907e+02 3.272e+02 3.830e+02 6.451e+02, threshold=6.545e+02, percent-clipped=0.0 2023-09-28 14:30:05,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:30:05,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:30:05,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-28 14:30:10,254 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 14:30:10,703 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=46746.666666666664, ans=0.125 2023-09-28 14:30:18,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:30:21,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:30:28,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-28 14:30:28,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:30:28,699 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:30:30,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:30:31,855 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-28 14:30:35,084 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:30:35,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:30:36,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-28 14:30:38,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-28 14:30:40,720 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.15 vs. limit=15.0 2023-09-28 14:30:41,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-28 14:30:41,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-28 14:30:43,163 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:30:43,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-28 14:30:45,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:30:47,508 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=46880.0, ans=0.125 2023-09-28 14:30:54,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:30:54,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:30:54,547 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=46946.666666666664, ans=0.125 2023-09-28 14:30:56,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:30:57,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-28 14:30:57,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-28 14:30:57,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:31:00,850 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:31:00,852 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-28 14:31:00,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:31:00,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:31:00,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:31:02,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:31:05,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:31:05,525 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:31:05,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:31:07,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:31:07,323 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:31:10,440 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:31:11,940 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-28 14:31:12,197 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=47013.333333333336, ans=0.0006492753623188394 2023-09-28 14:31:15,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:31:17,258 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:31:18,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-28 14:31:24,032 INFO [train.py:1039] (2/4) Epoch 2, batch 1750, loss[loss=0.3199, simple_loss=0.3448, pruned_loss=0.1475, over 23691.00 frames. ], tot_loss[loss=0.3204, simple_loss=0.358, pruned_loss=0.1414, over 4705813.83 frames. ], batch size: 232, lr: 3.76e-02, grad_scale: 16.0 2023-09-28 14:31:27,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:31:27,289 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=47080.0, ans=0.1 2023-09-28 14:31:30,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:31:30,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-28 14:31:32,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-28 14:31:32,326 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:31:32,520 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=47080.0, ans=0.0 2023-09-28 14:31:35,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:31:35,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:31:40,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-28 14:31:43,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:31:44,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-28 14:31:46,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:31:47,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:31:48,030 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=47146.666666666664, ans=0.125 2023-09-28 14:31:49,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 14:31:50,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-28 14:31:53,169 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:31:53,217 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-28 14:32:01,706 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:32:05,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:32:05,242 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:32:08,345 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:32:08,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:32:08,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=47213.333333333336, ans=0.0006057971014492743 2023-09-28 14:32:11,334 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:32:12,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:32:14,502 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:32:16,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:32:16,313 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=47280.0, ans=0.125 2023-09-28 14:32:17,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-28 14:32:19,811 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=15.19 vs. limit=15.0 2023-09-28 14:32:20,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:32:22,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-28 14:32:24,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:32:27,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:32:27,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:32:33,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 14:32:33,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-28 14:32:33,337 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=47346.666666666664, ans=0.125 2023-09-28 14:32:34,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:32:36,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:32:42,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:32:44,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:32:44,320 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:32:44,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-28 14:32:44,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:32:45,824 INFO [train.py:1039] (2/4) Epoch 2, batch 1800, loss[loss=0.3354, simple_loss=0.3706, pruned_loss=0.1501, over 23432.00 frames. ], tot_loss[loss=0.3182, simple_loss=0.3561, pruned_loss=0.1402, over 4702473.08 frames. ], batch size: 93, lr: 3.76e-02, grad_scale: 16.0 2023-09-28 14:32:47,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-28 14:32:47,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:32:47,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-28 14:32:47,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:32:49,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:32:50,428 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.289e+02 2.760e+02 3.253e+02 3.974e+02 7.457e+02, threshold=6.506e+02, percent-clipped=1.0 2023-09-28 14:32:52,089 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:32:52,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:32:53,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 14:32:56,028 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.84 vs. limit=15.0 2023-09-28 14:32:56,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:33:00,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 14:33:02,512 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:33:05,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:33:09,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:33:09,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:33:11,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:33:12,829 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:33:12,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-28 14:33:16,176 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:33:19,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:33:22,802 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-28 14:33:23,708 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.40 vs. limit=5.0 2023-09-28 14:33:25,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-28 14:33:25,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-28 14:33:25,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:33:27,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:33:27,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:33:27,553 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:33:35,887 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-28 14:33:36,260 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=47613.333333333336, ans=0.0 2023-09-28 14:33:37,415 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:33:39,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:33:41,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-28 14:33:41,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-28 14:33:43,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:33:44,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:33:46,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 14:33:50,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-28 14:33:56,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:33:58,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-28 14:33:59,713 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:33:59,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:33:59,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:34:00,132 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=47680.0, ans=0.125 2023-09-28 14:34:01,303 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-28 14:34:03,146 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=47680.0, ans=0.2 2023-09-28 14:34:04,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-28 14:34:04,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:34:07,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-28 14:34:07,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:34:09,036 INFO [train.py:1039] (2/4) Epoch 2, batch 1850, loss[loss=0.3034, simple_loss=0.3545, pruned_loss=0.1262, over 24691.00 frames. ], tot_loss[loss=0.3187, simple_loss=0.3566, pruned_loss=0.1404, over 4710119.96 frames. ], batch size: 65, lr: 3.75e-02, grad_scale: 16.0 2023-09-28 14:34:10,619 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:34:10,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-28 14:34:10,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:34:12,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:34:12,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:34:16,093 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:34:16,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:34:19,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:34:21,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:34:29,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:34:29,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-28 14:34:32,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-28 14:34:33,774 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.74 vs. limit=6.0 2023-09-28 14:34:34,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-28 14:34:37,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:34:37,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-28 14:34:37,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 14:34:47,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:34:49,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-28 14:34:53,100 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=47880.0, ans=0.1 2023-09-28 14:34:54,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:34:54,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:34:56,044 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=8.39 vs. limit=15.0 2023-09-28 14:34:58,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-28 14:34:59,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:34:59,500 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 14:34:59,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:35:02,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:35:05,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:35:07,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-28 14:35:08,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:35:08,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 14:35:08,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:35:10,496 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:35:13,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:35:16,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-28 14:35:18,228 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:35:22,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:35:22,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:35:22,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-28 14:35:22,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-28 14:35:25,326 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-28 14:35:25,461 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-28 14:35:29,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 14:35:29,925 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:35:29,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:35:29,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:35:31,253 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-28 14:35:31,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:35:31,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:35:32,687 INFO [train.py:1039] (2/4) Epoch 2, batch 1900, loss[loss=0.3164, simple_loss=0.3743, pruned_loss=0.1293, over 24291.00 frames. ], tot_loss[loss=0.3201, simple_loss=0.3579, pruned_loss=0.1411, over 4699657.85 frames. ], batch size: 74, lr: 3.75e-02, grad_scale: 16.0 2023-09-28 14:35:32,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-28 14:35:32,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 14:35:34,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:35:34,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-28 14:35:37,582 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.980e+02 2.945e+02 3.315e+02 3.866e+02 6.379e+02, threshold=6.630e+02, percent-clipped=0.0 2023-09-28 14:35:37,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:35:37,789 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-28 14:35:37,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:35:39,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:35:45,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:35:45,858 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=48080.0, ans=0.1 2023-09-28 14:35:47,364 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=48146.666666666664, ans=0.2 2023-09-28 14:35:48,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:35:50,171 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-28 14:35:50,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-28 14:35:51,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:35:53,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:35:53,264 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-28 14:35:53,305 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-28 14:35:57,427 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=48146.666666666664, ans=0.05 2023-09-28 14:35:58,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-28 14:36:00,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:36:04,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-28 14:36:04,894 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=48213.333333333336, ans=0.125 2023-09-28 14:36:06,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-28 14:36:17,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-28 14:36:20,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-28 14:36:20,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:36:20,764 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-28 14:36:20,770 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-28 14:36:22,239 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-28 14:36:22,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-28 14:36:22,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:36:27,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-28 14:36:30,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:36:30,970 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=48280.0, ans=0.00037391304347826095 2023-09-28 14:36:32,540 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:36:35,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:36:35,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-28 14:36:37,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:36:42,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-28 14:36:42,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:36:47,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:36:47,929 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:36:47,950 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:36:49,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:36:50,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 14:36:51,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-28 14:36:52,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:36:55,505 INFO [train.py:1039] (2/4) Epoch 2, batch 1950, loss[loss=0.3361, simple_loss=0.3657, pruned_loss=0.1532, over 23397.00 frames. ], tot_loss[loss=0.3215, simple_loss=0.3592, pruned_loss=0.1419, over 4692209.68 frames. ], batch size: 285, lr: 3.74e-02, grad_scale: 16.0 2023-09-28 14:36:55,572 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:36:55,574 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-28 14:36:57,336 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:36:57,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:36:58,741 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:36:58,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:37:03,372 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 14:37:05,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:37:07,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:37:07,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 14:37:09,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-28 14:37:09,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 14:37:10,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:37:12,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:37:14,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:37:14,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:37:16,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:37:17,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:37:21,477 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 14:37:21,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 14:37:21,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:37:21,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:37:25,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:37:29,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-28 14:37:29,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:37:30,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-28 14:37:30,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-28 14:37:31,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 14:37:31,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:37:31,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:37:31,376 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=48546.666666666664, ans=0.07 2023-09-28 14:37:34,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:37:37,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:37:43,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 14:37:44,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:37:46,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-28 14:37:46,294 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-28 14:37:46,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:37:52,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:37:53,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:37:55,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-28 14:37:58,477 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=48613.333333333336, ans=0.1 2023-09-28 14:38:03,086 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:38:04,651 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:38:07,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:38:09,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:38:11,540 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.77 vs. limit=15.0 2023-09-28 14:38:12,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:38:13,561 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:38:15,647 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-28 14:38:15,655 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 14:38:15,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:38:17,777 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-28 14:38:18,005 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=48746.666666666664, ans=0.2 2023-09-28 14:38:19,177 INFO [train.py:1039] (2/4) Epoch 2, batch 2000, loss[loss=0.3199, simple_loss=0.3696, pruned_loss=0.1351, over 24338.00 frames. ], tot_loss[loss=0.321, simple_loss=0.3594, pruned_loss=0.1412, over 4702405.71 frames. ], batch size: 77, lr: 3.73e-02, grad_scale: 32.0 2023-09-28 14:38:19,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:38:22,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-28 14:38:23,953 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.836e+02 2.784e+02 3.169e+02 3.809e+02 6.996e+02, threshold=6.339e+02, percent-clipped=1.0 2023-09-28 14:38:24,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:38:24,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:38:26,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:38:29,297 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:38:32,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-28 14:38:33,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-28 14:38:35,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:38:37,180 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-28 14:38:38,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 14:38:38,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:38:40,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:38:43,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-28 14:38:44,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:38:48,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:38:48,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:38:50,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-28 14:38:50,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 14:38:53,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-28 14:38:53,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:38:57,099 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:38:58,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-28 14:38:58,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:38:58,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:39:00,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:39:02,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-28 14:39:04,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-28 14:39:06,065 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:39:06,078 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:39:10,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:39:12,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:39:12,203 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 14:39:12,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:39:13,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:39:15,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:39:15,523 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 14:39:15,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:39:17,072 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:39:20,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:39:20,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-28 14:39:25,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 14:39:27,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:39:31,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:39:31,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:39:36,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:39:37,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:39:37,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:39:39,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 14:39:40,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 14:39:42,369 INFO [train.py:1039] (2/4) Epoch 2, batch 2050, loss[loss=0.3386, simple_loss=0.3291, pruned_loss=0.1741, over 19280.00 frames. ], tot_loss[loss=0.3212, simple_loss=0.359, pruned_loss=0.1417, over 4684327.83 frames. ], batch size: 389, lr: 3.73e-02, grad_scale: 32.0 2023-09-28 14:39:42,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:39:44,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:39:47,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:39:48,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:39:52,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:39:52,291 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=49080.0, ans=0.0 2023-09-28 14:39:55,158 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:39:56,565 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:39:56,679 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:39:59,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-28 14:39:59,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:39:59,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:39:59,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-28 14:40:12,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:40:12,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:40:14,114 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-28 14:40:15,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:40:17,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-28 14:40:17,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:40:20,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:40:23,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:40:25,053 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-28 14:40:25,133 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:40:26,612 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:40:29,573 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:40:29,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:40:30,345 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.74 vs. limit=15.0 2023-09-28 14:40:33,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:40:33,666 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=49280.0, ans=0.00015652173913043542 2023-09-28 14:40:35,397 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:40:38,854 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-28 14:40:39,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:40:45,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 14:40:50,666 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:40:50,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-28 14:40:56,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:40:57,072 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:40:57,292 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=49346.666666666664, ans=0.1 2023-09-28 14:41:00,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:41:01,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-28 14:41:05,372 INFO [train.py:1039] (2/4) Epoch 2, batch 2100, loss[loss=0.3039, simple_loss=0.3289, pruned_loss=0.1395, over 23596.00 frames. ], tot_loss[loss=0.3187, simple_loss=0.3565, pruned_loss=0.1404, over 4687476.80 frames. ], batch size: 256, lr: 3.72e-02, grad_scale: 32.0 2023-09-28 14:41:06,930 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-28 14:41:06,931 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:41:07,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:41:08,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 14:41:09,921 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.967e+02 2.956e+02 3.430e+02 4.185e+02 6.974e+02, threshold=6.859e+02, percent-clipped=1.0 2023-09-28 14:41:10,088 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:41:10,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-28 14:41:10,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-28 14:41:12,366 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 14:41:14,170 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=49413.333333333336, ans=0.0 2023-09-28 14:41:16,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:41:17,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:41:20,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:41:20,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:41:20,749 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-28 14:41:22,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 14:41:23,708 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-28 14:41:23,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-28 14:41:25,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:41:26,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:41:26,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-28 14:41:26,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 14:41:30,874 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.05 vs. limit=15.0 2023-09-28 14:41:32,978 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-28 14:41:32,980 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 14:41:34,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:41:36,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:41:39,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:41:39,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-28 14:41:39,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:41:39,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 14:41:43,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-28 14:41:44,498 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:41:44,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-28 14:41:44,553 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-28 14:41:44,649 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-28 14:41:46,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-28 14:41:49,068 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:41:52,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 14:41:55,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 14:41:56,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:41:56,663 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:41:56,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-28 14:41:58,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:41:58,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:41:58,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:41:58,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-28 14:41:59,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-28 14:42:01,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-28 14:42:05,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:42:10,302 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:42:10,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-28 14:42:17,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:42:18,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:42:18,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:42:18,946 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:42:20,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-28 14:42:20,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 14:42:22,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:42:22,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-28 14:42:24,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:42:24,836 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:42:26,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-28 14:42:27,806 INFO [train.py:1039] (2/4) Epoch 2, batch 2150, loss[loss=0.3296, simple_loss=0.353, pruned_loss=0.1531, over 22884.00 frames. ], tot_loss[loss=0.3181, simple_loss=0.3569, pruned_loss=0.1397, over 4709365.62 frames. ], batch size: 322, lr: 3.72e-02, grad_scale: 32.0 2023-09-28 14:42:27,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-28 14:42:27,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:42:30,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:42:30,932 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:42:30,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:42:31,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:42:37,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 14:42:38,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:42:38,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:42:40,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:42:40,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:42:40,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:42:45,212 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:42:45,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:42:45,303 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:42:49,188 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=49813.333333333336, ans=0.125 2023-09-28 14:42:50,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:42:50,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-28 14:42:54,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:42:57,063 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-28 14:42:58,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:42:58,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:42:58,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:42:58,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-28 14:42:58,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:43:00,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:43:00,205 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:43:00,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-28 14:43:01,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:43:03,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:43:03,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:43:05,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 14:43:06,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:43:09,656 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:43:09,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-28 14:43:11,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:43:11,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-28 14:43:11,349 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-28 14:43:14,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:43:15,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:43:17,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:43:17,866 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=49946.666666666664, ans=0.125 2023-09-28 14:43:19,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 14:43:19,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:43:21,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:43:21,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-28 14:43:22,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-28 14:43:24,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:43:24,114 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-28 14:43:25,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:43:27,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:43:27,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-28 14:43:27,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:43:27,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-28 14:43:29,685 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-28 14:43:29,686 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-28 14:43:31,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-28 14:43:31,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:43:32,762 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:43:32,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:43:34,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:43:34,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 14:43:37,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:43:37,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:43:45,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:43:46,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-28 14:43:46,879 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=50080.0, ans=0.125 2023-09-28 14:43:47,922 INFO [train.py:1039] (2/4) Epoch 2, batch 2200, loss[loss=0.3278, simple_loss=0.3553, pruned_loss=0.1502, over 23776.00 frames. ], tot_loss[loss=0.3177, simple_loss=0.3564, pruned_loss=0.1395, over 4704405.37 frames. ], batch size: 212, lr: 3.71e-02, grad_scale: 32.0 2023-09-28 14:43:48,376 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:43:51,138 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:43:52,566 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.160e+02 2.946e+02 3.281e+02 3.928e+02 6.005e+02, threshold=6.562e+02, percent-clipped=0.0 2023-09-28 14:43:56,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:43:56,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:43:56,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:43:59,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-28 14:44:02,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:44:02,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:44:03,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-28 14:44:07,087 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=50146.666666666664, ans=0.125 2023-09-28 14:44:08,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-28 14:44:08,930 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=12.11 vs. limit=15.0 2023-09-28 14:44:09,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 14:44:10,711 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.02 vs. limit=6.0 2023-09-28 14:44:16,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-28 14:44:19,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:44:19,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:44:19,330 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:44:22,412 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:44:22,453 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-28 14:44:27,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-28 14:44:28,979 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:44:29,092 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-28 14:44:34,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-28 14:44:34,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:44:38,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:44:40,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:44:41,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-28 14:44:43,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:44:44,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-28 14:44:45,345 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=50280.0, ans=0.2 2023-09-28 14:44:46,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:44:46,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-28 14:44:46,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:44:48,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:44:49,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:44:49,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:44:49,682 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:44:51,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-28 14:44:51,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:44:52,981 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 14:44:54,740 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=50346.666666666664, ans=0.125 2023-09-28 14:44:54,792 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=50346.666666666664, ans=0.125 2023-09-28 14:44:56,057 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 14:44:56,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:44:59,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-28 14:45:01,375 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-28 14:45:03,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 14:45:04,538 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-28 14:45:04,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-28 14:45:06,845 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-28 14:45:07,038 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=50346.666666666664, ans=0.035 2023-09-28 14:45:08,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:45:08,483 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-28 14:45:09,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:45:11,958 INFO [train.py:1039] (2/4) Epoch 2, batch 2250, loss[loss=0.3421, simple_loss=0.3889, pruned_loss=0.1477, over 24059.00 frames. ], tot_loss[loss=0.3183, simple_loss=0.3569, pruned_loss=0.1399, over 4707090.03 frames. ], batch size: 80, lr: 3.70e-02, grad_scale: 32.0 2023-09-28 14:45:12,181 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-28 14:45:13,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:45:15,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-28 14:45:22,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:45:23,051 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:45:26,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:45:27,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:45:29,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-28 14:45:29,583 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=50480.0, ans=0.09899494936611666 2023-09-28 14:45:32,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-28 14:45:32,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:45:32,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:45:32,455 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=50480.0, ans=0.125 2023-09-28 14:45:35,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-28 14:45:35,895 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:45:37,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:45:38,766 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:45:42,864 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=50546.666666666664, ans=0.0 2023-09-28 14:45:44,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:45:45,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 14:45:45,713 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-28 14:45:47,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-28 14:45:49,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:45:50,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:45:57,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:45:57,334 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=50546.666666666664, ans=0.05 2023-09-28 14:45:58,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:45:59,214 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.86 vs. limit=22.5 2023-09-28 14:46:00,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:46:00,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:46:01,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:46:03,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:46:10,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:46:11,718 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-28 14:46:18,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 14:46:18,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-28 14:46:20,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:46:24,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 14:46:26,450 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=50680.0, ans=0.125 2023-09-28 14:46:29,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-28 14:46:29,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-28 14:46:30,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:46:30,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:46:32,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-28 14:46:33,557 INFO [train.py:1039] (2/4) Epoch 2, batch 2300, loss[loss=0.2684, simple_loss=0.3202, pruned_loss=0.1083, over 24423.00 frames. ], tot_loss[loss=0.318, simple_loss=0.3569, pruned_loss=0.1395, over 4702257.31 frames. ], batch size: 58, lr: 3.70e-02, grad_scale: 32.0 2023-09-28 14:46:35,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:46:35,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:46:38,467 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.122e+02 3.000e+02 3.557e+02 4.160e+02 8.082e+02, threshold=7.115e+02, percent-clipped=2.0 2023-09-28 14:46:40,565 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=50746.666666666664, ans=0.125 2023-09-28 14:46:41,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:46:41,734 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:46:41,982 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=50746.666666666664, ans=0.125 2023-09-28 14:46:45,337 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-28 14:46:46,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:46:50,214 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=50813.333333333336, ans=0.125 2023-09-28 14:46:53,486 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:46:53,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-28 14:46:54,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:46:54,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:46:54,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-28 14:46:57,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:46:58,845 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=50813.333333333336, ans=0.125 2023-09-28 14:47:00,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:47:00,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:47:02,116 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=50813.333333333336, ans=0.125 2023-09-28 14:47:03,389 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:47:06,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-28 14:47:10,380 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.71 vs. limit=6.0 2023-09-28 14:47:10,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:47:12,832 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=50880.0, ans=0.0 2023-09-28 14:47:16,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:47:17,005 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:47:17,711 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.87 vs. limit=15.0 2023-09-28 14:47:20,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:47:22,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:47:27,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:47:27,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 14:47:27,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:47:27,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-28 14:47:31,531 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=50946.666666666664, ans=0.2 2023-09-28 14:47:33,112 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 14:47:33,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:47:33,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:47:33,207 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:47:33,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:47:34,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 14:47:34,757 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-28 14:47:36,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-28 14:47:36,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:47:36,280 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:47:37,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-28 14:47:43,427 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.93 vs. limit=10.0 2023-09-28 14:47:44,121 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:47:47,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:47:52,424 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:47:52,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:47:52,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-28 14:47:52,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 14:47:54,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:47:55,605 INFO [train.py:1039] (2/4) Epoch 2, batch 2350, loss[loss=0.3207, simple_loss=0.3632, pruned_loss=0.1391, over 23290.00 frames. ], tot_loss[loss=0.3186, simple_loss=0.3581, pruned_loss=0.1396, over 4708099.94 frames. ], batch size: 93, lr: 3.69e-02, grad_scale: 32.0 2023-09-28 14:47:55,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 14:47:55,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-28 14:48:01,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:48:01,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-28 14:48:08,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-28 14:48:11,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:48:14,402 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:48:14,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:48:14,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:48:15,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:48:15,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-28 14:48:19,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:48:24,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-28 14:48:27,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:48:29,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 14:48:29,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:48:32,506 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:48:32,687 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-28 14:48:34,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:48:36,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:48:36,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:48:36,797 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=51213.333333333336, ans=0.125 2023-09-28 14:48:37,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:48:42,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:48:44,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-28 14:48:44,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:48:47,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:48:47,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:48:49,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-28 14:48:50,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-28 14:48:55,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-28 14:48:55,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-28 14:49:00,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-28 14:49:05,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-28 14:49:05,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:49:05,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-28 14:49:06,043 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=51346.666666666664, ans=0.1 2023-09-28 14:49:07,320 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-28 14:49:07,358 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-28 14:49:08,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-28 14:49:13,005 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=51346.666666666664, ans=0.2 2023-09-28 14:49:14,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:49:18,675 INFO [train.py:1039] (2/4) Epoch 2, batch 2400, loss[loss=0.3117, simple_loss=0.3699, pruned_loss=0.1267, over 24341.00 frames. ], tot_loss[loss=0.3179, simple_loss=0.3568, pruned_loss=0.1395, over 4701340.74 frames. ], batch size: 74, lr: 3.68e-02, grad_scale: 32.0 2023-09-28 14:49:18,843 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:49:23,880 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.085e+02 2.787e+02 3.340e+02 4.281e+02 7.222e+02, threshold=6.680e+02, percent-clipped=0.0 2023-09-28 14:49:24,113 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:49:24,309 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:49:25,792 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-28 14:49:25,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-28 14:49:33,709 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 14:49:33,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:49:36,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-28 14:49:36,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:49:37,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:49:37,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-28 14:49:44,481 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:49:46,566 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-28 14:49:51,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-28 14:49:57,532 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-28 14:49:57,780 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=51546.666666666664, ans=0.125 2023-09-28 14:50:00,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:50:03,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:50:06,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:50:06,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-28 14:50:08,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 14:50:17,340 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:50:18,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:50:20,682 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_positive, batch_count=51613.333333333336, ans=0.05 2023-09-28 14:50:22,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:50:24,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:50:24,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-28 14:50:24,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:50:24,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:50:24,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:50:24,466 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 14:50:27,705 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=51680.0, ans=0.0 2023-09-28 14:50:29,142 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=51680.0, ans=0.125 2023-09-28 14:50:30,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:50:30,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 14:50:32,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-28 14:50:32,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-28 14:50:35,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:50:35,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:50:35,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-28 14:50:35,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-28 14:50:37,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-28 14:50:37,048 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-28 14:50:37,188 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-28 14:50:37,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:50:38,901 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:50:38,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:50:40,538 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-28 14:50:41,855 INFO [train.py:1039] (2/4) Epoch 2, batch 2450, loss[loss=0.3088, simple_loss=0.36, pruned_loss=0.1288, over 24668.00 frames. ], tot_loss[loss=0.3161, simple_loss=0.3557, pruned_loss=0.1383, over 4715183.66 frames. ], batch size: 73, lr: 3.68e-02, grad_scale: 32.0 2023-09-28 14:50:42,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:50:44,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-28 14:50:44,752 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.19 vs. limit=10.0 2023-09-28 14:50:46,046 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=51746.666666666664, ans=0.125 2023-09-28 14:50:47,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-28 14:50:47,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:50:52,198 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:50:52,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:50:52,574 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:50:52,615 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=51746.666666666664, ans=0.2 2023-09-28 14:50:52,631 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=51746.666666666664, ans=0.125 2023-09-28 14:50:53,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-28 14:50:59,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:50:59,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:51:02,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:51:02,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:51:02,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:51:02,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-28 14:51:07,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:51:09,042 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 14:51:09,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:51:10,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-28 14:51:12,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:51:14,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:51:14,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:51:16,326 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=51880.0, ans=0.1 2023-09-28 14:51:17,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-28 14:51:17,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:51:23,184 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=51880.0, ans=0.125 2023-09-28 14:51:29,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:51:30,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:51:30,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:51:32,379 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:51:32,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:51:33,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:51:35,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-28 14:51:37,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:51:37,340 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:51:42,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:51:42,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:51:49,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-28 14:51:49,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-28 14:51:49,632 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:51:51,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:51:51,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-28 14:51:51,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:51:52,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:51:57,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:52:01,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:52:01,077 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:52:04,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-28 14:52:05,483 INFO [train.py:1039] (2/4) Epoch 2, batch 2500, loss[loss=0.3109, simple_loss=0.3587, pruned_loss=0.1316, over 23426.00 frames. ], tot_loss[loss=0.3138, simple_loss=0.3538, pruned_loss=0.1369, over 4717763.31 frames. ], batch size: 93, lr: 3.67e-02, grad_scale: 32.0 2023-09-28 14:52:05,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:52:10,765 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.999e+02 2.754e+02 3.242e+02 3.766e+02 6.714e+02, threshold=6.484e+02, percent-clipped=2.0 2023-09-28 14:52:12,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:52:18,655 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=14.34 vs. limit=22.5 2023-09-28 14:52:22,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 14:52:22,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:52:22,823 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=52146.666666666664, ans=0.125 2023-09-28 14:52:25,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:52:25,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-28 14:52:32,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 14:52:34,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:52:35,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-28 14:52:35,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 14:52:36,967 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-28 14:52:38,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:52:38,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:52:40,123 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-28 14:52:40,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:52:41,730 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-28 14:52:41,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:52:45,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:52:47,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:52:50,737 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 14:52:50,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-28 14:52:50,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:52:52,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:52:57,141 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:53:01,018 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:53:01,353 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=52280.0, ans=0.0 2023-09-28 14:53:02,772 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=52280.0, ans=0.125 2023-09-28 14:53:02,883 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=52280.0, ans=0.2 2023-09-28 14:53:05,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:53:10,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-28 14:53:12,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-28 14:53:12,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:53:12,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-28 14:53:16,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:53:16,327 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 14:53:17,917 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-28 14:53:17,917 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-28 14:53:17,926 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-28 14:53:19,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:53:22,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-28 14:53:22,680 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-28 14:53:22,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:53:24,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-28 14:53:29,225 INFO [train.py:1039] (2/4) Epoch 2, batch 2550, loss[loss=0.2606, simple_loss=0.3064, pruned_loss=0.1074, over 24446.00 frames. ], tot_loss[loss=0.3144, simple_loss=0.354, pruned_loss=0.1374, over 4715627.47 frames. ], batch size: 58, lr: 3.67e-02, grad_scale: 32.0 2023-09-28 14:53:29,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-28 14:53:32,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:53:33,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:53:35,983 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:53:36,243 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:53:37,855 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-28 14:53:37,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-28 14:53:41,149 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-28 14:53:43,360 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:53:44,242 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.02 vs. limit=15.0 2023-09-28 14:53:46,222 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:53:49,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:53:49,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 14:53:49,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 14:53:49,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:53:49,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:53:53,135 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:53:53,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-28 14:53:53,476 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=52480.0, ans=0.0 2023-09-28 14:53:54,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-28 14:53:54,605 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:53:54,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-28 14:54:03,197 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=52546.666666666664, ans=0.0 2023-09-28 14:54:06,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:54:11,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:54:11,582 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:54:11,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:54:13,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 14:54:19,826 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=52613.333333333336, ans=0.125 2023-09-28 14:54:21,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:54:24,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 14:54:24,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:54:24,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:54:25,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-28 14:54:25,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-28 14:54:31,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:54:31,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:54:37,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:54:39,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-28 14:54:39,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:54:39,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:54:40,853 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:54:42,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 14:54:44,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:54:52,139 INFO [train.py:1039] (2/4) Epoch 2, batch 2600, loss[loss=0.3184, simple_loss=0.3564, pruned_loss=0.1402, over 23563.00 frames. ], tot_loss[loss=0.3147, simple_loss=0.3544, pruned_loss=0.1375, over 4713632.95 frames. ], batch size: 134, lr: 3.66e-02, grad_scale: 32.0 2023-09-28 14:54:52,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:54:54,427 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:54:57,272 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.968e+02 2.901e+02 3.329e+02 4.085e+02 7.147e+02, threshold=6.657e+02, percent-clipped=2.0 2023-09-28 14:54:57,450 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-28 14:54:59,167 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-28 14:54:59,192 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 14:54:59,241 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-28 14:55:00,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-28 14:55:00,864 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-28 14:55:02,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:55:02,563 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-28 14:55:06,168 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-28 14:55:07,628 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-28 14:55:11,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:55:12,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-28 14:55:14,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-28 14:55:16,020 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-28 14:55:16,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-28 14:55:19,579 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-28 14:55:19,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-28 14:55:24,609 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=52880.0, ans=0.1 2023-09-28 14:55:26,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:55:27,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:55:27,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:55:27,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-28 14:55:30,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:55:35,690 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-28 14:55:42,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:55:42,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:55:42,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-28 14:55:42,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:55:42,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:55:44,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-28 14:55:46,394 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=52946.666666666664, ans=0.125 2023-09-28 14:55:46,550 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=52946.666666666664, ans=0.125 2023-09-28 14:55:47,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-28 14:55:47,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:55:50,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:55:53,013 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-28 14:55:54,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:55:54,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:55:54,848 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=52946.666666666664, ans=0.125 2023-09-28 14:56:02,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:56:02,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:56:04,132 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-28 14:56:04,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:56:07,076 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:56:07,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:56:07,359 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=53013.333333333336, ans=0.0 2023-09-28 14:56:09,032 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=53013.333333333336, ans=0.1 2023-09-28 14:56:13,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-28 14:56:13,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:56:15,305 INFO [train.py:1039] (2/4) Epoch 2, batch 2650, loss[loss=0.3133, simple_loss=0.3599, pruned_loss=0.1333, over 24646.00 frames. ], tot_loss[loss=0.3152, simple_loss=0.3554, pruned_loss=0.1375, over 4721780.88 frames. ], batch size: 65, lr: 3.65e-02, grad_scale: 32.0 2023-09-28 14:56:15,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 14:56:17,741 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=16.57 vs. limit=15.0 2023-09-28 14:56:19,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-28 14:56:19,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:56:20,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 14:56:22,251 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-28 14:56:22,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:56:25,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:56:27,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 14:56:29,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:56:29,452 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=53080.0, ans=0.1 2023-09-28 14:56:29,994 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.56 vs. limit=15.0 2023-09-28 14:56:30,947 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:56:31,163 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=53146.666666666664, ans=0.0 2023-09-28 14:56:32,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-28 14:56:32,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 14:56:32,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:56:35,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-28 14:56:38,964 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-28 14:56:42,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:56:45,101 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-28 14:56:45,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:56:47,358 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-28 14:56:49,534 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=53213.333333333336, ans=0.0 2023-09-28 14:56:50,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:56:50,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-28 14:56:50,881 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:56:52,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:56:57,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-28 14:56:57,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-28 14:56:59,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-28 14:57:00,242 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.38 vs. limit=10.0 2023-09-28 14:57:01,375 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=53213.333333333336, ans=0.125 2023-09-28 14:57:03,994 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-28 14:57:04,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:57:05,452 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:57:06,781 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:57:06,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:57:06,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:57:08,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:57:11,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:57:12,961 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:57:13,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-28 14:57:13,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:57:15,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:57:16,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:57:16,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:57:21,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:57:23,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-28 14:57:27,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:57:28,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:57:28,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:57:30,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-28 14:57:35,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:57:36,774 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:57:38,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:57:39,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:57:40,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:57:40,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:57:42,110 INFO [train.py:1039] (2/4) Epoch 2, batch 2700, loss[loss=0.3149, simple_loss=0.374, pruned_loss=0.1279, over 24644.00 frames. ], tot_loss[loss=0.3159, simple_loss=0.356, pruned_loss=0.1379, over 4718619.18 frames. ], batch size: 73, lr: 3.65e-02, grad_scale: 32.0 2023-09-28 14:57:42,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:57:42,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-28 14:57:45,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:57:46,773 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.066e+02 2.772e+02 3.228e+02 4.080e+02 7.773e+02, threshold=6.457e+02, percent-clipped=3.0 2023-09-28 14:57:47,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 14:57:48,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:57:50,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:57:50,717 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:57:50,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:57:51,053 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=53413.333333333336, ans=0.125 2023-09-28 14:57:52,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:57:52,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:57:52,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-28 14:57:52,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-28 14:57:53,750 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:57:53,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-28 14:57:55,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 14:57:55,483 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:57:55,882 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=53413.333333333336, ans=0.04949747468305833 2023-09-28 14:57:59,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-28 14:58:00,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-28 14:58:00,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:58:09,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:58:09,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:58:14,061 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:58:14,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:58:14,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:58:14,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-28 14:58:19,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:58:22,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:58:22,926 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-28 14:58:22,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:58:26,797 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=53546.666666666664, ans=0.0 2023-09-28 14:58:29,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:58:29,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:58:31,439 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=53613.333333333336, ans=0.1 2023-09-28 14:58:38,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:58:38,169 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:58:39,952 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:58:41,280 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:58:42,646 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:58:46,675 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=53680.0, ans=0.0 2023-09-28 14:58:47,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:58:49,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:58:50,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:58:50,786 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:58:52,311 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:58:52,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:58:56,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:58:58,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:58:58,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:59:00,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-28 14:59:02,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:59:03,503 INFO [train.py:1039] (2/4) Epoch 2, batch 2750, loss[loss=0.3182, simple_loss=0.338, pruned_loss=0.1492, over 23656.00 frames. ], tot_loss[loss=0.3152, simple_loss=0.3554, pruned_loss=0.1375, over 4720312.69 frames. ], batch size: 232, lr: 3.64e-02, grad_scale: 16.0 2023-09-28 14:59:05,112 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:59:05,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-28 14:59:07,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-28 14:59:07,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:59:08,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:59:08,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:59:12,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:59:12,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-28 14:59:12,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:59:17,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:59:18,046 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.31 vs. limit=10.0 2023-09-28 14:59:18,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 14:59:18,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:59:18,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:59:18,500 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-28 14:59:18,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:59:18,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:59:25,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-28 14:59:27,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:59:27,078 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:59:27,364 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=53813.333333333336, ans=0.125 2023-09-28 14:59:28,585 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:59:29,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-28 14:59:30,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:59:32,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:59:32,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:59:33,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:59:35,615 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=53880.0, ans=0.125 2023-09-28 14:59:38,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 14:59:38,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 14:59:38,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:59:40,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:59:42,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 14:59:48,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:59:50,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 14:59:50,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:59:55,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:59:55,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-28 14:59:57,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 14:59:59,155 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=53946.666666666664, ans=0.125 2023-09-28 15:00:02,113 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:00:02,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:00:02,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-28 15:00:06,379 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.67 vs. limit=15.0 2023-09-28 15:00:08,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:00:10,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-28 15:00:14,080 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=54013.333333333336, ans=0.125 2023-09-28 15:00:15,758 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=54013.333333333336, ans=0.1 2023-09-28 15:00:16,961 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-28 15:00:17,604 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.10 vs. limit=15.0 2023-09-28 15:00:18,539 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:00:18,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-28 15:00:20,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:00:23,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:00:23,140 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-28 15:00:24,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:00:26,417 INFO [train.py:1039] (2/4) Epoch 2, batch 2800, loss[loss=0.3063, simple_loss=0.3371, pruned_loss=0.1377, over 23751.00 frames. ], tot_loss[loss=0.3143, simple_loss=0.3538, pruned_loss=0.1375, over 4705130.35 frames. ], batch size: 179, lr: 3.64e-02, grad_scale: 32.0 2023-09-28 15:00:28,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-28 15:00:28,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:00:28,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:00:29,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-28 15:00:29,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:00:29,823 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:00:31,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:00:33,363 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.006e+02 2.948e+02 3.600e+02 4.282e+02 6.554e+02, threshold=7.201e+02, percent-clipped=1.0 2023-09-28 15:00:33,509 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-28 15:00:33,510 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-28 15:00:36,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:00:38,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 15:00:38,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:00:41,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:00:43,504 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-28 15:00:47,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-28 15:00:48,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-28 15:00:50,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:00:50,328 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:00:50,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:00:52,140 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=54146.666666666664, ans=0.0 2023-09-28 15:00:53,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:00:53,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:00:53,563 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-28 15:00:56,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:01:03,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:01:05,192 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:01:08,905 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=54213.333333333336, ans=0.1 2023-09-28 15:01:10,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:01:10,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:01:10,282 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=54213.333333333336, ans=0.0 2023-09-28 15:01:12,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:01:15,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:01:15,320 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-28 15:01:16,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:01:18,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:01:18,943 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-28 15:01:21,207 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.91 vs. limit=15.0 2023-09-28 15:01:22,089 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:01:23,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:01:26,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:01:28,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:01:28,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:01:28,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 15:01:28,305 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 15:01:28,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:01:30,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:01:30,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-28 15:01:32,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:01:34,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:01:34,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:01:35,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-28 15:01:36,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:01:36,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:01:38,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:01:39,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-28 15:01:46,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:01:46,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 15:01:46,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:01:49,653 INFO [train.py:1039] (2/4) Epoch 2, batch 2850, loss[loss=0.2874, simple_loss=0.3394, pruned_loss=0.1178, over 24437.00 frames. ], tot_loss[loss=0.3124, simple_loss=0.3522, pruned_loss=0.1363, over 4712071.20 frames. ], batch size: 63, lr: 3.63e-02, grad_scale: 32.0 2023-09-28 15:01:49,847 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:01:56,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:01:56,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:01:56,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:02:01,113 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:02:01,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:02:02,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-28 15:02:02,886 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-28 15:02:09,386 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-28 15:02:09,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:02:12,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-28 15:02:13,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:02:15,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-28 15:02:15,976 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=54480.0, ans=0.0 2023-09-28 15:02:17,092 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-28 15:02:17,265 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:02:17,594 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=54480.0, ans=0.125 2023-09-28 15:02:21,723 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=54546.666666666664, ans=0.0 2023-09-28 15:02:29,939 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=54546.666666666664, ans=0.04949747468305833 2023-09-28 15:02:31,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:02:32,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:02:32,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:02:34,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 15:02:34,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 15:02:34,610 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:02:36,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:02:36,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-28 15:02:39,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-28 15:02:39,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:02:39,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:02:41,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:02:44,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:02:44,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:02:45,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:02:48,896 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:02:50,515 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:02:52,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:02:53,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:02:56,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:02:56,876 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=54680.0, ans=0.0 2023-09-28 15:03:02,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:03:04,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-28 15:03:04,878 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-28 15:03:06,428 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 15:03:08,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:03:08,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-28 15:03:09,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:03:09,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:03:10,885 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:03:10,929 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:03:10,930 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-28 15:03:10,984 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-28 15:03:10,989 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:03:12,387 INFO [train.py:1039] (2/4) Epoch 2, batch 2900, loss[loss=0.3366, simple_loss=0.3597, pruned_loss=0.1567, over 23683.00 frames. ], tot_loss[loss=0.3127, simple_loss=0.3527, pruned_loss=0.1364, over 4713413.38 frames. ], batch size: 232, lr: 3.62e-02, grad_scale: 32.0 2023-09-28 15:03:12,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:03:15,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-28 15:03:15,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:03:18,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:03:19,519 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.086e+02 2.913e+02 3.691e+02 4.538e+02 7.186e+02, threshold=7.382e+02, percent-clipped=0.0 2023-09-28 15:03:19,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-28 15:03:22,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:03:22,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-28 15:03:24,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-28 15:03:26,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:03:26,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-28 15:03:29,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:03:29,764 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=54813.333333333336, ans=0.2 2023-09-28 15:03:30,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:03:34,550 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:03:34,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:03:39,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-28 15:03:39,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-28 15:03:39,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-28 15:03:42,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:03:43,466 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.28 vs. limit=12.0 2023-09-28 15:03:44,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-28 15:03:44,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-28 15:03:44,605 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=54880.0, ans=0.125 2023-09-28 15:03:47,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:03:47,512 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-28 15:03:47,536 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:03:50,672 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:03:50,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-28 15:03:54,411 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=54880.0, ans=0.125 2023-09-28 15:03:55,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:03:57,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:04:00,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:04:01,749 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:04:05,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-28 15:04:06,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-28 15:04:06,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:04:10,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:04:12,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-28 15:04:14,165 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:04:20,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:04:29,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:04:29,461 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-28 15:04:29,861 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=55013.333333333336, ans=0.1 2023-09-28 15:04:31,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-28 15:04:34,435 INFO [train.py:1039] (2/4) Epoch 2, batch 2950, loss[loss=0.3054, simple_loss=0.3629, pruned_loss=0.1239, over 24289.00 frames. ], tot_loss[loss=0.3119, simple_loss=0.3527, pruned_loss=0.1356, over 4728388.56 frames. ], batch size: 74, lr: 3.62e-02, grad_scale: 32.0 2023-09-28 15:04:34,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:04:34,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-28 15:04:34,686 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:04:36,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-28 15:04:41,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:04:43,542 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-28 15:04:44,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:04:44,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:04:46,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:04:46,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:04:48,923 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-28 15:04:49,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-28 15:04:50,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 15:04:50,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:04:58,366 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:04:59,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:05:01,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:05:03,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:05:07,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:05:07,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:05:08,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:05:08,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:05:08,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:05:11,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-28 15:05:16,799 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-28 15:05:16,829 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-28 15:05:18,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:05:20,643 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-28 15:05:23,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-28 15:05:23,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:05:24,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:05:24,499 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-28 15:05:24,506 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-28 15:05:27,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-28 15:05:27,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:05:27,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:05:30,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:05:32,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:05:33,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:05:33,856 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-28 15:05:33,917 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:05:35,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-28 15:05:40,627 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:05:42,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-28 15:05:42,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-28 15:05:42,156 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:05:45,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-28 15:05:46,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:05:48,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:05:48,779 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=55346.666666666664, ans=0.125 2023-09-28 15:05:50,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 15:05:51,247 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:05:52,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 15:05:54,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:05:54,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:05:54,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-28 15:05:54,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:05:56,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:05:56,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:05:57,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:05:58,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-28 15:05:59,270 INFO [train.py:1039] (2/4) Epoch 2, batch 3000, loss[loss=0.3295, simple_loss=0.352, pruned_loss=0.1535, over 23855.00 frames. ], tot_loss[loss=0.314, simple_loss=0.3539, pruned_loss=0.137, over 4727314.95 frames. ], batch size: 195, lr: 3.61e-02, grad_scale: 32.0 2023-09-28 15:05:59,270 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-28 15:06:14,693 INFO [train.py:1071] (2/4) Epoch 2, validation: loss=0.3279, simple_loss=0.3383, pruned_loss=0.1588, over 1125622.00 frames. 2023-09-28 15:06:14,694 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21046MB 2023-09-28 15:06:14,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:06:17,808 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:06:17,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-28 15:06:20,855 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.039e+02 2.946e+02 3.548e+02 4.220e+02 7.965e+02, threshold=7.096e+02, percent-clipped=1.0 2023-09-28 15:06:20,965 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-28 15:06:21,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-28 15:06:22,622 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.91 vs. limit=15.0 2023-09-28 15:06:23,271 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-28 15:06:23,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:06:25,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-28 15:06:25,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:06:33,052 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 15:06:43,785 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:06:50,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-28 15:06:52,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:06:55,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:06:56,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:06:56,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:06:58,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:06:58,916 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-28 15:07:01,171 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-28 15:07:02,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:07:02,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 15:07:03,056 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=55613.333333333336, ans=0.125 2023-09-28 15:07:04,394 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:07:04,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:07:04,647 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=55613.333333333336, ans=0.0 2023-09-28 15:07:06,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:07:06,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:07:09,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 15:07:10,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:07:10,595 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:07:13,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:07:16,415 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-28 15:07:16,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:07:16,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:07:18,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:07:23,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:07:23,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:07:24,809 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-28 15:07:24,881 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-28 15:07:24,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:07:24,974 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-28 15:07:26,424 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:07:28,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-28 15:07:31,417 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-28 15:07:34,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 15:07:34,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-28 15:07:34,387 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-28 15:07:34,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 15:07:36,379 INFO [train.py:1039] (2/4) Epoch 2, batch 3050, loss[loss=0.3382, simple_loss=0.3632, pruned_loss=0.1567, over 23414.00 frames. ], tot_loss[loss=0.3156, simple_loss=0.3549, pruned_loss=0.1382, over 4715060.40 frames. ], batch size: 285, lr: 3.61e-02, grad_scale: 32.0 2023-09-28 15:07:37,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:07:39,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:07:39,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-28 15:07:39,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:07:39,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:07:41,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-28 15:07:42,874 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:07:44,707 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=55746.666666666664, ans=0.1 2023-09-28 15:07:45,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:07:45,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 15:07:51,830 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:07:54,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-28 15:08:00,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-28 15:08:02,275 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-28 15:08:02,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:08:04,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:08:07,131 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:08:08,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:08:09,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:08:12,593 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=55880.0, ans=0.125 2023-09-28 15:08:12,599 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=55880.0, ans=0.0 2023-09-28 15:08:13,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:08:13,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-28 15:08:15,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:08:15,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:08:15,434 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:08:16,964 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:08:18,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:08:21,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:08:21,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-28 15:08:21,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:08:21,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:08:24,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:08:26,327 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 15:08:27,758 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:08:27,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:08:30,383 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=55946.666666666664, ans=0.125 2023-09-28 15:08:31,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:08:33,802 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:08:35,464 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=55946.666666666664, ans=0.2 2023-09-28 15:08:40,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:08:40,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:08:40,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:08:43,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:08:43,778 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=56013.333333333336, ans=0.125 2023-09-28 15:08:44,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 15:08:44,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:08:46,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-28 15:08:50,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:08:50,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:08:51,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-28 15:08:51,849 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=56013.333333333336, ans=0.125 2023-09-28 15:08:53,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:08:57,734 INFO [train.py:1039] (2/4) Epoch 2, batch 3100, loss[loss=0.2988, simple_loss=0.3583, pruned_loss=0.1196, over 24321.00 frames. ], tot_loss[loss=0.3173, simple_loss=0.3554, pruned_loss=0.1396, over 4694265.48 frames. ], batch size: 74, lr: 3.60e-02, grad_scale: 32.0 2023-09-28 15:08:59,245 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:09:00,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:09:03,850 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.205e+02 2.748e+02 3.065e+02 3.838e+02 6.915e+02, threshold=6.130e+02, percent-clipped=0.0 2023-09-28 15:09:03,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 15:09:05,103 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 15:09:06,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-28 15:09:09,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-28 15:09:11,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-28 15:09:11,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:09:14,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:09:14,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:09:16,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-28 15:09:21,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:09:27,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-28 15:09:30,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 15:09:30,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:09:32,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:09:32,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:09:33,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-28 15:09:35,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:09:35,408 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-28 15:09:35,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:09:38,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:09:38,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-28 15:09:40,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:09:42,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-28 15:09:44,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-28 15:09:44,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-28 15:09:46,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:09:47,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:09:49,196 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:09:49,225 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:09:49,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:09:52,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:09:52,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:09:56,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:09:56,518 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:09:56,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:09:56,530 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 15:10:02,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:10:04,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-28 15:10:05,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-28 15:10:07,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-28 15:10:07,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:10:07,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:10:07,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-28 15:10:16,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-28 15:10:19,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:10:19,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:10:19,666 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=56413.333333333336, ans=0.0 2023-09-28 15:10:20,599 INFO [train.py:1039] (2/4) Epoch 2, batch 3150, loss[loss=0.3187, simple_loss=0.3281, pruned_loss=0.1546, over 22741.00 frames. ], tot_loss[loss=0.3145, simple_loss=0.3528, pruned_loss=0.1381, over 4701028.41 frames. ], batch size: 322, lr: 3.59e-02, grad_scale: 32.0 2023-09-28 15:10:22,133 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:10:22,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:10:25,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-28 15:10:25,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:10:27,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-28 15:10:29,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-28 15:10:29,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:10:29,372 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=56413.333333333336, ans=0.1 2023-09-28 15:10:29,385 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=56413.333333333336, ans=0.125 2023-09-28 15:10:30,771 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-28 15:10:35,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-28 15:10:35,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:10:36,692 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-28 15:10:38,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-28 15:10:39,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-28 15:10:41,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-28 15:10:41,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-28 15:10:41,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:10:41,181 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:10:42,715 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:10:44,410 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-28 15:10:46,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:10:46,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:10:48,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:10:50,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-28 15:10:51,026 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=56480.0, ans=0.5 2023-09-28 15:10:53,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-28 15:10:55,324 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:10:57,112 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=56546.666666666664, ans=0.125 2023-09-28 15:10:58,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-28 15:10:59,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:10:59,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-28 15:11:03,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-28 15:11:05,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:11:05,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 15:11:05,434 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 15:11:05,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:11:05,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 15:11:08,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-28 15:11:08,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-28 15:11:08,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-28 15:11:10,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 15:11:10,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:11:11,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:11:11,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:11:13,298 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-28 15:11:13,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:11:15,517 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.45 vs. limit=15.0 2023-09-28 15:11:16,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-28 15:11:16,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:11:16,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-28 15:11:18,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-28 15:11:18,252 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:11:21,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:11:22,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-28 15:11:22,557 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 15:11:23,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:11:25,912 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.54 vs. limit=15.0 2023-09-28 15:11:27,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:11:28,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:11:28,621 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:11:33,574 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:11:36,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:11:40,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-28 15:11:43,324 INFO [train.py:1039] (2/4) Epoch 2, batch 3200, loss[loss=0.2915, simple_loss=0.3438, pruned_loss=0.1196, over 24495.00 frames. ], tot_loss[loss=0.3127, simple_loss=0.3517, pruned_loss=0.1368, over 4711424.96 frames. ], batch size: 63, lr: 3.59e-02, grad_scale: 32.0 2023-09-28 15:11:45,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:11:45,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-28 15:11:49,712 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.102e+02 2.897e+02 3.504e+02 4.245e+02 7.793e+02, threshold=7.007e+02, percent-clipped=2.0 2023-09-28 15:11:49,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:11:51,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:11:51,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-28 15:11:54,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:11:59,091 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 15:12:01,702 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:12:02,185 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=56813.333333333336, ans=0.0 2023-09-28 15:12:04,317 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.15 vs. limit=6.0 2023-09-28 15:12:04,798 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:12:05,010 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=56813.333333333336, ans=0.125 2023-09-28 15:12:13,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:12:24,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-28 15:12:25,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:12:28,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-28 15:12:29,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 15:12:34,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:12:34,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 15:12:36,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:12:41,028 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-28 15:12:42,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-28 15:12:44,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-28 15:12:47,976 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-28 15:12:49,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:12:52,023 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=13.14 vs. limit=15.0 2023-09-28 15:12:54,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:12:56,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 15:12:56,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:12:57,748 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-28 15:12:57,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 15:13:00,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:13:01,009 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-28 15:13:01,579 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=14.84 vs. limit=15.0 2023-09-28 15:13:03,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-28 15:13:03,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-28 15:13:04,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-28 15:13:06,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:13:07,633 INFO [train.py:1039] (2/4) Epoch 2, batch 3250, loss[loss=0.3099, simple_loss=0.3594, pruned_loss=0.1302, over 24515.00 frames. ], tot_loss[loss=0.3116, simple_loss=0.3519, pruned_loss=0.1356, over 4721835.05 frames. ], batch size: 66, lr: 3.58e-02, grad_scale: 32.0 2023-09-28 15:13:10,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-28 15:13:10,093 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-28 15:13:10,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:13:10,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:13:11,631 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-28 15:13:11,815 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=57080.0, ans=0.125 2023-09-28 15:13:14,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:13:18,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:13:20,281 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=57080.0, ans=0.1 2023-09-28 15:13:26,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:13:26,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-28 15:13:28,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:13:28,132 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:13:28,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:13:29,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:13:29,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 15:13:34,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:13:34,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-28 15:13:34,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:13:36,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:13:36,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:13:36,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:13:40,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:13:42,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:13:43,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:13:44,675 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:13:46,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:13:46,317 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:13:46,332 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:13:49,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-28 15:13:50,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:13:50,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:13:53,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:13:53,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:13:59,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:14:09,566 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:14:11,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:14:11,305 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-28 15:14:11,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:14:11,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 15:14:11,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:14:11,617 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=57280.0, ans=0.125 2023-09-28 15:14:13,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=57346.666666666664, ans=0.125 2023-09-28 15:14:15,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-28 15:14:16,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-28 15:14:16,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:14:17,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:14:19,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:14:19,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-28 15:14:21,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:14:24,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:14:24,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:14:27,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-28 15:14:27,521 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:14:31,064 INFO [train.py:1039] (2/4) Epoch 2, batch 3300, loss[loss=0.2831, simple_loss=0.3432, pruned_loss=0.1115, over 24462.00 frames. ], tot_loss[loss=0.3119, simple_loss=0.3528, pruned_loss=0.1355, over 4724392.41 frames. ], batch size: 66, lr: 3.58e-02, grad_scale: 16.0 2023-09-28 15:14:31,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 15:14:31,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-28 15:14:34,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:14:34,469 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-28 15:14:36,054 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-28 15:14:37,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-28 15:14:37,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:14:38,873 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.841e+02 2.772e+02 3.522e+02 4.271e+02 9.362e+02, threshold=7.044e+02, percent-clipped=2.0 2023-09-28 15:14:41,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:14:42,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:14:44,148 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:14:46,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 15:14:46,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 15:14:49,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:14:49,759 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=57480.0, ans=0.125 2023-09-28 15:14:50,224 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.50 vs. limit=15.0 2023-09-28 15:14:51,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:14:54,301 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-28 15:14:55,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:14:55,073 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:14:56,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:14:56,569 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-28 15:14:58,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:14:58,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 15:14:59,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:14:59,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:14:59,639 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-28 15:15:02,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:15:02,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-28 15:15:05,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:15:05,148 WARNING [train.py:1197] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-28 15:15:06,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-28 15:15:06,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:15:08,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:15:08,704 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=57546.666666666664, ans=0.125 2023-09-28 15:15:09,929 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-28 15:15:12,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-28 15:15:13,182 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=57546.666666666664, ans=0.125 2023-09-28 15:15:14,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:15:16,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-28 15:15:19,153 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=57546.666666666664, ans=0.125 2023-09-28 15:15:20,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-28 15:15:22,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-28 15:15:22,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:15:25,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:15:25,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:15:25,604 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:15:27,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-28 15:15:29,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:15:29,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:15:29,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:15:30,841 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-28 15:15:32,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-28 15:15:34,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-28 15:15:34,671 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=57613.333333333336, ans=0.125 2023-09-28 15:15:35,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:15:35,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:15:38,235 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=20.98 vs. limit=22.5 2023-09-28 15:15:38,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:15:38,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:15:40,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 15:15:40,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:15:42,031 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-28 15:15:42,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:15:45,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 15:15:48,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-28 15:15:48,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:15:48,450 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=57680.0, ans=0.125 2023-09-28 15:15:49,551 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:15:51,817 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=13.58 vs. limit=15.0 2023-09-28 15:15:53,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:15:53,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-28 15:15:54,718 INFO [train.py:1039] (2/4) Epoch 2, batch 3350, loss[loss=0.2596, simple_loss=0.3104, pruned_loss=0.1043, over 24326.00 frames. ], tot_loss[loss=0.3127, simple_loss=0.3535, pruned_loss=0.136, over 4717234.19 frames. ], batch size: 56, lr: 3.57e-02, grad_scale: 16.0 2023-09-28 15:15:54,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:15:56,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:15:56,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:15:58,366 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=57746.666666666664, ans=0.0 2023-09-28 15:15:58,444 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=57746.666666666664, ans=0.1 2023-09-28 15:15:59,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:16:01,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:16:03,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:16:06,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:16:07,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:16:09,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:16:09,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:16:11,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-28 15:16:13,408 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-28 15:16:13,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:16:15,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-28 15:16:15,189 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-28 15:16:16,721 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:16:18,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:16:18,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:16:18,442 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=57813.333333333336, ans=0.125 2023-09-28 15:16:19,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-28 15:16:19,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:16:19,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:16:21,451 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:16:24,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:16:25,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:16:28,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:16:29,942 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=57880.0, ans=0.125 2023-09-28 15:16:31,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:16:34,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:16:34,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:16:39,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:16:40,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:16:43,528 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:16:43,553 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:16:46,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:16:48,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-28 15:16:48,824 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 15:16:48,864 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-28 15:16:48,932 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:16:50,430 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-28 15:16:51,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:16:53,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:16:56,792 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=57946.666666666664, ans=0.125 2023-09-28 15:17:00,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:17:01,124 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-28 15:17:03,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 15:17:04,718 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-28 15:17:06,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:17:06,420 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=58013.333333333336, ans=0.0 2023-09-28 15:17:12,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:17:13,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-28 15:17:13,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 15:17:13,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:17:14,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:17:15,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-28 15:17:16,503 INFO [train.py:1039] (2/4) Epoch 2, batch 3400, loss[loss=0.3126, simple_loss=0.3666, pruned_loss=0.1293, over 24360.00 frames. ], tot_loss[loss=0.3134, simple_loss=0.354, pruned_loss=0.1364, over 4720937.02 frames. ], batch size: 77, lr: 3.56e-02, grad_scale: 16.0 2023-09-28 15:17:16,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:17:16,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-28 15:17:18,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:17:18,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:17:19,620 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-28 15:17:21,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:17:21,734 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-28 15:17:24,561 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.046e+02 2.787e+02 3.091e+02 3.869e+02 5.571e+02, threshold=6.183e+02, percent-clipped=0.0 2023-09-28 15:17:26,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-28 15:17:26,265 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-28 15:17:26,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:17:30,273 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.97 vs. limit=6.0 2023-09-28 15:17:30,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:17:30,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 15:17:31,031 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:17:31,447 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=58146.666666666664, ans=0.125 2023-09-28 15:17:32,065 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=6.60 vs. limit=15.0 2023-09-28 15:17:33,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-28 15:17:38,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:17:41,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-28 15:17:43,241 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=58146.666666666664, ans=0.04949747468305833 2023-09-28 15:17:45,536 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=58146.666666666664, ans=10.0 2023-09-28 15:17:47,890 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:17:49,822 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=58213.333333333336, ans=0.125 2023-09-28 15:17:50,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:17:50,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:17:51,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-28 15:17:59,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:18:03,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-28 15:18:08,829 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=58280.0, ans=0.0 2023-09-28 15:18:10,535 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:18:10,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:18:10,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-28 15:18:10,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:18:12,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:18:14,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:18:14,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:18:17,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:18:20,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 15:18:20,486 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:18:25,686 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:18:25,889 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-28 15:18:32,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 15:18:35,936 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=58346.666666666664, ans=0.125 2023-09-28 15:18:37,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-28 15:18:37,541 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=58413.333333333336, ans=0.0 2023-09-28 15:18:38,619 INFO [train.py:1039] (2/4) Epoch 2, batch 3450, loss[loss=0.2653, simple_loss=0.3227, pruned_loss=0.1039, over 24484.00 frames. ], tot_loss[loss=0.3124, simple_loss=0.3536, pruned_loss=0.1356, over 4731648.56 frames. ], batch size: 58, lr: 3.56e-02, grad_scale: 16.0 2023-09-28 15:18:41,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-28 15:18:41,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:18:43,906 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:18:43,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-28 15:18:46,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:18:49,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-28 15:18:55,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:18:56,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:18:57,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:18:57,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:18:59,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:19:06,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-28 15:19:10,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-28 15:19:10,875 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 15:19:10,931 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:19:13,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:19:20,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-28 15:19:21,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 15:19:25,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:19:25,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:19:27,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-28 15:19:28,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:19:30,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-28 15:19:30,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:19:32,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:19:35,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:19:37,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-28 15:19:42,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:19:47,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:19:47,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:19:52,563 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:19:57,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:19:57,118 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:19:57,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:19:58,744 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:20:01,743 INFO [train.py:1039] (2/4) Epoch 2, batch 3500, loss[loss=0.2743, simple_loss=0.2963, pruned_loss=0.1262, over 22685.00 frames. ], tot_loss[loss=0.3113, simple_loss=0.352, pruned_loss=0.1353, over 4714474.07 frames. ], batch size: 322, lr: 3.55e-02, grad_scale: 16.0 2023-09-28 15:20:02,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:20:04,948 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.75 vs. limit=12.0 2023-09-28 15:20:06,874 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:20:06,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-28 15:20:08,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 15:20:09,931 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.044e+02 2.839e+02 3.369e+02 4.173e+02 9.194e+02, threshold=6.738e+02, percent-clipped=6.0 2023-09-28 15:20:11,647 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-28 15:20:13,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:20:13,915 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-28 15:20:18,666 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:20:20,290 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:20:22,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:20:22,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:20:24,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-28 15:20:24,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:20:24,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:20:25,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-28 15:20:30,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:20:32,174 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-28 15:20:33,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:20:37,323 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=58880.0, ans=0.125 2023-09-28 15:20:38,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:20:39,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-28 15:20:39,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:20:43,860 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:20:45,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:20:46,750 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:20:48,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:20:48,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:20:52,137 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-28 15:20:52,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-28 15:20:52,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-28 15:20:53,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:20:55,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:20:56,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:20:57,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 15:21:01,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 15:21:01,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:21:06,897 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:21:07,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-28 15:21:07,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-28 15:21:07,066 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:21:11,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:21:12,782 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:21:13,788 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=59013.333333333336, ans=0.0 2023-09-28 15:21:14,957 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:21:16,659 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-28 15:21:16,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:21:19,798 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:21:19,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-28 15:21:21,446 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=59013.333333333336, ans=0.125 2023-09-28 15:21:21,998 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=25.99 vs. limit=15.0 2023-09-28 15:21:22,876 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-28 15:21:24,845 INFO [train.py:1039] (2/4) Epoch 2, batch 3550, loss[loss=0.3063, simple_loss=0.3581, pruned_loss=0.1272, over 24355.00 frames. ], tot_loss[loss=0.308, simple_loss=0.3493, pruned_loss=0.1334, over 4722450.17 frames. ], batch size: 77, lr: 3.55e-02, grad_scale: 16.0 2023-09-28 15:21:24,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:21:27,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:21:28,028 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:21:28,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:21:31,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:21:41,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:21:42,096 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=59146.666666666664, ans=0.125 2023-09-28 15:21:43,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 15:21:46,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:21:48,486 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:21:49,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:21:51,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:21:51,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 15:21:54,407 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:21:54,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:21:55,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:21:55,894 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-28 15:21:57,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 15:21:59,421 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=59213.333333333336, ans=0.04949747468305833 2023-09-28 15:22:02,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:22:02,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:22:03,554 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.91 vs. limit=6.0 2023-09-28 15:22:05,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:22:05,674 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:22:05,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:22:05,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-28 15:22:05,818 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:22:07,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:22:08,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 15:22:14,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:22:14,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:22:16,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:22:17,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-28 15:22:18,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:22:20,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-28 15:22:20,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:22:21,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:22:23,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:22:23,668 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=59280.0, ans=0.125 2023-09-28 15:22:24,984 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-28 15:22:26,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:22:29,963 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=59346.666666666664, ans=0.125 2023-09-28 15:22:33,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:22:33,455 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-28 15:22:34,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:22:38,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:22:39,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-28 15:22:39,647 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=59346.666666666664, ans=0.125 2023-09-28 15:22:46,963 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=59413.333333333336, ans=0.1 2023-09-28 15:22:48,090 INFO [train.py:1039] (2/4) Epoch 2, batch 3600, loss[loss=0.3253, simple_loss=0.355, pruned_loss=0.1478, over 23790.00 frames. ], tot_loss[loss=0.3082, simple_loss=0.3488, pruned_loss=0.1338, over 4703074.12 frames. ], batch size: 164, lr: 3.54e-02, grad_scale: 32.0 2023-09-28 15:22:48,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-28 15:22:48,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:22:49,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:22:51,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:22:51,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:22:53,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:22:53,611 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=59413.333333333336, ans=0.2 2023-09-28 15:22:56,190 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.024e+02 2.598e+02 2.903e+02 3.548e+02 6.359e+02, threshold=5.806e+02, percent-clipped=0.0 2023-09-28 15:22:57,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:22:59,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:23:00,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:23:01,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:23:02,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:23:02,520 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-28 15:23:08,915 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 15:23:09,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:23:12,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:23:13,727 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:23:15,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:23:15,384 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:23:15,421 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-28 15:23:16,872 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:23:18,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:23:20,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-28 15:23:23,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:23:24,113 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=59546.666666666664, ans=0.125 2023-09-28 15:23:25,299 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:23:25,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:23:27,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-28 15:23:32,407 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=59546.666666666664, ans=0.125 2023-09-28 15:23:32,498 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=59546.666666666664, ans=0.2 2023-09-28 15:23:35,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:23:36,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 15:23:36,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-28 15:23:43,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:23:48,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:23:51,248 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:23:58,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-28 15:23:58,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:23:58,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-28 15:24:00,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-28 15:24:01,558 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-28 15:24:05,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:24:05,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:24:06,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-28 15:24:06,798 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:24:08,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:24:08,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:24:08,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-28 15:24:09,719 INFO [train.py:1039] (2/4) Epoch 2, batch 3650, loss[loss=0.3046, simple_loss=0.3389, pruned_loss=0.1352, over 23719.00 frames. ], tot_loss[loss=0.3074, simple_loss=0.3489, pruned_loss=0.133, over 4703293.65 frames. ], batch size: 164, lr: 3.53e-02, grad_scale: 32.0 2023-09-28 15:24:09,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-28 15:24:14,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:24:14,374 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-28 15:24:19,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-28 15:24:20,093 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=59746.666666666664, ans=0.125 2023-09-28 15:24:21,331 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:24:24,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-28 15:24:26,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-28 15:24:31,186 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:24:31,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-28 15:24:33,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 15:24:36,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-28 15:24:36,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:24:36,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-28 15:24:38,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:24:39,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:24:39,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-28 15:24:40,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 15:24:41,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:24:41,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:24:43,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:24:46,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-28 15:24:47,748 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-28 15:24:47,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:24:49,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-28 15:24:50,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:24:51,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:24:57,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:24:59,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:24:59,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-28 15:25:02,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:25:04,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:25:08,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:25:09,767 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:25:11,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:25:11,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:25:15,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 15:25:16,527 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:25:16,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:25:24,094 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-28 15:25:27,349 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:25:27,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:25:27,536 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-28 15:25:29,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:25:31,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-28 15:25:32,594 INFO [train.py:1039] (2/4) Epoch 2, batch 3700, loss[loss=0.3114, simple_loss=0.3402, pruned_loss=0.1413, over 23696.00 frames. ], tot_loss[loss=0.3082, simple_loss=0.3498, pruned_loss=0.1333, over 4711961.22 frames. ], batch size: 164, lr: 3.53e-02, grad_scale: 32.0 2023-09-28 15:25:32,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:25:32,921 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=60080.0, ans=0.1 2023-09-28 15:25:32,974 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=60080.0, ans=0.2 2023-09-28 15:25:34,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-28 15:25:34,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:25:37,282 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 15:25:39,564 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:25:41,575 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.121e+02 2.788e+02 3.403e+02 4.126e+02 8.216e+02, threshold=6.806e+02, percent-clipped=7.0 2023-09-28 15:25:41,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:25:42,176 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=60080.0, ans=0.125 2023-09-28 15:25:43,509 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:25:43,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-28 15:25:43,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:25:43,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 15:25:45,068 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 15:25:46,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 15:25:50,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:25:51,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:25:51,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:25:52,162 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=60146.666666666664, ans=0.125 2023-09-28 15:25:53,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:25:53,486 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 15:25:55,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:25:58,168 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-28 15:26:06,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:26:08,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 15:26:09,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 15:26:09,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-28 15:26:09,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:26:14,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:26:14,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-28 15:26:17,149 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:26:18,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:26:23,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:26:23,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 15:26:25,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 15:26:29,725 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:26:29,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-28 15:26:31,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:26:31,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-28 15:26:36,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:26:36,657 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=11.95 vs. limit=15.0 2023-09-28 15:26:37,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:26:41,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:26:41,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-28 15:26:44,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:26:44,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-28 15:26:44,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:26:44,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:26:47,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:26:47,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-28 15:26:49,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-28 15:26:50,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:26:50,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:26:52,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:26:54,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:26:55,362 INFO [train.py:1039] (2/4) Epoch 2, batch 3750, loss[loss=0.3308, simple_loss=0.3658, pruned_loss=0.1479, over 23620.00 frames. ], tot_loss[loss=0.3097, simple_loss=0.3508, pruned_loss=0.1343, over 4708149.38 frames. ], batch size: 149, lr: 3.52e-02, grad_scale: 32.0 2023-09-28 15:26:55,877 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=60413.333333333336, ans=0.1 2023-09-28 15:26:57,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:26:57,325 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=60413.333333333336, ans=0.125 2023-09-28 15:26:58,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:27:00,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:27:02,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-28 15:27:03,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 15:27:06,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-28 15:27:06,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-28 15:27:08,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:27:08,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:27:09,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:27:11,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:27:11,905 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=60480.0, ans=0.04949747468305833 2023-09-28 15:27:15,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:27:17,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:27:20,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 15:27:20,288 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=60480.0, ans=0.125 2023-09-28 15:27:24,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:27:27,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:27:28,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-28 15:27:30,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:27:31,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:27:31,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:27:34,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-28 15:27:39,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-28 15:27:41,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:27:41,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:27:43,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:27:48,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:27:50,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-28 15:27:53,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-28 15:27:57,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:28:00,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:28:00,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:28:03,961 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:28:08,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 15:28:10,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-28 15:28:13,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 15:28:15,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:28:18,196 INFO [train.py:1039] (2/4) Epoch 2, batch 3800, loss[loss=0.3106, simple_loss=0.359, pruned_loss=0.1311, over 23729.00 frames. ], tot_loss[loss=0.309, simple_loss=0.351, pruned_loss=0.1335, over 4710734.47 frames. ], batch size: 85, lr: 3.52e-02, grad_scale: 32.0 2023-09-28 15:28:18,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-28 15:28:25,189 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:28:26,487 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.996e+02 2.661e+02 3.070e+02 3.841e+02 5.617e+02, threshold=6.140e+02, percent-clipped=0.0 2023-09-28 15:28:30,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:28:30,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 15:28:32,335 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-28 15:28:35,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:28:36,871 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:28:38,481 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-28 15:28:40,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 15:28:40,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:28:40,277 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:28:41,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:28:41,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:28:43,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:28:44,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-28 15:28:45,077 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=60813.333333333336, ans=0.125 2023-09-28 15:28:48,502 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=60813.333333333336, ans=0.2 2023-09-28 15:28:49,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-28 15:28:49,718 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:28:49,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:28:54,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:28:54,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 15:28:56,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-28 15:28:56,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:28:58,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:28:59,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:29:05,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 15:29:06,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-28 15:29:08,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:29:16,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:29:20,027 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=60946.666666666664, ans=0.0 2023-09-28 15:29:22,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:29:23,645 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.25 vs. limit=15.0 2023-09-28 15:29:24,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-28 15:29:25,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-28 15:29:26,574 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.18 vs. limit=15.0 2023-09-28 15:29:27,420 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:29:29,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:29:29,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:29:32,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-28 15:29:34,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-28 15:29:34,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-28 15:29:34,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:29:34,658 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=61013.333333333336, ans=0.0 2023-09-28 15:29:35,189 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=4.55 vs. limit=12.0 2023-09-28 15:29:36,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:29:41,183 INFO [train.py:1039] (2/4) Epoch 2, batch 3850, loss[loss=0.3358, simple_loss=0.3513, pruned_loss=0.1602, over 23918.00 frames. ], tot_loss[loss=0.309, simple_loss=0.3503, pruned_loss=0.1339, over 4704989.98 frames. ], batch size: 195, lr: 3.51e-02, grad_scale: 32.0 2023-09-28 15:29:41,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:29:41,501 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 15:29:43,650 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=61080.0, ans=0.1 2023-09-28 15:29:48,732 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=61080.0, ans=0.1 2023-09-28 15:29:49,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:29:49,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-28 15:29:51,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 15:29:52,873 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:29:53,183 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=61080.0, ans=0.1 2023-09-28 15:29:55,981 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 15:29:57,584 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:30:01,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-28 15:30:02,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-28 15:30:08,990 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:30:10,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:30:14,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:30:14,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 15:30:17,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:30:19,456 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:30:19,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:30:19,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:30:21,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:30:23,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:30:24,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:30:24,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:30:24,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-28 15:30:24,879 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-28 15:30:25,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:30:25,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:30:28,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:30:28,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:30:29,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-28 15:30:31,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-28 15:30:34,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:30:37,824 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-28 15:30:39,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-28 15:30:44,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:30:45,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:30:50,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:30:50,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-28 15:30:54,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-28 15:30:56,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:30:57,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:30:59,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 15:31:00,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:31:02,014 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:31:02,127 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:31:02,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:31:02,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-28 15:31:03,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:31:03,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-28 15:31:05,396 INFO [train.py:1039] (2/4) Epoch 2, batch 3900, loss[loss=0.3267, simple_loss=0.3678, pruned_loss=0.1428, over 23719.00 frames. ], tot_loss[loss=0.307, simple_loss=0.3477, pruned_loss=0.1332, over 4689226.38 frames. ], batch size: 85, lr: 3.51e-02, grad_scale: 32.0 2023-09-28 15:31:05,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:31:05,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:31:09,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:31:09,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:31:10,701 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:31:11,441 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.51 vs. limit=22.5 2023-09-28 15:31:12,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:31:12,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:31:13,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:31:13,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-28 15:31:14,948 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.111e+02 3.017e+02 3.758e+02 4.866e+02 8.103e+02, threshold=7.517e+02, percent-clipped=9.0 2023-09-28 15:31:15,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:31:19,650 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:31:19,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 15:31:19,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:31:21,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:31:22,408 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.54 vs. limit=15.0 2023-09-28 15:31:23,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 15:31:23,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:31:25,200 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-28 15:31:25,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-28 15:31:25,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:31:25,526 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=61480.0, ans=0.0 2023-09-28 15:31:29,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-28 15:31:29,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:31:30,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-28 15:31:32,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-28 15:31:37,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:31:37,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:31:37,288 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 15:31:37,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:31:44,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:31:45,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:31:47,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:31:47,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:31:48,667 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:31:54,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:31:56,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:32:03,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 15:32:06,769 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:32:15,270 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:32:18,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:32:18,436 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-28 15:32:18,495 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-28 15:32:18,529 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:32:21,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-28 15:32:22,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:32:23,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-28 15:32:27,646 INFO [train.py:1039] (2/4) Epoch 2, batch 3950, loss[loss=0.2762, simple_loss=0.3331, pruned_loss=0.1096, over 24461.00 frames. ], tot_loss[loss=0.3058, simple_loss=0.3464, pruned_loss=0.1326, over 4689988.25 frames. ], batch size: 63, lr: 3.50e-02, grad_scale: 16.0 2023-09-28 15:32:30,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:32:32,895 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-28 15:32:32,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:32:33,260 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=61746.666666666664, ans=0.125 2023-09-28 15:32:36,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:32:36,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:32:37,058 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=61746.666666666664, ans=0.04949747468305833 2023-09-28 15:32:42,750 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-28 15:32:42,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 15:32:42,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-28 15:32:44,383 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-28 15:32:44,443 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:32:47,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:32:47,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-28 15:32:47,600 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:32:51,253 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-28 15:32:54,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:32:54,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 15:32:54,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 15:32:55,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:32:55,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:33:10,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:33:10,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:33:15,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-28 15:33:21,205 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-28 15:33:21,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-28 15:33:21,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:33:21,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:33:31,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:33:31,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-28 15:33:31,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:33:31,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:33:32,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-28 15:33:37,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:33:37,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:33:43,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-28 15:33:50,908 INFO [train.py:1039] (2/4) Epoch 2, batch 4000, loss[loss=0.3146, simple_loss=0.3515, pruned_loss=0.1389, over 23616.00 frames. ], tot_loss[loss=0.3075, simple_loss=0.348, pruned_loss=0.1335, over 4697230.87 frames. ], batch size: 120, lr: 3.49e-02, grad_scale: 32.0 2023-09-28 15:33:53,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:34:00,521 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.115e+02 2.667e+02 3.102e+02 3.739e+02 5.797e+02, threshold=6.204e+02, percent-clipped=0.0 2023-09-28 15:34:02,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:34:02,557 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=62080.0, ans=0.125 2023-09-28 15:34:05,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:34:05,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:34:06,920 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:34:06,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-28 15:34:08,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-28 15:34:08,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-28 15:34:08,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:34:08,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-28 15:34:10,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:34:15,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:34:15,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:34:15,694 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:34:15,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:34:15,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-28 15:34:17,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-28 15:34:19,448 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-28 15:34:20,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:34:20,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:34:24,109 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-28 15:34:24,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 15:34:24,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:34:33,312 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-28 15:34:33,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:34:35,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:34:37,105 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-28 15:34:38,797 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:34:38,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-28 15:34:38,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:34:39,166 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=62280.0, ans=0.125 2023-09-28 15:34:41,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:34:41,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:34:43,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:34:45,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-28 15:34:45,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:34:47,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-28 15:34:47,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:34:47,726 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.91 vs. limit=15.0 2023-09-28 15:34:50,122 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-28 15:34:53,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 15:34:57,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 15:34:59,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:34:59,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:35:01,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:35:01,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:35:01,728 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=62346.666666666664, ans=0.1 2023-09-28 15:35:07,481 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:35:11,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-28 15:35:11,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-28 15:35:12,609 INFO [train.py:1039] (2/4) Epoch 2, batch 4050, loss[loss=0.3272, simple_loss=0.3661, pruned_loss=0.1441, over 23831.00 frames. ], tot_loss[loss=0.3101, simple_loss=0.3502, pruned_loss=0.135, over 4687095.74 frames. ], batch size: 179, lr: 3.49e-02, grad_scale: 32.0 2023-09-28 15:35:14,177 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 15:35:14,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:35:15,723 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:35:17,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-28 15:35:17,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:35:22,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:35:24,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:35:25,628 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 15:35:27,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:35:27,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:35:32,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:35:34,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-28 15:35:37,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 15:35:38,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-28 15:35:38,856 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-28 15:35:41,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:35:48,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-28 15:35:50,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:35:53,397 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.13 vs. limit=6.0 2023-09-28 15:35:54,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:35:57,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:35:57,437 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:35:57,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:36:00,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:36:03,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-28 15:36:03,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 15:36:05,916 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=62613.333333333336, ans=0.125 2023-09-28 15:36:07,144 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:36:09,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-28 15:36:13,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:36:16,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=62613.333333333336, ans=0.07 2023-09-28 15:36:17,966 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=62680.0, ans=0.0 2023-09-28 15:36:24,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-28 15:36:25,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:36:25,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 15:36:25,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-28 15:36:25,970 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-28 15:36:25,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:36:26,218 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 15:36:26,654 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.26 vs. limit=10.0 2023-09-28 15:36:29,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:36:31,130 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:36:31,157 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:36:35,787 INFO [train.py:1039] (2/4) Epoch 2, batch 4100, loss[loss=0.2952, simple_loss=0.3463, pruned_loss=0.122, over 24411.00 frames. ], tot_loss[loss=0.3116, simple_loss=0.3513, pruned_loss=0.1359, over 4680664.35 frames. ], batch size: 63, lr: 3.48e-02, grad_scale: 16.0 2023-09-28 15:36:39,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-28 15:36:39,169 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-28 15:36:40,961 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=62746.666666666664, ans=0.0 2023-09-28 15:36:42,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-28 15:36:44,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-28 15:36:44,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:36:44,383 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:36:44,423 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:36:44,443 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:36:45,980 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-28 15:36:47,350 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.099e+02 2.677e+02 3.262e+02 4.112e+02 6.784e+02, threshold=6.525e+02, percent-clipped=2.0 2023-09-28 15:36:49,101 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:36:49,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:36:49,257 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:36:51,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 15:36:56,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 15:36:58,083 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:36:58,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:36:58,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-28 15:36:59,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:36:59,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:36:59,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:37:01,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:37:01,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-28 15:37:06,200 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:37:07,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-28 15:37:07,840 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:37:08,104 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 15:37:12,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:37:12,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-28 15:37:13,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:37:13,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:37:13,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-28 15:37:15,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-28 15:37:19,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-28 15:37:19,225 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:37:22,134 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-28 15:37:23,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:37:23,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:37:27,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:37:27,722 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=62946.666666666664, ans=0.0 2023-09-28 15:37:29,321 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=62946.666666666664, ans=0.1 2023-09-28 15:37:32,153 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:37:35,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:37:35,884 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:37:41,195 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=63013.333333333336, ans=0.125 2023-09-28 15:37:46,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:37:46,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:37:47,749 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.86 vs. limit=22.5 2023-09-28 15:37:48,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:37:51,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:37:55,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=63013.333333333336, ans=0.125 2023-09-28 15:37:58,058 INFO [train.py:1039] (2/4) Epoch 2, batch 4150, loss[loss=0.316, simple_loss=0.3426, pruned_loss=0.1447, over 23872.00 frames. ], tot_loss[loss=0.3108, simple_loss=0.3505, pruned_loss=0.1355, over 4688886.14 frames. ], batch size: 195, lr: 3.48e-02, grad_scale: 16.0 2023-09-28 15:37:58,108 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:37:58,260 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 15:37:59,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:37:59,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:38:03,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-28 15:38:03,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:38:03,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-28 15:38:05,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-28 15:38:05,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-28 15:38:06,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:38:11,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:38:11,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:38:12,179 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=63080.0, ans=0.2 2023-09-28 15:38:15,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:38:17,201 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:38:18,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-28 15:38:20,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 15:38:20,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:38:21,752 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-28 15:38:22,016 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=63146.666666666664, ans=0.0 2023-09-28 15:38:25,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:38:30,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:38:31,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-28 15:38:34,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-28 15:38:34,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:38:36,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-28 15:38:36,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:38:36,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:38:39,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:38:39,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:38:45,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-28 15:38:49,006 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-28 15:38:50,054 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.42 vs. limit=15.0 2023-09-28 15:38:51,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:38:52,053 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-28 15:38:52,331 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=63280.0, ans=0.1 2023-09-28 15:38:53,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:38:55,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-28 15:38:55,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:38:58,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:38:59,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:38:59,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-28 15:38:59,729 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:38:59,733 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-28 15:39:03,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 15:39:06,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-28 15:39:07,469 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:39:07,475 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:39:07,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 15:39:07,629 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-28 15:39:09,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:39:09,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 15:39:10,542 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:39:12,964 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=63346.666666666664, ans=0.125 2023-09-28 15:39:14,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:39:14,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-28 15:39:14,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-28 15:39:14,658 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=63346.666666666664, ans=0.0 2023-09-28 15:39:19,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:39:21,070 INFO [train.py:1039] (2/4) Epoch 2, batch 4200, loss[loss=0.3176, simple_loss=0.3528, pruned_loss=0.1412, over 23288.00 frames. ], tot_loss[loss=0.3083, simple_loss=0.3495, pruned_loss=0.1336, over 4702412.17 frames. ], batch size: 105, lr: 3.47e-02, grad_scale: 16.0 2023-09-28 15:39:21,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-28 15:39:24,694 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 15:39:25,119 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=63413.333333333336, ans=0.125 2023-09-28 15:39:26,243 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:39:27,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 15:39:27,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:39:27,782 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:39:30,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-28 15:39:32,118 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.187e+02 2.868e+02 3.365e+02 4.143e+02 5.998e+02, threshold=6.730e+02, percent-clipped=0.0 2023-09-28 15:39:33,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-28 15:39:33,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:39:37,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:39:39,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:39:42,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-28 15:39:46,028 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:39:46,067 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:39:46,190 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=63480.0, ans=0.0 2023-09-28 15:39:47,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-28 15:39:47,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:39:49,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:39:49,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:39:49,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 15:39:52,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 15:39:55,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-28 15:39:55,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:39:59,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-28 15:39:59,781 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=63546.666666666664, ans=0.125 2023-09-28 15:40:01,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:40:02,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:40:04,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:40:07,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:40:07,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-28 15:40:07,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:40:08,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:40:14,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-28 15:40:17,154 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:40:22,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:40:25,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-28 15:40:29,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:40:34,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 15:40:34,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:40:37,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-28 15:40:40,161 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.79 vs. limit=15.0 2023-09-28 15:40:42,432 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-28 15:40:43,866 INFO [train.py:1039] (2/4) Epoch 2, batch 4250, loss[loss=0.2711, simple_loss=0.3225, pruned_loss=0.1098, over 24596.00 frames. ], tot_loss[loss=0.3069, simple_loss=0.3483, pruned_loss=0.1328, over 4703942.81 frames. ], batch size: 60, lr: 3.47e-02, grad_scale: 16.0 2023-09-28 15:40:45,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:40:45,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-28 15:40:47,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:40:54,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-28 15:40:55,073 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-28 15:40:55,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:40:58,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:40:59,054 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=63746.666666666664, ans=0.125 2023-09-28 15:41:01,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:41:05,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:41:07,046 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:41:08,632 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:41:08,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:41:10,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:41:10,398 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:41:11,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:41:13,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:41:15,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:41:16,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-28 15:41:20,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-28 15:41:21,643 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.59 vs. limit=8.0 2023-09-28 15:41:21,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:41:22,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:41:22,068 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:41:23,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:41:23,666 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:41:23,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:41:28,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-28 15:41:29,580 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.03 vs. limit=15.0 2023-09-28 15:41:30,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:41:34,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:41:35,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:41:35,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-28 15:41:35,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:41:37,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-28 15:41:38,984 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:41:40,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-28 15:41:40,777 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=63946.666666666664, ans=0.0 2023-09-28 15:41:43,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:41:43,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:41:45,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-28 15:41:45,281 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=63946.666666666664, ans=0.125 2023-09-28 15:41:46,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 15:41:48,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-28 15:41:51,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:41:55,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:41:56,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:41:59,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:41:59,425 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=64013.333333333336, ans=0.1 2023-09-28 15:42:00,632 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:42:02,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:42:02,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:42:02,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-28 15:42:05,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:42:08,534 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten.whitening_limit, batch_count=64080.0, ans=15.0 2023-09-28 15:42:09,184 INFO [train.py:1039] (2/4) Epoch 2, batch 4300, loss[loss=0.3203, simple_loss=0.3522, pruned_loss=0.1442, over 23510.00 frames. ], tot_loss[loss=0.3063, simple_loss=0.3476, pruned_loss=0.1325, over 4683043.03 frames. ], batch size: 134, lr: 3.46e-02, grad_scale: 16.0 2023-09-28 15:42:12,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:42:12,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:42:15,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:42:17,294 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=64080.0, ans=0.0 2023-09-28 15:42:19,710 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.044e+02 2.736e+02 3.234e+02 3.981e+02 6.423e+02, threshold=6.467e+02, percent-clipped=0.0 2023-09-28 15:42:23,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:42:23,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-28 15:42:24,690 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:42:26,225 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-28 15:42:26,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:42:26,282 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-28 15:42:29,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 15:42:32,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 15:42:36,681 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-28 15:42:36,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:42:36,771 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-28 15:42:40,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 15:42:42,625 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:42:45,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:42:45,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:42:47,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 15:42:47,397 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=64213.333333333336, ans=0.1 2023-09-28 15:42:48,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:42:48,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:42:48,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-28 15:42:50,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-28 15:42:53,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:42:56,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:42:56,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 15:42:56,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:42:56,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:42:56,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-28 15:42:56,402 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-28 15:42:56,645 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=64280.0, ans=0.0 2023-09-28 15:42:57,897 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-28 15:42:59,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:43:00,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-28 15:43:00,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-28 15:43:06,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:43:08,192 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-28 15:43:10,436 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:43:12,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:43:12,050 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:43:14,275 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=13.21 vs. limit=15.0 2023-09-28 15:43:15,518 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-28 15:43:17,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 15:43:17,610 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:43:17,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:43:19,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:43:19,111 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:43:21,001 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=64346.666666666664, ans=0.2 2023-09-28 15:43:22,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:43:25,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:43:26,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:43:26,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:43:27,205 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.59 vs. limit=15.0 2023-09-28 15:43:29,617 INFO [train.py:1039] (2/4) Epoch 2, batch 4350, loss[loss=0.2821, simple_loss=0.3258, pruned_loss=0.1192, over 23719.00 frames. ], tot_loss[loss=0.3067, simple_loss=0.3482, pruned_loss=0.1326, over 4691779.47 frames. ], batch size: 149, lr: 3.46e-02, grad_scale: 16.0 2023-09-28 15:43:32,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-28 15:43:33,550 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=13.04 vs. limit=15.0 2023-09-28 15:43:34,265 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-28 15:43:36,264 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=64413.333333333336, ans=0.125 2023-09-28 15:43:38,942 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:43:40,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:43:44,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:43:44,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:43:49,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 15:43:53,090 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:43:55,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:43:55,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:43:59,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-28 15:44:02,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:44:04,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-28 15:44:06,196 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=64546.666666666664, ans=0.0 2023-09-28 15:44:09,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-28 15:44:11,467 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.05 vs. limit=15.0 2023-09-28 15:44:11,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:44:12,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:44:16,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:44:20,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-28 15:44:22,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:44:24,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 15:44:31,242 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-28 15:44:32,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:44:32,782 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:44:32,878 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-28 15:44:34,347 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-28 15:44:34,367 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:44:34,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:44:35,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:44:37,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:44:37,479 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:44:38,913 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:44:40,436 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-28 15:44:40,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:44:40,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:44:40,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:44:42,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-28 15:44:42,308 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-28 15:44:42,316 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-28 15:44:42,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-28 15:44:46,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:44:46,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 15:44:46,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:44:48,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:44:50,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-28 15:44:51,882 INFO [train.py:1039] (2/4) Epoch 2, batch 4400, loss[loss=0.3276, simple_loss=0.3566, pruned_loss=0.1493, over 23867.00 frames. ], tot_loss[loss=0.307, simple_loss=0.3487, pruned_loss=0.1326, over 4698577.45 frames. ], batch size: 212, lr: 3.45e-02, grad_scale: 32.0 2023-09-28 15:44:52,093 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-28 15:44:52,107 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:44:56,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:44:56,583 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:44:58,182 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:45:00,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-28 15:45:02,299 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-28 15:45:02,358 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-28 15:45:02,415 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-28 15:45:03,804 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.169e+02 2.849e+02 3.157e+02 3.871e+02 7.582e+02, threshold=6.315e+02, percent-clipped=2.0 2023-09-28 15:45:03,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 15:45:03,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:45:05,628 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-28 15:45:05,908 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=64746.666666666664, ans=0.125 2023-09-28 15:45:08,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:45:10,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:45:10,156 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-28 15:45:13,242 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:45:13,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-28 15:45:13,303 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-28 15:45:15,440 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.50 vs. limit=15.0 2023-09-28 15:45:17,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-28 15:45:17,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-28 15:45:17,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-28 15:45:19,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:45:19,451 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:45:20,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:45:20,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:45:22,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-28 15:45:22,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-28 15:45:22,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:45:26,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:45:26,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:45:27,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:45:28,182 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=64880.0, ans=0.125 2023-09-28 15:45:29,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:45:29,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-28 15:45:29,460 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-28 15:45:33,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:45:40,309 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:45:43,305 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-28 15:45:48,088 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 15:45:51,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:45:54,123 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 15:45:54,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-28 15:45:54,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:45:54,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-28 15:45:54,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:45:55,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-28 15:46:00,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-28 15:46:04,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-28 15:46:05,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-28 15:46:05,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:46:05,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-28 15:46:08,004 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:46:12,879 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:46:14,522 INFO [train.py:1039] (2/4) Epoch 2, batch 4450, loss[loss=0.3383, simple_loss=0.3839, pruned_loss=0.1463, over 24329.00 frames. ], tot_loss[loss=0.3084, simple_loss=0.35, pruned_loss=0.1334, over 4696545.46 frames. ], batch size: 77, lr: 3.44e-02, grad_scale: 32.0 2023-09-28 15:46:14,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-28 15:46:17,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:46:20,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:46:22,113 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:46:24,602 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=17.26 vs. limit=22.5 2023-09-28 15:46:29,888 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:46:29,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:46:34,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:46:36,981 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=65146.666666666664, ans=15.0 2023-09-28 15:46:37,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:46:38,052 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=65146.666666666664, ans=0.0 2023-09-28 15:46:39,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:46:41,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:46:42,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-28 15:46:42,859 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:46:42,961 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:46:43,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:46:43,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-28 15:46:43,223 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=65146.666666666664, ans=0.0 2023-09-28 15:46:47,232 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 15:46:50,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:46:51,924 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:46:52,310 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=65213.333333333336, ans=0.09899494936611666 2023-09-28 15:46:53,480 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:46:53,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:46:55,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:46:58,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 15:46:59,744 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-28 15:46:59,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-28 15:46:59,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:47:01,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:47:02,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-28 15:47:05,264 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.35 vs. limit=10.0 2023-09-28 15:47:07,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-28 15:47:09,100 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:47:11,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-28 15:47:11,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:47:11,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:47:11,144 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:47:11,155 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:47:14,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:47:17,024 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=65280.0, ans=0.125 2023-09-28 15:47:17,243 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.48 vs. limit=15.0 2023-09-28 15:47:20,042 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-28 15:47:21,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-28 15:47:23,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 15:47:26,480 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:47:26,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:47:28,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:47:28,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 15:47:31,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-28 15:47:35,422 INFO [train.py:1039] (2/4) Epoch 2, batch 4500, loss[loss=0.2549, simple_loss=0.3027, pruned_loss=0.1035, over 21187.00 frames. ], tot_loss[loss=0.3068, simple_loss=0.3495, pruned_loss=0.132, over 4704341.93 frames. ], batch size: 46, lr: 3.44e-02, grad_scale: 32.0 2023-09-28 15:47:35,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-28 15:47:35,882 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=65413.333333333336, ans=0.0 2023-09-28 15:47:37,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 15:47:41,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:47:43,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-28 15:47:43,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-28 15:47:44,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:47:46,456 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.128e+02 2.918e+02 3.364e+02 4.065e+02 7.320e+02, threshold=6.729e+02, percent-clipped=3.0 2023-09-28 15:47:50,247 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:47:50,330 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:47:51,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 15:47:53,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:47:53,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:47:53,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:48:06,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:48:06,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:48:09,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:48:11,391 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:48:11,523 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 15:48:16,197 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 15:48:22,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-28 15:48:26,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 15:48:28,497 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:48:29,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-28 15:48:29,985 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:48:31,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:48:33,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:48:33,643 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:48:35,901 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.04 vs. limit=15.0 2023-09-28 15:48:36,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:48:36,600 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-28 15:48:36,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 15:48:36,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:48:42,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:48:42,588 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 15:48:44,316 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:48:45,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-28 15:48:46,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:48:47,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-28 15:48:49,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-28 15:48:51,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-28 15:48:56,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-28 15:48:57,867 INFO [train.py:1039] (2/4) Epoch 2, batch 4550, loss[loss=0.272, simple_loss=0.3206, pruned_loss=0.1117, over 24306.00 frames. ], tot_loss[loss=0.3065, simple_loss=0.3486, pruned_loss=0.1322, over 4699888.67 frames. ], batch size: 56, lr: 3.43e-02, grad_scale: 32.0 2023-09-28 15:48:57,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-28 15:48:58,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:49:01,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:49:01,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:49:04,188 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=65746.66666666667, ans=0.125 2023-09-28 15:49:06,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:49:11,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:49:13,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:49:13,327 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=65813.33333333333, ans=0.0 2023-09-28 15:49:16,028 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:49:16,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:49:16,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:49:19,150 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:49:19,225 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:49:23,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:49:26,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-28 15:49:27,556 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-28 15:49:27,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:49:29,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-28 15:49:33,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-28 15:49:34,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:49:37,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-28 15:49:38,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 15:49:41,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:49:42,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:49:42,056 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:49:45,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-28 15:49:47,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:49:49,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:49:49,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:49:51,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:49:52,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-28 15:49:52,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-28 15:49:54,200 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:49:54,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-28 15:49:56,613 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-28 15:49:58,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:49:58,619 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:49:58,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:50:00,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:50:00,453 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=65946.66666666667, ans=0.2 2023-09-28 15:50:01,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:50:01,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 15:50:03,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-28 15:50:06,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:50:06,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 15:50:06,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-28 15:50:06,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:50:06,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-28 15:50:10,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:50:11,355 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:50:13,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:50:13,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:50:13,819 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.87 vs. limit=15.0 2023-09-28 15:50:14,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-28 15:50:17,484 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:50:17,736 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=66080.0, ans=0.1 2023-09-28 15:50:18,886 INFO [train.py:1039] (2/4) Epoch 2, batch 4600, loss[loss=0.2985, simple_loss=0.3464, pruned_loss=0.1253, over 23421.00 frames. ], tot_loss[loss=0.3051, simple_loss=0.3475, pruned_loss=0.1313, over 4704793.83 frames. ], batch size: 93, lr: 3.43e-02, grad_scale: 32.0 2023-09-28 15:50:19,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-28 15:50:20,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:50:22,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:50:24,193 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=66080.0, ans=0.2 2023-09-28 15:50:25,398 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:50:25,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 15:50:26,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:50:28,402 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-28 15:50:29,709 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.215e+02 2.622e+02 3.070e+02 3.813e+02 6.355e+02, threshold=6.141e+02, percent-clipped=0.0 2023-09-28 15:50:29,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:50:33,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:50:33,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:50:35,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:50:43,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-28 15:50:45,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:50:48,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:50:51,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:50:51,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:50:57,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-28 15:50:57,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 15:50:57,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:50:57,995 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=66213.33333333333, ans=0.1 2023-09-28 15:51:01,722 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=66213.33333333333, ans=0.2 2023-09-28 15:51:04,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:51:05,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-28 15:51:06,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:51:11,185 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-28 15:51:12,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-28 15:51:16,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:51:16,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:51:18,814 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=66280.0, ans=0.0 2023-09-28 15:51:20,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:51:20,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 15:51:20,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:51:22,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-28 15:51:22,067 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:51:23,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:51:23,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=66346.66666666667, ans=0.125 2023-09-28 15:51:24,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:51:26,305 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:51:27,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:51:27,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-28 15:51:29,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-28 15:51:29,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-28 15:51:29,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:51:32,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:51:32,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:51:33,433 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.64 vs. limit=15.0 2023-09-28 15:51:33,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:51:34,437 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 15:51:41,367 INFO [train.py:1039] (2/4) Epoch 2, batch 4650, loss[loss=0.2991, simple_loss=0.3552, pruned_loss=0.1214, over 24371.00 frames. ], tot_loss[loss=0.3041, simple_loss=0.3468, pruned_loss=0.1307, over 4705143.07 frames. ], batch size: 77, lr: 3.42e-02, grad_scale: 32.0 2023-09-28 15:51:44,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:51:47,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:51:49,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:51:49,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:51:50,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:51:50,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:51:50,941 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:51:56,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-28 15:51:58,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:51:59,883 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-28 15:51:59,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:52:01,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-28 15:52:01,497 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:52:02,909 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-28 15:52:02,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-28 15:52:02,965 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:52:04,438 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:52:04,665 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=66480.0, ans=0.2 2023-09-28 15:52:07,466 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 15:52:08,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:52:08,992 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-28 15:52:09,302 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=66480.0, ans=0.0 2023-09-28 15:52:14,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:52:16,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-28 15:52:18,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:52:18,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:52:19,640 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-28 15:52:21,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:52:23,522 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=23.99 vs. limit=22.5 2023-09-28 15:52:24,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 15:52:25,088 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.79 vs. limit=22.5 2023-09-28 15:52:29,465 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:52:31,420 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=66613.33333333333, ans=0.2 2023-09-28 15:52:33,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:52:36,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:52:36,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:52:36,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 15:52:39,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-28 15:52:39,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-28 15:52:39,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 15:52:39,848 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-28 15:52:39,965 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=66613.33333333333, ans=0.0 2023-09-28 15:52:42,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:52:48,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:52:48,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:52:48,852 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-28 15:52:50,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:52:51,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:52:51,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:52:53,521 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:52:55,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:52:55,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:52:55,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:53:01,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:53:01,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:53:01,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 15:53:03,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-28 15:53:03,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:53:05,186 INFO [train.py:1039] (2/4) Epoch 2, batch 4700, loss[loss=0.337, simple_loss=0.3727, pruned_loss=0.1507, over 23150.00 frames. ], tot_loss[loss=0.3067, simple_loss=0.3484, pruned_loss=0.1325, over 4697409.95 frames. ], batch size: 105, lr: 3.42e-02, grad_scale: 32.0 2023-09-28 15:53:06,766 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-28 15:53:07,675 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.05 vs. limit=15.0 2023-09-28 15:53:14,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:53:15,872 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.037e+02 2.802e+02 3.291e+02 3.873e+02 6.346e+02, threshold=6.582e+02, percent-clipped=1.0 2023-09-28 15:53:15,980 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:53:16,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:53:18,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:53:19,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 15:53:26,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-28 15:53:26,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-28 15:53:29,354 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:53:30,772 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:53:30,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:53:34,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:53:40,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:53:41,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 15:53:44,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:53:51,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-28 15:53:54,068 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:53:55,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:53:56,010 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=66946.66666666667, ans=0.125 2023-09-28 15:53:59,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-28 15:54:00,951 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:54:04,665 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.77 vs. limit=15.0 2023-09-28 15:54:05,533 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:54:06,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-28 15:54:07,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:54:09,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:54:10,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:54:10,880 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 15:54:10,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-28 15:54:12,390 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-28 15:54:13,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:54:14,114 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=67013.33333333333, ans=0.125 2023-09-28 15:54:17,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:54:17,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:54:17,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-28 15:54:18,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:54:22,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-28 15:54:23,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:54:24,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:54:27,487 INFO [train.py:1039] (2/4) Epoch 2, batch 4750, loss[loss=0.314, simple_loss=0.3625, pruned_loss=0.1328, over 24153.00 frames. ], tot_loss[loss=0.3062, simple_loss=0.3487, pruned_loss=0.1318, over 4702052.22 frames. ], batch size: 80, lr: 3.41e-02, grad_scale: 32.0 2023-09-28 15:54:31,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:54:31,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:54:32,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-28 15:54:34,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:54:38,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-28 15:54:40,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:54:40,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:54:41,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:54:47,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-28 15:54:51,777 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-28 15:54:51,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-28 15:54:54,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:54:55,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:54:55,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:54:57,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:54:59,145 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-28 15:54:59,161 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-28 15:55:01,005 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=67213.33333333333, ans=0.125 2023-09-28 15:55:04,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-28 15:55:05,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:55:07,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:55:09,304 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=67213.33333333333, ans=0.125 2023-09-28 15:55:10,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:55:10,673 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-28 15:55:10,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:55:12,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:55:17,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:55:18,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-28 15:55:18,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-28 15:55:20,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:55:20,460 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:55:20,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:55:23,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 15:55:23,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-28 15:55:26,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-28 15:55:27,351 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=67280.0, ans=0.1 2023-09-28 15:55:28,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:55:28,758 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=67280.0, ans=0.1 2023-09-28 15:55:33,105 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:55:33,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-28 15:55:33,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:55:33,420 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=67346.66666666667, ans=0.125 2023-09-28 15:55:35,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:55:36,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-28 15:55:38,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:55:38,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:55:43,344 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:55:43,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-28 15:55:44,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-28 15:55:46,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-28 15:55:48,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-28 15:55:48,123 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:55:49,538 INFO [train.py:1039] (2/4) Epoch 2, batch 4800, loss[loss=0.296, simple_loss=0.3405, pruned_loss=0.1257, over 24382.00 frames. ], tot_loss[loss=0.3055, simple_loss=0.3479, pruned_loss=0.1315, over 4710577.29 frames. ], batch size: 61, lr: 3.41e-02, grad_scale: 32.0 2023-09-28 15:55:51,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-28 15:55:56,328 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:55:57,760 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:55:59,404 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=67413.33333333333, ans=0.025 2023-09-28 15:56:01,840 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.159e+02 2.813e+02 3.481e+02 4.018e+02 6.093e+02, threshold=6.961e+02, percent-clipped=0.0 2023-09-28 15:56:03,547 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:56:05,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:56:05,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:56:07,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-28 15:56:08,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:56:08,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:56:09,205 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten.whitening_limit, batch_count=67480.0, ans=15.0 2023-09-28 15:56:11,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:56:15,432 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:56:15,577 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=67480.0, ans=0.125 2023-09-28 15:56:18,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:56:18,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:56:20,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:56:20,139 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 15:56:20,160 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:56:21,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:56:24,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:56:27,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:56:29,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:56:29,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-28 15:56:32,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 15:56:33,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:56:33,337 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=67546.66666666667, ans=0.125 2023-09-28 15:56:34,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-28 15:56:36,103 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-28 15:56:37,687 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:56:37,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:56:37,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:56:37,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:56:37,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:56:39,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 15:56:41,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:56:46,309 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:56:48,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:56:50,253 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:56:55,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-28 15:56:55,757 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:56:57,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:56:57,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:56:58,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:57:02,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:57:04,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 15:57:04,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:57:04,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:57:05,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:57:05,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:57:10,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:57:10,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:57:10,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:57:11,582 INFO [train.py:1039] (2/4) Epoch 2, batch 4850, loss[loss=0.2921, simple_loss=0.3438, pruned_loss=0.1202, over 24359.00 frames. ], tot_loss[loss=0.3051, simple_loss=0.3477, pruned_loss=0.1312, over 4715734.71 frames. ], batch size: 61, lr: 3.40e-02, grad_scale: 32.0 2023-09-28 15:57:11,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-28 15:57:14,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-28 15:57:14,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:57:14,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:57:15,107 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:57:15,109 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:57:18,149 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:57:24,119 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=67746.66666666667, ans=0.125 2023-09-28 15:57:26,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-28 15:57:27,676 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:57:32,345 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=67813.33333333333, ans=0.2 2023-09-28 15:57:33,683 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:57:33,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 15:57:33,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:57:40,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:57:40,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:57:41,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:57:41,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-28 15:57:46,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:57:46,564 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=67880.0, ans=0.0 2023-09-28 15:57:47,875 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:57:47,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 15:57:49,459 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:57:49,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-28 15:57:52,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:57:52,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:57:56,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:57:56,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-28 15:57:56,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-28 15:57:57,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 15:58:00,526 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=67946.66666666667, ans=0.125 2023-09-28 15:58:02,073 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=67946.66666666667, ans=0.2 2023-09-28 15:58:06,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:58:06,500 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-28 15:58:08,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:58:08,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:58:11,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-28 15:58:14,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-28 15:58:14,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:58:16,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-28 15:58:16,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:58:16,651 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:58:18,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-28 15:58:25,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:58:30,953 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 15:58:30,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:58:34,390 INFO [train.py:1039] (2/4) Epoch 2, batch 4900, loss[loss=0.2916, simple_loss=0.3284, pruned_loss=0.1274, over 23715.00 frames. ], tot_loss[loss=0.3042, simple_loss=0.3461, pruned_loss=0.1312, over 4722673.71 frames. ], batch size: 149, lr: 3.39e-02, grad_scale: 32.0 2023-09-28 15:58:37,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-28 15:58:37,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:58:42,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:58:42,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:58:43,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:58:46,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-28 15:58:47,477 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.052e+02 2.694e+02 3.057e+02 3.718e+02 7.972e+02, threshold=6.114e+02, percent-clipped=1.0 2023-09-28 15:58:50,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-28 15:58:53,329 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.83 vs. limit=15.0 2023-09-28 15:58:54,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-28 15:58:55,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-28 15:58:57,053 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-28 15:58:57,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:58:57,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:58:57,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:58:57,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:58:59,231 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-28 15:59:03,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-28 15:59:03,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 15:59:05,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-28 15:59:06,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-28 15:59:09,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:59:09,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:59:10,820 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:59:10,834 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-28 15:59:11,808 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.37 vs. limit=15.0 2023-09-28 15:59:12,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:59:14,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:59:14,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-28 15:59:14,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-28 15:59:17,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-28 15:59:19,322 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 15:59:20,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:59:21,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:59:22,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 15:59:23,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:59:23,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 15:59:25,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:59:25,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-28 15:59:26,867 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:59:28,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-28 15:59:31,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:59:35,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-28 15:59:35,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:59:37,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-28 15:59:38,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-28 15:59:40,523 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=68346.66666666667, ans=0.125 2023-09-28 15:59:44,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:59:46,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 15:59:46,797 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=68346.66666666667, ans=0.1 2023-09-28 15:59:47,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-28 15:59:47,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 15:59:47,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:59:50,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:59:53,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=68346.66666666667, ans=0.0 2023-09-28 15:59:54,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:59:54,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:59:54,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:59:54,682 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-28 15:59:55,941 INFO [train.py:1039] (2/4) Epoch 2, batch 4950, loss[loss=0.2867, simple_loss=0.3439, pruned_loss=0.1147, over 24534.00 frames. ], tot_loss[loss=0.3027, simple_loss=0.3442, pruned_loss=0.1306, over 4707123.92 frames. ], batch size: 71, lr: 3.39e-02, grad_scale: 32.0 2023-09-28 15:59:57,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 16:00:00,536 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:00:00,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 16:00:03,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-28 16:00:03,785 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-28 16:00:03,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-28 16:00:05,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-28 16:00:05,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:00:05,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:00:05,468 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=68413.33333333333, ans=0.1 2023-09-28 16:00:05,473 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=68413.33333333333, ans=0.1 2023-09-28 16:00:06,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:00:08,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:00:09,750 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:00:11,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:00:13,526 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:00:15,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:00:16,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:00:16,854 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=68480.0, ans=0.125 2023-09-28 16:00:18,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:00:19,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 16:00:25,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:00:26,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 16:00:27,312 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=68546.66666666667, ans=0.1 2023-09-28 16:00:28,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:00:29,984 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:00:31,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:00:31,726 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-28 16:00:33,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-28 16:00:36,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:00:37,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:00:37,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-28 16:00:37,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:00:38,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:00:39,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-28 16:00:41,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:00:45,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-28 16:00:47,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:00:48,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:00:50,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:00:50,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-28 16:00:50,738 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=68613.33333333333, ans=0.125 2023-09-28 16:00:51,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:00:53,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 16:00:58,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:01:00,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:01:00,313 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=68680.0, ans=0.1 2023-09-28 16:01:01,364 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:01:01,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:01:01,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 16:01:01,804 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=68680.0, ans=0.1 2023-09-28 16:01:03,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:01:04,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:01:05,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:01:06,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:01:07,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-28 16:01:10,714 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:01:15,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-28 16:01:15,358 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-28 16:01:15,615 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=68746.66666666667, ans=0.125 2023-09-28 16:01:15,936 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.85 vs. limit=15.0 2023-09-28 16:01:16,658 INFO [train.py:1039] (2/4) Epoch 2, batch 5000, loss[loss=0.3145, simple_loss=0.3637, pruned_loss=0.1327, over 24064.00 frames. ], tot_loss[loss=0.3025, simple_loss=0.3439, pruned_loss=0.1305, over 4706569.64 frames. ], batch size: 80, lr: 3.38e-02, grad_scale: 32.0 2023-09-28 16:01:21,418 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=68746.66666666667, ans=0.1 2023-09-28 16:01:22,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:01:22,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:01:24,216 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-28 16:01:26,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-28 16:01:27,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:01:28,141 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=68746.66666666667, ans=0.125 2023-09-28 16:01:29,714 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.max_abs, batch_count=68746.66666666667, ans=10.0 2023-09-28 16:01:30,659 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.879e+02 2.855e+02 3.346e+02 4.050e+02 6.399e+02, threshold=6.691e+02, percent-clipped=1.0 2023-09-28 16:01:30,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-28 16:01:30,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:01:31,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 16:01:33,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-28 16:01:33,897 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:01:34,005 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:01:35,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-28 16:01:35,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:01:37,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:01:37,397 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=68813.33333333333, ans=0.125 2023-09-28 16:01:38,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-28 16:01:38,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-28 16:01:38,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:01:40,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-28 16:01:40,215 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 16:01:40,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:01:40,629 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=68813.33333333333, ans=0.125 2023-09-28 16:01:41,797 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 16:01:41,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-28 16:01:41,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-28 16:01:44,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-28 16:01:44,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:01:45,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:01:46,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-28 16:01:48,013 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:01:51,051 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:01:52,480 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:01:54,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-28 16:01:55,234 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=68880.0, ans=0.0 2023-09-28 16:01:56,874 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-28 16:01:58,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:01:58,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:02:03,463 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-28 16:02:08,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:02:10,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:02:10,380 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:02:10,681 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=68946.66666666667, ans=0.0 2023-09-28 16:02:13,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-28 16:02:13,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:02:13,616 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:02:15,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:02:16,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-28 16:02:17,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:02:19,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:02:21,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:02:27,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-28 16:02:31,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:02:39,040 INFO [train.py:1039] (2/4) Epoch 2, batch 5050, loss[loss=0.3604, simple_loss=0.3769, pruned_loss=0.1719, over 23432.00 frames. ], tot_loss[loss=0.3017, simple_loss=0.3444, pruned_loss=0.1295, over 4720351.74 frames. ], batch size: 285, lr: 3.38e-02, grad_scale: 32.0 2023-09-28 16:02:41,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:02:41,490 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:02:42,820 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:02:42,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:02:42,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 16:02:42,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:02:43,023 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:02:47,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:02:47,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-28 16:02:47,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:02:49,431 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=69080.0, ans=0.2 2023-09-28 16:02:51,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:02:53,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-28 16:02:53,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-28 16:02:53,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:02:55,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:02:56,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 16:02:58,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 16:02:58,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-28 16:03:09,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-28 16:03:10,035 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-28 16:03:10,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:03:11,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-28 16:03:11,525 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:03:13,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:03:14,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:03:14,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:03:14,553 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-28 16:03:16,158 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-28 16:03:17,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:03:19,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:03:19,450 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=69213.33333333333, ans=0.0 2023-09-28 16:03:23,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:03:23,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-28 16:03:24,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:03:25,669 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=69213.33333333333, ans=0.0 2023-09-28 16:03:28,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-28 16:03:28,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:03:28,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:03:30,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:03:31,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:03:31,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:03:35,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:03:35,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:03:35,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:03:35,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:03:35,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-28 16:03:37,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:03:39,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:03:43,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:03:43,358 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-28 16:03:43,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-28 16:03:44,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:03:46,252 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:03:46,311 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-28 16:03:50,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:03:50,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-28 16:03:50,704 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:03:51,488 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.11 vs. limit=15.0 2023-09-28 16:03:53,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:03:55,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:03:55,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-28 16:03:56,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-28 16:03:57,299 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=69346.66666666667, ans=10.0 2023-09-28 16:04:00,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:04:00,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:04:00,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:04:01,410 INFO [train.py:1039] (2/4) Epoch 2, batch 5100, loss[loss=0.3008, simple_loss=0.3333, pruned_loss=0.1341, over 23345.00 frames. ], tot_loss[loss=0.3043, simple_loss=0.3461, pruned_loss=0.1312, over 4710617.53 frames. ], batch size: 119, lr: 3.37e-02, grad_scale: 32.0 2023-09-28 16:04:03,182 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-28 16:04:04,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:04:05,015 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=69413.33333333333, ans=0.125 2023-09-28 16:04:10,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-28 16:04:10,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-28 16:04:10,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:04:13,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:04:15,245 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.986e+02 2.824e+02 3.084e+02 3.697e+02 6.472e+02, threshold=6.168e+02, percent-clipped=0.0 2023-09-28 16:04:16,993 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.77 vs. limit=15.0 2023-09-28 16:04:17,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:04:17,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-28 16:04:17,586 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-28 16:04:24,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:04:24,257 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:04:27,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:04:32,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-28 16:04:32,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:04:32,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:04:32,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-28 16:04:35,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:04:36,883 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:04:36,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-28 16:04:39,807 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-28 16:04:39,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:04:41,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-28 16:04:41,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-28 16:04:46,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:04:53,761 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=69613.33333333333, ans=0.0 2023-09-28 16:04:55,476 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:04:55,773 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=69613.33333333333, ans=0.125 2023-09-28 16:04:59,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-28 16:04:59,771 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-28 16:04:59,793 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-28 16:05:01,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-28 16:05:01,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:05:04,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-28 16:05:07,738 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-28 16:05:10,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 16:05:12,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-28 16:05:15,147 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-28 16:05:17,295 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-28 16:05:18,737 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-28 16:05:23,791 INFO [train.py:1039] (2/4) Epoch 2, batch 5150, loss[loss=0.332, simple_loss=0.3551, pruned_loss=0.1544, over 23562.00 frames. ], tot_loss[loss=0.3048, simple_loss=0.3469, pruned_loss=0.1314, over 4717378.78 frames. ], batch size: 256, lr: 3.37e-02, grad_scale: 32.0 2023-09-28 16:05:25,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:05:25,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:05:25,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:05:26,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:05:26,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 16:05:27,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:05:30,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-28 16:05:30,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-28 16:05:30,347 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-28 16:05:30,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:05:30,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-28 16:05:31,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:05:33,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 16:05:35,485 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:05:35,693 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=69746.66666666667, ans=0.0 2023-09-28 16:05:37,008 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:05:39,544 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=13.01 vs. limit=15.0 2023-09-28 16:05:41,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 16:05:41,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-28 16:05:44,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:05:45,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:05:47,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-28 16:05:47,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:05:47,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:05:48,130 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=15.45 vs. limit=15.0 2023-09-28 16:05:48,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:05:48,865 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 16:05:48,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-28 16:05:50,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:05:52,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 16:05:53,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 16:05:54,116 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-28 16:05:55,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:06:02,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:06:03,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-28 16:06:05,329 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.01 vs. limit=15.0 2023-09-28 16:06:06,203 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:06:12,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:06:13,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:06:13,967 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=69946.66666666667, ans=0.125 2023-09-28 16:06:16,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:06:18,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:06:20,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-28 16:06:25,537 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:06:26,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:06:27,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 16:06:30,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:06:30,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:06:32,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-28 16:06:37,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:06:39,012 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 16:06:42,149 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:06:42,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:06:43,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-28 16:06:43,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-28 16:06:43,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:06:43,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:06:45,220 INFO [train.py:1039] (2/4) Epoch 2, batch 5200, loss[loss=0.288, simple_loss=0.3389, pruned_loss=0.1185, over 24462.00 frames. ], tot_loss[loss=0.304, simple_loss=0.3469, pruned_loss=0.1306, over 4731114.31 frames. ], batch size: 63, lr: 3.36e-02, grad_scale: 32.0 2023-09-28 16:06:47,336 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=70080.0, ans=0.125 2023-09-28 16:06:48,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:06:48,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:06:53,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:06:53,522 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=70080.0, ans=0.0 2023-09-28 16:06:55,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-28 16:06:57,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:06:58,450 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.056e+02 2.942e+02 3.378e+02 4.176e+02 6.037e+02, threshold=6.756e+02, percent-clipped=0.0 2023-09-28 16:06:58,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:07:00,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:07:02,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:07:02,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:07:04,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-28 16:07:07,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 16:07:08,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:07:12,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-28 16:07:14,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-28 16:07:14,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-28 16:07:16,281 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-28 16:07:17,683 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-28 16:07:20,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-28 16:07:22,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:07:22,254 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-28 16:07:22,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:07:23,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:07:23,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:07:25,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-28 16:07:25,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:07:27,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:07:32,209 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-28 16:07:32,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-28 16:07:32,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-28 16:07:35,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-28 16:07:37,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 16:07:41,419 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.01 vs. limit=10.0 2023-09-28 16:07:43,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:07:44,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:07:45,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-28 16:07:47,312 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:07:47,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-28 16:07:47,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:07:47,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 16:07:51,295 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.97 vs. limit=10.0 2023-09-28 16:07:52,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:07:53,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:07:55,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:07:58,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:07:58,504 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:08:03,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:08:04,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-28 16:08:06,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:08:06,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:08:08,684 INFO [train.py:1039] (2/4) Epoch 2, batch 5250, loss[loss=0.3019, simple_loss=0.3519, pruned_loss=0.1259, over 23405.00 frames. ], tot_loss[loss=0.3037, simple_loss=0.3461, pruned_loss=0.1306, over 4716466.34 frames. ], batch size: 93, lr: 3.36e-02, grad_scale: 32.0 2023-09-28 16:08:08,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:08:08,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-28 16:08:10,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:08:12,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:08:16,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:08:16,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:08:18,972 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:08:25,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:08:26,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 16:08:28,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:08:28,940 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=70480.0, ans=0.0 2023-09-28 16:08:30,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 16:08:33,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-28 16:08:33,258 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:08:34,725 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:08:37,673 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=70480.0, ans=0.125 2023-09-28 16:08:41,761 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=70546.66666666667, ans=0.0 2023-09-28 16:08:50,427 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=6.48 vs. limit=15.0 2023-09-28 16:09:21,767 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=70746.66666666667, ans=0.125 2023-09-28 16:09:22,858 INFO [train.py:1039] (2/4) Epoch 2, batch 5300, loss[loss=0.3091, simple_loss=0.3426, pruned_loss=0.1378, over 23396.00 frames. ], tot_loss[loss=0.3031, simple_loss=0.3453, pruned_loss=0.1305, over 4700201.88 frames. ], batch size: 105, lr: 3.35e-02, grad_scale: 32.0 2023-09-28 16:09:31,188 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.54 vs. limit=22.5 2023-09-28 16:09:34,308 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.804e+02 2.707e+02 3.072e+02 3.599e+02 7.324e+02, threshold=6.143e+02, percent-clipped=3.0 2023-09-28 16:09:36,798 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=70813.33333333333, ans=0.125 2023-09-28 16:09:37,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:09:37,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-28 16:09:37,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 16:09:37,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:09:38,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:09:38,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:09:38,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:09:38,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:09:38,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:09:38,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:09:38,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 16:09:39,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:09:39,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 16:09:39,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-28 16:09:39,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 16:09:39,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-28 16:09:39,867 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 16:09:39,998 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 16:09:40,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:09:41,114 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:09:41,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:09:41,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:09:41,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:09:41,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:09:41,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:09:42,063 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:09:42,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:09:42,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:09:42,251 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:09:42,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:09:42,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:09:43,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 16:09:43,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:09:43,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:09:43,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 16:09:43,892 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 16:09:44,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:09:44,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:09:44,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 16:09:45,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-28 16:09:45,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 16:09:45,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:09:45,941 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:09:46,095 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 16:09:46,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 16:09:46,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 16:09:46,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:09:46,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 16:09:46,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 16:09:46,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-28 16:09:46,939 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-28 16:09:56,583 INFO [train.py:1039] (2/4) Epoch 3, batch 0, loss[loss=0.332, simple_loss=0.3608, pruned_loss=0.1516, over 22879.00 frames. ], tot_loss[loss=0.332, simple_loss=0.3608, pruned_loss=0.1516, over 22879.00 frames. ], batch size: 322, lr: 3.18e-02, grad_scale: 32.0 2023-09-28 16:09:56,584 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-28 16:10:11,591 INFO [train.py:1071] (2/4) Epoch 3, validation: loss=0.3974, simple_loss=0.3654, pruned_loss=0.2147, over 1125622.00 frames. 2023-09-28 16:10:11,592 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21046MB 2023-09-28 16:10:14,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-28 16:10:16,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:10:17,728 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:10:23,624 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:10:23,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:10:23,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:10:23,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-28 16:10:26,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-28 16:10:29,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:10:30,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:10:35,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:10:36,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:10:37,208 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=70893.33333333333, ans=0.2 2023-09-28 16:10:38,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:10:38,396 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:10:38,523 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=70893.33333333333, ans=0.125 2023-09-28 16:10:39,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-28 16:10:44,867 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:10:48,277 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=70960.0, ans=0.125 2023-09-28 16:10:51,271 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 16:10:51,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:10:53,408 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-28 16:10:55,456 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=70960.0, ans=0.04949747468305833 2023-09-28 16:10:58,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:10:58,043 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:10:58,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:11:02,078 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:11:05,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:11:05,540 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=71026.66666666667, ans=0.0 2023-09-28 16:11:09,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-28 16:11:10,298 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=71026.66666666667, ans=0.125 2023-09-28 16:11:12,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-28 16:11:13,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:11:13,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:11:15,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:11:15,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:11:17,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-28 16:11:19,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:11:21,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:11:24,728 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-28 16:11:30,011 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-28 16:11:30,451 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=71093.33333333333, ans=0.125 2023-09-28 16:11:31,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:11:32,923 INFO [train.py:1039] (2/4) Epoch 3, batch 50, loss[loss=0.2864, simple_loss=0.3291, pruned_loss=0.1219, over 21622.00 frames. ], tot_loss[loss=0.3037, simple_loss=0.348, pruned_loss=0.1296, over 1063266.26 frames. ], batch size: 47, lr: 3.18e-02, grad_scale: 32.0 2023-09-28 16:11:33,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:11:35,548 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.54 vs. limit=15.0 2023-09-28 16:11:36,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:11:36,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-28 16:11:37,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 16:11:37,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:11:39,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:11:39,589 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:11:43,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:11:44,219 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=71160.0, ans=0.125 2023-09-28 16:11:45,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-28 16:11:45,716 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:11:48,482 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.53 vs. limit=15.0 2023-09-28 16:11:51,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-28 16:11:52,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-28 16:11:54,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-28 16:11:57,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:11:59,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:11:59,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:11:59,485 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=71226.66666666667, ans=0.125 2023-09-28 16:12:01,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:12:03,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-28 16:12:03,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 16:12:03,345 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:12:03,572 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=71226.66666666667, ans=0.0 2023-09-28 16:12:06,700 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=71293.33333333333, ans=0.2 2023-09-28 16:12:11,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:12:12,786 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-28 16:12:12,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 16:12:14,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-28 16:12:16,016 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:12:17,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 16:12:17,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-28 16:12:17,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:12:19,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-28 16:12:27,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:12:27,654 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:12:27,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:12:29,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:12:31,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:12:33,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-28 16:12:33,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-28 16:12:35,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:12:36,617 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:12:38,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:12:40,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:12:40,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-28 16:12:42,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-28 16:12:43,523 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-28 16:12:44,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:12:45,707 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.20 vs. limit=12.0 2023-09-28 16:12:46,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:12:47,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-28 16:12:47,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-28 16:12:49,333 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.184e+02 2.852e+02 3.312e+02 4.404e+02 9.515e+02, threshold=6.623e+02, percent-clipped=7.0 2023-09-28 16:12:49,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:12:49,580 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-28 16:12:51,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-28 16:12:51,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:12:54,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:12:55,750 INFO [train.py:1039] (2/4) Epoch 3, batch 100, loss[loss=0.3043, simple_loss=0.3609, pruned_loss=0.1239, over 24411.00 frames. ], tot_loss[loss=0.2987, simple_loss=0.3447, pruned_loss=0.1264, over 1883916.67 frames. ], batch size: 69, lr: 3.17e-02, grad_scale: 32.0 2023-09-28 16:12:57,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:13:01,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:13:04,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-28 16:13:04,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:13:08,676 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:13:08,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:13:08,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-28 16:13:08,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:13:08,806 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:13:10,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-28 16:13:13,527 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.84 vs. limit=15.0 2023-09-28 16:13:15,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-28 16:13:15,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:13:16,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:13:16,694 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:13:21,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-28 16:13:22,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:13:23,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:13:24,599 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-28 16:13:26,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 16:13:30,736 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-28 16:13:30,761 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-28 16:13:32,286 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:13:32,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 16:13:36,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-28 16:13:38,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:13:40,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:13:45,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:13:47,212 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-28 16:13:49,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-28 16:13:54,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:13:54,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:13:56,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:14:00,895 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:14:03,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:14:05,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:14:08,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:14:10,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:14:10,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:14:10,522 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=71760.0, ans=0.1 2023-09-28 16:14:10,831 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=2.69 vs. limit=15.0 2023-09-28 16:14:11,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:14:11,729 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:14:11,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-28 16:14:11,870 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-28 16:14:11,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:14:13,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:14:15,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:14:15,277 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:14:15,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 16:14:15,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 16:14:15,418 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-28 16:14:16,760 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:14:16,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:14:18,161 INFO [train.py:1039] (2/4) Epoch 3, batch 150, loss[loss=0.3334, simple_loss=0.3837, pruned_loss=0.1415, over 24643.00 frames. ], tot_loss[loss=0.2985, simple_loss=0.3447, pruned_loss=0.1262, over 2529683.70 frames. ], batch size: 73, lr: 3.17e-02, grad_scale: 32.0 2023-09-28 16:14:18,314 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:14:19,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:14:19,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:14:23,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:14:27,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:14:27,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:14:27,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:14:33,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:14:33,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:14:38,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:14:38,165 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:14:41,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-28 16:14:41,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-28 16:14:41,408 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-28 16:14:44,658 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:14:44,665 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 16:14:46,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:14:48,962 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:14:48,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:14:49,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:14:49,120 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:14:52,682 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-28 16:14:53,016 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=71960.0, ans=0.125 2023-09-28 16:14:54,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:14:59,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:15:05,737 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.33 vs. limit=15.0 2023-09-28 16:15:06,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:15:07,954 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-28 16:15:11,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:15:11,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:15:11,819 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:15:13,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:15:15,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:15:16,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:15:16,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:15:18,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-28 16:15:22,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:15:22,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:15:22,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:15:24,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:15:25,055 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.38 vs. limit=15.0 2023-09-28 16:15:25,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:15:26,860 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=72093.33333333333, ans=0.125 2023-09-28 16:15:27,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 16:15:29,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:15:31,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:15:33,406 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:15:34,796 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.108e+02 2.675e+02 3.139e+02 3.901e+02 5.670e+02, threshold=6.278e+02, percent-clipped=0.0 2023-09-28 16:15:37,075 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:15:37,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-28 16:15:37,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:15:38,557 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-28 16:15:38,851 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=72093.33333333333, ans=0.125 2023-09-28 16:15:42,125 INFO [train.py:1039] (2/4) Epoch 3, batch 200, loss[loss=0.2699, simple_loss=0.3108, pruned_loss=0.1145, over 19691.00 frames. ], tot_loss[loss=0.2996, simple_loss=0.3454, pruned_loss=0.1269, over 3018764.03 frames. ], batch size: 43, lr: 3.16e-02, grad_scale: 32.0 2023-09-28 16:15:42,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:15:46,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:15:46,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:15:49,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-28 16:15:51,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:15:51,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:15:54,396 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-28 16:15:54,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-28 16:15:56,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:15:57,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:16:02,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:16:02,261 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:16:02,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:16:05,952 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=72226.66666666667, ans=0.1 2023-09-28 16:16:25,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:16:26,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:16:26,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:16:26,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:16:26,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 16:16:27,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 16:16:28,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:16:31,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 16:16:31,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:16:31,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:16:33,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-28 16:16:34,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 16:16:34,648 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:16:37,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:16:46,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:16:49,204 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=72426.66666666667, ans=0.0 2023-09-28 16:16:55,833 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:16:55,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:17:00,717 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:17:03,667 INFO [train.py:1039] (2/4) Epoch 3, batch 250, loss[loss=0.3825, simple_loss=0.3914, pruned_loss=0.1868, over 19778.00 frames. ], tot_loss[loss=0.2982, simple_loss=0.3442, pruned_loss=0.1261, over 3384860.90 frames. ], batch size: 388, lr: 3.16e-02, grad_scale: 32.0 2023-09-28 16:17:03,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-28 16:17:04,080 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=72493.33333333333, ans=0.125 2023-09-28 16:17:05,339 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:17:05,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:17:05,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:17:06,873 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 16:17:07,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-28 16:17:07,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:17:08,514 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-28 16:17:10,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:17:11,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:17:13,282 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:17:13,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:17:14,285 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=72493.33333333333, ans=0.125 2023-09-28 16:17:15,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:17:16,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:17:18,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:17:24,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:17:26,564 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=72560.0, ans=0.1 2023-09-28 16:17:34,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:17:36,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:17:36,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:17:42,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-28 16:17:42,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-28 16:17:44,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:17:44,291 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=72626.66666666667, ans=0.05 2023-09-28 16:17:45,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:17:45,857 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 16:17:47,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 16:17:47,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:17:48,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:17:50,594 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:17:55,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-28 16:17:55,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:17:57,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:17:57,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:17:57,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:17:57,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:18:00,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 16:18:01,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:18:04,478 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:18:05,962 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:18:06,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:18:09,242 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:18:13,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:18:15,734 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=72760.0, ans=0.0 2023-09-28 16:18:16,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:18:19,837 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.841e+02 2.635e+02 3.105e+02 3.716e+02 7.443e+02, threshold=6.210e+02, percent-clipped=1.0 2023-09-28 16:18:21,775 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=72760.0, ans=0.125 2023-09-28 16:18:22,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:18:25,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:18:26,512 INFO [train.py:1039] (2/4) Epoch 3, batch 300, loss[loss=0.2924, simple_loss=0.3496, pruned_loss=0.1176, over 24305.00 frames. ], tot_loss[loss=0.2956, simple_loss=0.3414, pruned_loss=0.1249, over 3672246.49 frames. ], batch size: 77, lr: 3.15e-02, grad_scale: 32.0 2023-09-28 16:18:28,281 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-28 16:18:29,844 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:18:29,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:18:31,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-28 16:18:32,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-28 16:18:34,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:18:34,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-28 16:18:34,797 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=72826.66666666667, ans=0.0 2023-09-28 16:18:37,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:18:40,466 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:18:42,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:18:42,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-28 16:18:43,774 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:18:45,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 16:18:45,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-28 16:18:45,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:18:49,028 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=72893.33333333333, ans=0.125 2023-09-28 16:18:49,645 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=8.49 vs. limit=10.0 2023-09-28 16:18:50,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-28 16:18:54,752 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 16:18:54,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-28 16:18:59,901 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-28 16:19:00,541 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.42 vs. limit=6.0 2023-09-28 16:19:01,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:19:03,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:19:07,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:19:07,363 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-28 16:19:07,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:19:08,198 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=6.06 vs. limit=6.0 2023-09-28 16:19:08,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:19:10,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:19:12,005 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:19:15,403 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-28 16:19:15,411 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-28 16:19:16,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:19:18,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:19:18,928 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=73026.66666666667, ans=0.0 2023-09-28 16:19:19,526 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=12.72 vs. limit=15.0 2023-09-28 16:19:20,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-28 16:19:20,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:19:24,876 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:19:28,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:19:28,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-28 16:19:30,214 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=73026.66666666667, ans=0.1 2023-09-28 16:19:33,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:19:33,018 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 16:19:36,010 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:19:36,345 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=73093.33333333333, ans=0.1 2023-09-28 16:19:37,578 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:19:37,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-28 16:19:37,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 16:19:37,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:19:39,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-28 16:19:41,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:19:42,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:19:44,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:19:44,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:19:44,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:19:49,007 INFO [train.py:1039] (2/4) Epoch 3, batch 350, loss[loss=0.2928, simple_loss=0.3346, pruned_loss=0.1255, over 23244.00 frames. ], tot_loss[loss=0.2939, simple_loss=0.3382, pruned_loss=0.1248, over 3876809.85 frames. ], batch size: 119, lr: 3.15e-02, grad_scale: 32.0 2023-09-28 16:19:49,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:19:49,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 16:19:50,923 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:19:58,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:20:01,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:20:01,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:20:04,993 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-28 16:20:06,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:20:06,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-28 16:20:10,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:20:11,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-28 16:20:11,321 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=73226.66666666667, ans=0.125 2023-09-28 16:20:13,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:20:14,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-28 16:20:15,256 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=73226.66666666667, ans=0.2 2023-09-28 16:20:16,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:20:18,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:20:19,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:20:21,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:20:21,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:20:21,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:20:21,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:20:22,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:20:24,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:20:24,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:20:31,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:20:31,300 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:20:32,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:20:34,102 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:20:35,923 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 16:20:39,728 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=6.63 vs. limit=15.0 2023-09-28 16:20:40,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-28 16:20:40,313 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:20:45,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:20:45,685 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:20:45,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:20:47,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-28 16:20:51,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:20:52,698 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-28 16:20:52,850 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-28 16:20:52,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:20:57,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:20:57,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-28 16:21:00,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:21:02,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:21:03,440 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.805e+02 2.765e+02 3.239e+02 3.985e+02 6.243e+02, threshold=6.477e+02, percent-clipped=2.0 2023-09-28 16:21:03,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:21:05,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:21:05,107 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:21:08,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:21:10,295 INFO [train.py:1039] (2/4) Epoch 3, batch 400, loss[loss=0.2901, simple_loss=0.352, pruned_loss=0.1141, over 24699.00 frames. ], tot_loss[loss=0.2945, simple_loss=0.3393, pruned_loss=0.1248, over 4072364.06 frames. ], batch size: 73, lr: 3.14e-02, grad_scale: 32.0 2023-09-28 16:21:10,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:21:13,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:21:15,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-28 16:21:15,048 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:21:15,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:21:17,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:21:18,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:21:20,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:21:22,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:21:22,969 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-28 16:21:25,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-28 16:21:25,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:21:26,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-28 16:21:26,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:21:30,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:21:30,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:21:30,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-28 16:21:30,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:21:30,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:21:30,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:21:31,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:21:33,278 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-28 16:21:34,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-28 16:21:40,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:21:41,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:21:42,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-28 16:21:43,010 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-28 16:21:43,833 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.90 vs. limit=6.0 2023-09-28 16:21:46,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:21:49,745 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:21:56,003 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=73626.66666666667, ans=0.1 2023-09-28 16:21:57,232 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-28 16:21:58,982 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=73626.66666666667, ans=0.05 2023-09-28 16:22:00,359 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-28 16:22:03,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-28 16:22:03,647 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=73693.33333333333, ans=0.125 2023-09-28 16:22:06,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:22:07,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:22:07,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-28 16:22:11,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:22:11,799 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=73693.33333333333, ans=0.125 2023-09-28 16:22:14,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 16:22:16,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:22:19,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:22:19,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-28 16:22:19,361 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-28 16:22:20,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-28 16:22:22,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 16:22:22,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:22:24,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-28 16:22:26,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 16:22:26,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:22:28,939 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-28 16:22:29,914 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=6.91 vs. limit=15.0 2023-09-28 16:22:30,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-28 16:22:30,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:22:32,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:22:32,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:22:32,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-28 16:22:32,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:22:33,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 16:22:35,082 INFO [train.py:1039] (2/4) Epoch 3, batch 450, loss[loss=0.2812, simple_loss=0.3459, pruned_loss=0.1083, over 24672.00 frames. ], tot_loss[loss=0.2957, simple_loss=0.3404, pruned_loss=0.1255, over 4210128.96 frames. ], batch size: 68, lr: 3.14e-02, grad_scale: 32.0 2023-09-28 16:22:36,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 16:22:46,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:22:46,625 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:22:46,846 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=73826.66666666667, ans=0.0 2023-09-28 16:22:48,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-28 16:22:49,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-28 16:22:52,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-28 16:22:56,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:22:59,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:23:05,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:23:06,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:23:06,935 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=73960.0, ans=0.0 2023-09-28 16:23:09,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-28 16:23:09,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-28 16:23:10,090 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=73960.0, ans=0.125 2023-09-28 16:23:11,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-28 16:23:11,481 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:23:12,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:23:14,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:23:14,930 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=73960.0, ans=0.2 2023-09-28 16:23:16,033 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-28 16:23:16,047 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-28 16:23:16,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:23:18,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:23:19,612 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-28 16:23:22,864 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-28 16:23:22,907 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:23:24,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-28 16:23:24,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-28 16:23:26,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:23:29,011 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-28 16:23:29,068 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 16:23:32,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-28 16:23:36,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:23:36,357 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=74026.66666666667, ans=0.125 2023-09-28 16:23:38,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-28 16:23:40,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-28 16:23:41,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:23:45,117 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=74093.33333333333, ans=0.125 2023-09-28 16:23:46,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:23:47,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:23:48,484 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=9.10 vs. limit=10.0 2023-09-28 16:23:48,689 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.33 vs. limit=10.0 2023-09-28 16:23:49,447 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:23:49,608 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=74093.33333333333, ans=0.0 2023-09-28 16:23:51,032 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-28 16:23:52,494 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.112e+02 2.606e+02 2.993e+02 3.540e+02 4.868e+02, threshold=5.986e+02, percent-clipped=0.0 2023-09-28 16:23:54,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:23:54,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:23:54,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:23:54,769 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-28 16:23:57,556 INFO [train.py:1039] (2/4) Epoch 3, batch 500, loss[loss=0.2843, simple_loss=0.3305, pruned_loss=0.1191, over 23473.00 frames. ], tot_loss[loss=0.2958, simple_loss=0.3409, pruned_loss=0.1253, over 4306629.95 frames. ], batch size: 93, lr: 3.13e-02, grad_scale: 16.0 2023-09-28 16:23:57,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-28 16:23:57,680 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:24:00,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 16:24:05,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 16:24:07,360 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-28 16:24:10,937 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:24:10,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:24:11,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:24:21,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:24:22,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-28 16:24:22,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-28 16:24:22,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:24:24,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-28 16:24:24,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 16:24:27,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:24:28,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:24:28,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:24:30,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:24:30,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-28 16:24:35,590 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-28 16:24:37,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:24:38,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:24:38,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:24:40,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:24:40,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-28 16:24:44,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-28 16:24:47,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:24:47,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:24:50,421 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=74360.0, ans=0.0 2023-09-28 16:24:53,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:24:54,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:24:56,645 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=74360.0, ans=0.2 2023-09-28 16:24:59,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:25:04,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-28 16:25:04,029 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:25:04,060 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:25:10,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-28 16:25:10,465 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-28 16:25:12,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:25:16,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-28 16:25:18,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-28 16:25:18,981 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:25:18,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-28 16:25:20,333 INFO [train.py:1039] (2/4) Epoch 3, batch 550, loss[loss=0.3028, simple_loss=0.334, pruned_loss=0.1358, over 23878.00 frames. ], tot_loss[loss=0.2966, simple_loss=0.3421, pruned_loss=0.1256, over 4400481.71 frames. ], batch size: 195, lr: 3.13e-02, grad_scale: 16.0 2023-09-28 16:25:20,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:25:20,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:25:21,970 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:25:23,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:25:23,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:25:25,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:25:28,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:25:30,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-28 16:25:30,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:25:34,822 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:25:36,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:25:37,093 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=23.58 vs. limit=22.5 2023-09-28 16:25:39,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:25:40,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:25:44,687 WARNING [train.py:1197] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-28 16:25:44,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-28 16:25:47,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:25:52,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:25:52,319 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:25:54,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-28 16:25:58,217 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:25:58,237 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-28 16:25:58,513 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=74626.66666666667, ans=0.04949747468305833 2023-09-28 16:26:00,375 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:26:01,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 16:26:05,234 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:26:06,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 16:26:06,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-28 16:26:08,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:26:09,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-28 16:26:10,102 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=74693.33333333333, ans=0.125 2023-09-28 16:26:11,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-28 16:26:12,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:26:12,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:26:14,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:26:14,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:26:17,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:26:18,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:26:22,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:26:22,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:26:24,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 16:26:24,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 16:26:24,614 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=74760.0, ans=0.125 2023-09-28 16:26:27,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:26:28,660 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:26:30,082 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:26:32,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-28 16:26:32,399 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-28 16:26:36,869 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.890e+02 2.622e+02 3.187e+02 4.101e+02 6.995e+02, threshold=6.373e+02, percent-clipped=4.0 2023-09-28 16:26:38,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-28 16:26:41,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-28 16:26:42,503 INFO [train.py:1039] (2/4) Epoch 3, batch 600, loss[loss=0.2895, simple_loss=0.3198, pruned_loss=0.1295, over 23837.00 frames. ], tot_loss[loss=0.2954, simple_loss=0.3415, pruned_loss=0.1246, over 4481520.49 frames. ], batch size: 195, lr: 3.13e-02, grad_scale: 16.0 2023-09-28 16:26:42,685 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:26:44,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 16:26:44,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:26:50,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:26:51,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 16:26:53,401 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-28 16:26:56,915 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-28 16:26:58,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:26:58,794 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:27:02,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-28 16:27:02,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:27:02,724 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=74893.33333333333, ans=0.1 2023-09-28 16:27:08,633 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.67 vs. limit=15.0 2023-09-28 16:27:11,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-28 16:27:14,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:27:14,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:27:14,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:27:19,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:27:21,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:27:21,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:27:27,312 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:27:33,631 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:27:33,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:27:33,661 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:27:42,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-28 16:27:48,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-28 16:27:48,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:27:53,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-28 16:27:53,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:27:56,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-28 16:27:56,630 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:27:58,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 16:28:03,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 16:28:04,693 INFO [train.py:1039] (2/4) Epoch 3, batch 650, loss[loss=0.2952, simple_loss=0.3565, pruned_loss=0.117, over 24341.00 frames. ], tot_loss[loss=0.2938, simple_loss=0.3394, pruned_loss=0.1241, over 4528012.84 frames. ], batch size: 77, lr: 3.12e-02, grad_scale: 16.0 2023-09-28 16:28:04,924 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-28 16:28:06,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:28:08,653 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=75160.0, ans=0.2 2023-09-28 16:28:09,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:28:11,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:28:13,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-28 16:28:15,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:28:20,053 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.15 vs. limit=15.0 2023-09-28 16:28:20,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:28:20,936 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:28:24,175 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:28:24,369 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=75226.66666666667, ans=0.1 2023-09-28 16:28:27,419 WARNING [train.py:1197] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-28 16:28:29,288 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=75226.66666666667, ans=0.05 2023-09-28 16:28:30,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:28:31,274 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=75226.66666666667, ans=0.0 2023-09-28 16:28:32,284 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:28:32,655 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=75226.66666666667, ans=0.0 2023-09-28 16:28:35,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:28:36,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 16:28:38,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:28:39,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:28:39,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 16:28:41,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:28:43,081 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:28:45,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:28:45,341 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-28 16:28:45,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:28:45,376 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:28:48,647 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=75293.33333333333, ans=0.0 2023-09-28 16:28:50,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:28:50,521 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=75293.33333333333, ans=0.125 2023-09-28 16:28:51,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:28:53,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:28:53,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:28:55,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-28 16:28:56,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:28:56,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:28:58,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-28 16:28:58,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:28:59,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 16:29:01,464 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-28 16:29:03,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-28 16:29:03,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:29:03,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:29:03,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:29:03,300 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=75360.0, ans=0.0 2023-09-28 16:29:03,344 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=75360.0, ans=0.125 2023-09-28 16:29:04,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:29:04,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:29:11,294 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:29:11,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:29:12,737 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:29:15,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:29:15,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 16:29:16,019 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:29:23,199 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.199e+02 2.685e+02 3.190e+02 3.569e+02 4.758e+02, threshold=6.380e+02, percent-clipped=0.0 2023-09-28 16:29:23,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 16:29:23,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:29:23,755 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=75426.66666666667, ans=0.2 2023-09-28 16:29:24,823 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:29:24,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:29:27,705 INFO [train.py:1039] (2/4) Epoch 3, batch 700, loss[loss=0.3, simple_loss=0.3326, pruned_loss=0.1337, over 23424.00 frames. ], tot_loss[loss=0.293, simple_loss=0.3381, pruned_loss=0.1239, over 4554756.53 frames. ], batch size: 285, lr: 3.12e-02, grad_scale: 16.0 2023-09-28 16:29:28,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=75493.33333333333, ans=0.0 2023-09-28 16:29:29,979 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-28 16:29:31,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-28 16:29:34,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-28 16:29:34,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:29:36,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:29:39,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-28 16:29:44,402 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:29:46,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:29:47,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:29:49,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-28 16:29:50,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:29:50,972 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=75560.0, ans=0.125 2023-09-28 16:29:54,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:29:56,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 16:29:57,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:29:59,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-28 16:30:03,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-28 16:30:06,443 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-28 16:30:06,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:30:08,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:30:11,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:30:11,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-28 16:30:16,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:30:17,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:30:17,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-28 16:30:18,805 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.58 vs. limit=12.0 2023-09-28 16:30:22,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:30:23,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:30:27,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:30:32,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:30:34,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-28 16:30:36,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-28 16:30:36,482 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-28 16:30:40,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:30:43,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:30:44,482 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:30:44,713 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:30:44,721 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-28 16:30:50,479 INFO [train.py:1039] (2/4) Epoch 3, batch 750, loss[loss=0.2793, simple_loss=0.3444, pruned_loss=0.1071, over 24479.00 frames. ], tot_loss[loss=0.2936, simple_loss=0.3383, pruned_loss=0.1244, over 4583925.32 frames. ], batch size: 66, lr: 3.11e-02, grad_scale: 16.0 2023-09-28 16:30:50,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-28 16:30:50,615 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-28 16:30:50,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-28 16:30:50,997 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=75826.66666666667, ans=0.07 2023-09-28 16:30:52,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-28 16:30:52,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-28 16:30:53,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:30:55,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-28 16:30:56,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:30:58,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-28 16:31:00,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:31:01,239 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=14.29 vs. limit=22.5 2023-09-28 16:31:01,895 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:31:01,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-28 16:31:02,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:31:05,052 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:31:06,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 16:31:08,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:31:09,015 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=75893.33333333333, ans=0.0 2023-09-28 16:31:12,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:31:12,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:31:14,067 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-28 16:31:14,274 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=75893.33333333333, ans=0.125 2023-09-28 16:31:15,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:31:15,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:31:19,172 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:31:20,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-28 16:31:21,266 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.24 vs. limit=10.0 2023-09-28 16:31:22,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-28 16:31:22,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:31:26,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-28 16:31:26,633 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-28 16:31:28,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-28 16:31:28,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:31:28,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 16:31:31,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:31:38,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-28 16:31:38,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:31:38,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 16:31:38,605 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=76026.66666666667, ans=0.0 2023-09-28 16:31:41,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:31:42,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:31:44,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-28 16:31:44,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:31:47,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-28 16:31:49,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:31:49,685 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 16:31:52,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:31:53,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-28 16:31:53,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:31:56,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:31:58,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:31:59,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:32:00,003 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=76093.33333333333, ans=0.0 2023-09-28 16:32:01,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:32:04,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-28 16:32:04,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:32:06,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:32:07,687 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.054e+02 2.659e+02 2.971e+02 3.538e+02 5.180e+02, threshold=5.942e+02, percent-clipped=0.0 2023-09-28 16:32:07,856 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:32:07,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:32:08,073 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 16:32:10,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:32:10,855 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-28 16:32:11,066 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=76160.0, ans=0.1 2023-09-28 16:32:12,826 INFO [train.py:1039] (2/4) Epoch 3, batch 800, loss[loss=0.3107, simple_loss=0.3483, pruned_loss=0.1366, over 23559.00 frames. ], tot_loss[loss=0.2943, simple_loss=0.3396, pruned_loss=0.1245, over 4618244.77 frames. ], batch size: 256, lr: 3.11e-02, grad_scale: 32.0 2023-09-28 16:32:23,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:32:23,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:32:25,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:32:25,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:32:26,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:32:26,735 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:32:28,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:32:31,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:32:33,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 16:32:36,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-28 16:32:37,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:32:39,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:32:39,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:32:39,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:32:39,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-28 16:32:41,056 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:32:41,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-28 16:32:45,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:32:48,680 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:32:49,084 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=76293.33333333333, ans=0.0 2023-09-28 16:32:50,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:32:50,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:32:53,173 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.31 vs. limit=12.0 2023-09-28 16:32:55,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:32:56,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:33:00,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:33:00,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 16:33:00,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-28 16:33:02,934 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-28 16:33:02,977 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-28 16:33:05,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 16:33:05,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:33:06,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:33:06,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:33:06,815 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=76360.0, ans=0.2 2023-09-28 16:33:12,654 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-28 16:33:12,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-28 16:33:14,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-28 16:33:15,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 16:33:20,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:33:23,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:33:25,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-28 16:33:25,266 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:33:30,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-28 16:33:33,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 16:33:35,168 INFO [train.py:1039] (2/4) Epoch 3, batch 850, loss[loss=0.2899, simple_loss=0.357, pruned_loss=0.1114, over 24657.00 frames. ], tot_loss[loss=0.2947, simple_loss=0.3409, pruned_loss=0.1243, over 4653192.74 frames. ], batch size: 68, lr: 3.10e-02, grad_scale: 32.0 2023-09-28 16:33:37,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:33:38,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-28 16:33:38,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:33:40,776 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:33:42,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-28 16:33:43,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:33:43,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:33:45,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:33:46,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 16:33:48,565 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:33:50,101 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-28 16:33:50,168 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-28 16:33:50,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-28 16:33:51,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 16:33:53,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:33:54,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:33:54,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:33:54,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:33:58,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:33:58,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:34:00,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-28 16:34:04,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-28 16:34:05,723 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:34:09,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-28 16:34:12,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-28 16:34:13,003 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-28 16:34:15,968 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-28 16:34:15,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:34:15,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:34:16,008 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 16:34:19,012 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:34:20,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:34:20,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-28 16:34:23,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:34:25,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:34:25,299 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:34:25,342 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-28 16:34:28,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:34:29,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-28 16:34:29,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-28 16:34:35,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:34:35,684 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:34:35,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:34:35,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:34:37,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:34:40,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:34:43,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:34:44,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-28 16:34:44,413 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=76760.0, ans=0.125 2023-09-28 16:34:44,699 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.07 vs. limit=12.0 2023-09-28 16:34:46,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:34:47,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-28 16:34:53,428 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.777e+02 2.514e+02 2.970e+02 3.562e+02 5.095e+02, threshold=5.941e+02, percent-clipped=0.0 2023-09-28 16:34:55,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-28 16:34:57,901 INFO [train.py:1039] (2/4) Epoch 3, batch 900, loss[loss=0.3192, simple_loss=0.352, pruned_loss=0.1432, over 23826.00 frames. ], tot_loss[loss=0.2977, simple_loss=0.3436, pruned_loss=0.1259, over 4653455.73 frames. ], batch size: 212, lr: 3.10e-02, grad_scale: 32.0 2023-09-28 16:34:57,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:34:58,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-28 16:34:58,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:34:59,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:35:01,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-28 16:35:07,886 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:35:10,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:35:12,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-28 16:35:17,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 16:35:17,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-28 16:35:18,138 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-28 16:35:18,454 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=76893.33333333333, ans=0.0 2023-09-28 16:35:19,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:35:19,889 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:35:19,957 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 16:35:20,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:35:24,144 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=76893.33333333333, ans=0.0 2023-09-28 16:35:31,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:35:31,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:35:31,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 16:35:34,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:35:38,301 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=76960.0, ans=0.125 2023-09-28 16:35:39,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-28 16:35:41,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:35:46,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-28 16:35:46,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-28 16:35:48,197 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-28 16:35:48,326 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-28 16:35:55,200 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-28 16:35:55,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:35:55,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 16:35:55,602 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=77026.66666666667, ans=0.125 2023-09-28 16:36:01,883 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:36:01,900 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:36:03,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-28 16:36:03,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:36:03,731 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=77026.66666666667, ans=0.125 2023-09-28 16:36:07,872 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-28 16:36:09,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:36:09,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:36:11,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:36:11,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:36:16,399 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-28 16:36:16,460 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-28 16:36:19,441 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-28 16:36:19,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-28 16:36:22,410 INFO [train.py:1039] (2/4) Epoch 3, batch 950, loss[loss=0.2759, simple_loss=0.3222, pruned_loss=0.1148, over 17205.00 frames. ], tot_loss[loss=0.2975, simple_loss=0.3431, pruned_loss=0.1259, over 4653553.14 frames. ], batch size: 37, lr: 3.09e-02, grad_scale: 32.0 2023-09-28 16:36:22,586 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:36:27,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-28 16:36:29,037 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.32 vs. limit=15.0 2023-09-28 16:36:31,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:36:32,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:36:32,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:36:34,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 16:36:36,650 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-28 16:36:39,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:36:41,222 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:36:41,489 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=77226.66666666667, ans=0.0 2023-09-28 16:36:42,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:36:42,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:36:42,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-28 16:36:44,327 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-28 16:36:47,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:36:47,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-28 16:36:49,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:36:52,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:36:52,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:36:52,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:36:54,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-28 16:36:57,286 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 16:37:00,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:37:03,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 16:37:07,686 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:37:07,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:37:10,885 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-28 16:37:11,217 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=77360.0, ans=0.0 2023-09-28 16:37:15,210 WARNING [train.py:1197] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 16:37:15,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 16:37:15,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:37:15,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:37:15,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:37:21,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-28 16:37:21,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:37:25,294 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:37:26,768 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:37:26,808 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-28 16:37:26,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:37:26,837 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:37:26,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-28 16:37:33,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:37:35,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:37:38,275 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.75 vs. limit=22.5 2023-09-28 16:37:40,046 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.993e+02 2.741e+02 3.253e+02 3.972e+02 7.741e+02, threshold=6.506e+02, percent-clipped=1.0 2023-09-28 16:37:40,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:37:43,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-28 16:37:43,178 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-28 16:37:45,317 INFO [train.py:1039] (2/4) Epoch 3, batch 1000, loss[loss=0.2907, simple_loss=0.3354, pruned_loss=0.123, over 23263.00 frames. ], tot_loss[loss=0.296, simple_loss=0.3415, pruned_loss=0.1253, over 4656969.55 frames. ], batch size: 105, lr: 3.09e-02, grad_scale: 32.0 2023-09-28 16:37:47,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:37:48,999 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=77493.33333333333, ans=0.1 2023-09-28 16:37:50,176 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-28 16:37:50,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:37:53,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:37:56,650 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-28 16:37:56,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-28 16:37:57,076 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=77493.33333333333, ans=0.125 2023-09-28 16:38:01,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:38:02,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:38:02,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:38:06,164 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=77560.0, ans=0.125 2023-09-28 16:38:07,249 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-28 16:38:12,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-28 16:38:13,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-28 16:38:13,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:38:15,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-28 16:38:18,921 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-28 16:38:18,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-28 16:38:20,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:38:20,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:38:25,786 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.32 vs. limit=12.0 2023-09-28 16:38:28,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:38:29,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:38:29,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:38:29,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:38:29,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-28 16:38:29,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:38:31,498 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:38:33,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:38:33,593 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-28 16:38:36,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-28 16:38:38,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-28 16:38:38,610 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=77693.33333333333, ans=0.125 2023-09-28 16:38:39,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-28 16:38:43,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:38:50,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:38:50,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:38:50,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:38:51,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:38:53,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-28 16:38:54,921 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:38:54,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-28 16:38:55,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-28 16:38:57,915 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:38:57,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:39:00,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:39:02,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 16:39:04,125 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:39:08,074 INFO [train.py:1039] (2/4) Epoch 3, batch 1050, loss[loss=0.2733, simple_loss=0.3386, pruned_loss=0.1041, over 24664.00 frames. ], tot_loss[loss=0.2936, simple_loss=0.3388, pruned_loss=0.1243, over 4652107.07 frames. ], batch size: 68, lr: 3.08e-02, grad_scale: 32.0 2023-09-28 16:39:09,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:39:09,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 16:39:12,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 16:39:13,364 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:39:13,660 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=77826.66666666667, ans=0.1 2023-09-28 16:39:14,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:39:16,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 16:39:19,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-28 16:39:22,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:39:22,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:39:22,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-28 16:39:24,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:39:25,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-28 16:39:26,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:39:26,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-28 16:39:29,085 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:39:29,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-28 16:39:29,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-28 16:39:35,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:39:36,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-28 16:39:36,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:39:40,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-28 16:39:40,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-28 16:39:40,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:39:45,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-28 16:39:47,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-28 16:39:48,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:39:52,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 16:39:53,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-28 16:39:53,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:39:55,093 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:39:58,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:40:03,288 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-28 16:40:03,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-28 16:40:03,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-28 16:40:05,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:40:05,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:40:06,833 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=78026.66666666667, ans=0.125 2023-09-28 16:40:08,087 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-28 16:40:14,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:40:15,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:40:15,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:40:15,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:40:15,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:40:19,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:40:19,109 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-28 16:40:21,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:40:21,306 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-28 16:40:22,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-28 16:40:22,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:40:22,999 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=78093.33333333333, ans=0.1 2023-09-28 16:40:25,622 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.910e+02 2.729e+02 3.108e+02 3.500e+02 5.269e+02, threshold=6.215e+02, percent-clipped=0.0 2023-09-28 16:40:25,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:40:31,268 INFO [train.py:1039] (2/4) Epoch 3, batch 1100, loss[loss=0.334, simple_loss=0.3402, pruned_loss=0.1639, over 19477.00 frames. ], tot_loss[loss=0.292, simple_loss=0.3377, pruned_loss=0.1231, over 4648817.22 frames. ], batch size: 388, lr: 3.08e-02, grad_scale: 32.0 2023-09-28 16:40:31,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:40:37,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:40:38,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 16:40:38,962 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:40:40,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-28 16:40:40,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:40:45,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:40:48,128 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.83 vs. limit=6.0 2023-09-28 16:40:49,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:40:53,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:40:54,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-28 16:40:55,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 16:40:57,741 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:40:57,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:40:58,142 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=78226.66666666667, ans=0.2 2023-09-28 16:40:59,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:41:01,077 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-28 16:41:05,487 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:41:09,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-28 16:41:09,270 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-28 16:41:10,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:41:12,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:41:13,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-28 16:41:13,902 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:41:16,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-28 16:41:16,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:41:16,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:41:16,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:41:17,025 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:41:17,226 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=78293.33333333333, ans=0.0 2023-09-28 16:41:17,303 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=78293.33333333333, ans=0.0 2023-09-28 16:41:19,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-28 16:41:23,881 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:41:23,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-28 16:41:27,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:41:32,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 16:41:35,589 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-28 16:41:36,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-28 16:41:38,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:41:40,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:41:40,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:41:42,607 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=78426.66666666667, ans=0.0 2023-09-28 16:41:43,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-28 16:41:43,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:41:43,910 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:41:45,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-28 16:41:45,552 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:41:45,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-28 16:41:47,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:41:47,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:41:49,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-28 16:41:53,573 INFO [train.py:1039] (2/4) Epoch 3, batch 1150, loss[loss=0.284, simple_loss=0.3482, pruned_loss=0.1099, over 24341.00 frames. ], tot_loss[loss=0.2917, simple_loss=0.3384, pruned_loss=0.1226, over 4679002.82 frames. ], batch size: 77, lr: 3.07e-02, grad_scale: 32.0 2023-09-28 16:41:55,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:41:58,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:42:00,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:42:00,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:42:01,816 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-28 16:42:01,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:42:04,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-28 16:42:04,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:42:04,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 16:42:13,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-28 16:42:16,142 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:42:19,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:42:21,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:42:21,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-28 16:42:21,383 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-28 16:42:21,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:42:24,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-28 16:42:26,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:42:28,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:42:38,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:42:46,426 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:42:46,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-28 16:42:48,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:42:48,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:42:50,480 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.39 vs. limit=12.0 2023-09-28 16:42:54,013 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.75 vs. limit=22.5 2023-09-28 16:42:54,776 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-28 16:42:56,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:43:04,587 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-28 16:43:07,702 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:43:07,918 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=78760.0, ans=0.125 2023-09-28 16:43:09,214 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:43:09,259 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-28 16:43:11,193 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.090e+02 2.632e+02 2.933e+02 3.650e+02 8.073e+02, threshold=5.867e+02, percent-clipped=1.0 2023-09-28 16:43:11,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:43:14,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:43:14,722 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=78826.66666666667, ans=0.2 2023-09-28 16:43:15,737 INFO [train.py:1039] (2/4) Epoch 3, batch 1200, loss[loss=0.2552, simple_loss=0.303, pruned_loss=0.1037, over 24408.00 frames. ], tot_loss[loss=0.2922, simple_loss=0.3387, pruned_loss=0.1229, over 4690102.21 frames. ], batch size: 58, lr: 3.07e-02, grad_scale: 32.0 2023-09-28 16:43:16,309 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=78826.66666666667, ans=0.125 2023-09-28 16:43:21,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:43:21,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-28 16:43:22,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:43:22,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:43:22,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:43:24,529 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=78826.66666666667, ans=0.0 2023-09-28 16:43:25,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:43:27,962 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 16:43:29,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:43:29,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:43:32,584 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-28 16:43:35,655 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-28 16:43:35,941 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=78893.33333333333, ans=0.1 2023-09-28 16:43:39,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:43:41,307 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.43 vs. limit=6.0 2023-09-28 16:43:42,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:43:45,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:43:45,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:43:45,395 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-28 16:43:46,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:43:55,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-28 16:43:55,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:43:55,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-28 16:43:57,308 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:44:02,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-28 16:44:05,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-28 16:44:05,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:44:07,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:44:08,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:44:08,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-28 16:44:11,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:44:11,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:44:12,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:44:13,805 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-28 16:44:13,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 16:44:13,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:44:13,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 16:44:18,311 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:44:18,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:44:23,443 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-28 16:44:25,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:44:27,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-28 16:44:29,286 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=79093.33333333333, ans=0.125 2023-09-28 16:44:30,618 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-28 16:44:32,165 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:44:35,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:44:36,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:44:38,728 INFO [train.py:1039] (2/4) Epoch 3, batch 1250, loss[loss=0.3062, simple_loss=0.3408, pruned_loss=0.1358, over 23618.00 frames. ], tot_loss[loss=0.293, simple_loss=0.3392, pruned_loss=0.1234, over 4700536.93 frames. ], batch size: 232, lr: 3.06e-02, grad_scale: 32.0 2023-09-28 16:44:38,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:44:39,214 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=79160.0, ans=0.125 2023-09-28 16:44:40,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-28 16:44:45,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:44:47,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:44:47,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-28 16:44:50,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:44:50,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 16:44:55,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 16:44:55,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:44:57,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:44:57,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:45:02,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-28 16:45:07,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 16:45:07,055 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-28 16:45:07,063 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:45:08,767 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:45:08,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:45:11,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:45:12,270 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=79293.33333333333, ans=0.0 2023-09-28 16:45:13,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-28 16:45:18,795 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=79293.33333333333, ans=0.125 2023-09-28 16:45:20,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-28 16:45:20,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:45:25,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:45:26,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-28 16:45:26,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:45:26,808 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-28 16:45:26,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:45:26,842 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:45:30,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:45:33,982 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=79360.0, ans=0.0 2023-09-28 16:45:35,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:45:35,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:45:37,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-28 16:45:37,269 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-28 16:45:37,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-28 16:45:41,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:45:43,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-28 16:45:43,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:45:47,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-28 16:45:47,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:45:50,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-28 16:45:51,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-28 16:45:52,044 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:45:52,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-28 16:45:53,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:45:55,645 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.921e+02 2.589e+02 2.905e+02 3.561e+02 6.488e+02, threshold=5.810e+02, percent-clipped=2.0 2023-09-28 16:45:55,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-28 16:45:58,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:45:58,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:45:59,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:46:01,419 INFO [train.py:1039] (2/4) Epoch 3, batch 1300, loss[loss=0.2895, simple_loss=0.3233, pruned_loss=0.1278, over 23669.00 frames. ], tot_loss[loss=0.2934, simple_loss=0.3395, pruned_loss=0.1237, over 4694510.73 frames. ], batch size: 232, lr: 3.06e-02, grad_scale: 32.0 2023-09-28 16:46:03,013 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-28 16:46:06,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:46:06,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-28 16:46:12,718 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:46:14,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-28 16:46:14,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:46:17,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:46:18,731 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:46:18,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-28 16:46:24,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:46:24,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-28 16:46:25,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-28 16:46:26,894 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=79560.0, ans=0.125 2023-09-28 16:46:30,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 16:46:32,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:46:34,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:46:35,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:46:35,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:46:37,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 16:46:37,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-28 16:46:38,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-28 16:46:45,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:46:45,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 16:46:47,112 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-28 16:46:47,213 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 16:46:49,028 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=79693.33333333333, ans=0.0 2023-09-28 16:46:50,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:46:50,947 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.91 vs. limit=22.5 2023-09-28 16:46:53,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:46:53,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-28 16:46:54,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:46:54,674 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-28 16:46:56,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:46:56,909 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.26 vs. limit=15.0 2023-09-28 16:46:59,556 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:46:59,560 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:47:01,982 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=79693.33333333333, ans=0.125 2023-09-28 16:47:03,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-28 16:47:03,413 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=79693.33333333333, ans=0.125 2023-09-28 16:47:04,739 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-28 16:47:04,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-28 16:47:09,341 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:47:11,767 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-28 16:47:15,163 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:47:22,590 INFO [train.py:1039] (2/4) Epoch 3, batch 1350, loss[loss=0.3131, simple_loss=0.369, pruned_loss=0.1286, over 24452.00 frames. ], tot_loss[loss=0.2939, simple_loss=0.3392, pruned_loss=0.1243, over 4679079.22 frames. ], batch size: 69, lr: 3.05e-02, grad_scale: 32.0 2023-09-28 16:47:22,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-28 16:47:22,952 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=79826.66666666667, ans=0.125 2023-09-28 16:47:28,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:47:28,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:47:32,032 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:47:32,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:47:36,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:47:36,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:47:40,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:47:41,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-28 16:47:44,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-28 16:47:45,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:47:49,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-28 16:47:50,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:47:51,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:47:51,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-28 16:47:52,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-28 16:47:55,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-28 16:47:57,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:47:57,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-28 16:48:03,703 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=79960.0, ans=0.0 2023-09-28 16:48:11,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:48:11,980 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=79960.0, ans=0.125 2023-09-28 16:48:20,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:48:20,998 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=80026.66666666667, ans=0.2 2023-09-28 16:48:22,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:48:22,243 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-28 16:48:25,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:48:27,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-28 16:48:27,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-28 16:48:28,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:48:30,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:48:32,626 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=80093.33333333333, ans=0.2 2023-09-28 16:48:33,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-28 16:48:35,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:48:36,914 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=80093.33333333333, ans=0.125 2023-09-28 16:48:38,500 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=80093.33333333333, ans=0.125 2023-09-28 16:48:41,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-28 16:48:42,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-28 16:48:44,349 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.017e+02 2.667e+02 3.027e+02 3.668e+02 6.120e+02, threshold=6.055e+02, percent-clipped=2.0 2023-09-28 16:48:48,460 INFO [train.py:1039] (2/4) Epoch 3, batch 1400, loss[loss=0.2892, simple_loss=0.3452, pruned_loss=0.1166, over 24650.00 frames. ], tot_loss[loss=0.2923, simple_loss=0.337, pruned_loss=0.1238, over 4678409.92 frames. ], batch size: 68, lr: 3.05e-02, grad_scale: 16.0 2023-09-28 16:48:49,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-28 16:48:50,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:48:52,642 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=80160.0, ans=0.1 2023-09-28 16:48:55,203 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:48:55,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:49:00,411 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-28 16:49:02,051 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-28 16:49:11,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:49:12,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:49:14,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:49:14,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-28 16:49:18,346 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:49:19,773 WARNING [train.py:1197] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 16:49:26,029 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.49 vs. limit=22.5 2023-09-28 16:49:30,075 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:49:30,170 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:49:34,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-28 16:49:34,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:49:34,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:49:36,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:49:37,655 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:49:39,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:49:39,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:49:39,268 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:49:39,552 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=80360.0, ans=0.0 2023-09-28 16:49:40,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-28 16:49:40,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:49:42,852 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=80360.0, ans=0.0 2023-09-28 16:49:45,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:49:52,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:50:00,908 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-28 16:50:02,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 16:50:03,455 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=7.94 vs. limit=15.0 2023-09-28 16:50:03,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:50:06,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 16:50:07,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:50:07,889 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:50:10,876 INFO [train.py:1039] (2/4) Epoch 3, batch 1450, loss[loss=0.3132, simple_loss=0.3517, pruned_loss=0.1373, over 23501.00 frames. ], tot_loss[loss=0.2908, simple_loss=0.336, pruned_loss=0.1228, over 4685045.86 frames. ], batch size: 120, lr: 3.05e-02, grad_scale: 16.0 2023-09-28 16:50:12,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-28 16:50:12,594 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:50:12,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:50:13,291 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.51 vs. limit=12.0 2023-09-28 16:50:14,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-28 16:50:18,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:50:19,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 16:50:20,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:50:20,152 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-28 16:50:22,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 16:50:22,628 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=80493.33333333333, ans=0.0 2023-09-28 16:50:23,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-28 16:50:23,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:50:26,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:50:26,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-28 16:50:27,030 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:50:27,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-28 16:50:28,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 16:50:28,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:50:29,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:50:30,050 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.62 vs. limit=15.0 2023-09-28 16:50:30,989 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:50:33,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:50:37,831 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=80560.0, ans=0.0 2023-09-28 16:50:38,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:50:38,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:50:40,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:50:40,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:50:43,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:50:43,674 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:50:43,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:50:45,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:50:48,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-28 16:50:51,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:50:54,248 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-28 16:50:56,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:50:56,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-28 16:50:57,958 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:51:01,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-28 16:51:03,170 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=80693.33333333333, ans=0.0 2023-09-28 16:51:05,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:51:07,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-28 16:51:09,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-28 16:51:09,933 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=80693.33333333333, ans=0.0 2023-09-28 16:51:11,057 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:51:14,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:51:14,881 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:51:15,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-28 16:51:18,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-28 16:51:18,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-28 16:51:19,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:51:21,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 16:51:28,714 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.864e+02 2.628e+02 3.276e+02 3.890e+02 6.376e+02, threshold=6.552e+02, percent-clipped=1.0 2023-09-28 16:51:32,264 INFO [train.py:1039] (2/4) Epoch 3, batch 1500, loss[loss=0.2355, simple_loss=0.2941, pruned_loss=0.0885, over 24304.00 frames. ], tot_loss[loss=0.2901, simple_loss=0.3357, pruned_loss=0.1222, over 4703618.64 frames. ], batch size: 56, lr: 3.04e-02, grad_scale: 16.0 2023-09-28 16:51:35,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-28 16:51:35,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:51:35,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:51:35,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:51:37,171 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.23 vs. limit=15.0 2023-09-28 16:51:37,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:51:39,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:51:39,504 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-28 16:51:43,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:51:43,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-28 16:51:43,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:51:43,575 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=80826.66666666667, ans=0.0 2023-09-28 16:51:44,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:51:46,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:51:46,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:51:53,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:51:53,087 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-28 16:51:53,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-28 16:51:54,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:51:54,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:51:58,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-28 16:52:02,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-28 16:52:04,291 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:52:05,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-28 16:52:07,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-28 16:52:12,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:52:12,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:52:13,735 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:52:15,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-28 16:52:15,362 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:52:15,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:52:15,477 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-28 16:52:16,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:52:20,943 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=81026.66666666667, ans=0.1 2023-09-28 16:52:22,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:52:22,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-28 16:52:22,849 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.26 vs. limit=15.0 2023-09-28 16:52:28,872 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:52:31,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 16:52:33,656 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=81026.66666666667, ans=0.0 2023-09-28 16:52:36,340 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-28 16:52:37,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:52:37,828 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-28 16:52:39,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:52:42,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:52:44,312 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-28 16:52:44,435 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-28 16:52:47,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-28 16:52:49,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:52:52,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:52:52,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:52:52,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:52:54,272 INFO [train.py:1039] (2/4) Epoch 3, batch 1550, loss[loss=0.3222, simple_loss=0.3689, pruned_loss=0.1378, over 23770.00 frames. ], tot_loss[loss=0.2908, simple_loss=0.337, pruned_loss=0.1223, over 4717212.66 frames. ], batch size: 85, lr: 3.04e-02, grad_scale: 16.0 2023-09-28 16:52:54,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:52:54,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:52:57,833 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-28 16:52:57,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-28 16:52:57,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:52:59,386 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-28 16:53:00,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-28 16:53:03,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:53:03,187 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=81160.0, ans=0.0 2023-09-28 16:53:06,046 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:53:06,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:53:06,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:53:07,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:53:09,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:53:12,075 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-28 16:53:12,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:53:12,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:53:13,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:53:16,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-28 16:53:16,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-28 16:53:18,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:53:18,246 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-28 16:53:20,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-28 16:53:20,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-28 16:53:21,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:53:23,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:53:25,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:53:27,262 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=81293.33333333333, ans=0.125 2023-09-28 16:53:29,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-28 16:53:29,976 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-28 16:53:37,478 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=81293.33333333333, ans=0.04949747468305833 2023-09-28 16:53:38,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:53:41,250 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=81293.33333333333, ans=0.1 2023-09-28 16:53:42,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:53:42,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:53:42,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:53:43,057 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=81360.0, ans=0.2 2023-09-28 16:53:44,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-28 16:53:48,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 16:53:50,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:53:53,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:53:55,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:53:55,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:53:55,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-28 16:53:55,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 16:54:00,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:54:00,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:54:00,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-28 16:54:00,583 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-28 16:54:03,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:54:08,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-28 16:54:13,773 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.933e+02 2.706e+02 3.074e+02 3.869e+02 6.821e+02, threshold=6.147e+02, percent-clipped=1.0 2023-09-28 16:54:13,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:54:15,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:54:16,852 INFO [train.py:1039] (2/4) Epoch 3, batch 1600, loss[loss=0.2448, simple_loss=0.3003, pruned_loss=0.09461, over 24334.00 frames. ], tot_loss[loss=0.292, simple_loss=0.3385, pruned_loss=0.1227, over 4715447.09 frames. ], batch size: 56, lr: 3.03e-02, grad_scale: 32.0 2023-09-28 16:54:16,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-28 16:54:18,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 16:54:19,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:54:19,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 16:54:20,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:54:21,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:54:24,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:54:24,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-28 16:54:25,073 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=81493.33333333333, ans=0.2 2023-09-28 16:54:25,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=81493.33333333333, ans=0.0 2023-09-28 16:54:26,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-28 16:54:27,222 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.31 vs. limit=15.0 2023-09-28 16:54:28,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-28 16:54:31,447 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:54:33,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-28 16:54:33,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:54:34,345 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.55 vs. limit=22.5 2023-09-28 16:54:36,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:54:41,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:54:44,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-28 16:54:47,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:54:48,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-28 16:54:49,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:54:49,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-28 16:54:54,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-28 16:55:01,069 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=81626.66666666667, ans=0.1 2023-09-28 16:55:03,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:55:03,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-28 16:55:05,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:55:05,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:55:05,266 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:55:06,106 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=22.22 vs. limit=22.5 2023-09-28 16:55:08,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-28 16:55:11,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 16:55:12,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:55:13,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:55:14,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:55:14,656 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:55:18,037 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:55:18,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:55:21,537 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 16:55:26,847 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.25 vs. limit=10.0 2023-09-28 16:55:27,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:55:29,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:55:30,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-28 16:55:30,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:55:33,883 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-28 16:55:37,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:55:39,159 INFO [train.py:1039] (2/4) Epoch 3, batch 1650, loss[loss=0.4267, simple_loss=0.4248, pruned_loss=0.2143, over 19331.00 frames. ], tot_loss[loss=0.2947, simple_loss=0.3403, pruned_loss=0.1246, over 4698470.40 frames. ], batch size: 388, lr: 3.03e-02, grad_scale: 16.0 2023-09-28 16:55:40,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:55:43,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:55:43,671 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-28 16:55:43,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-28 16:55:43,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-28 16:55:43,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-28 16:55:46,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:55:48,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:55:48,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:55:48,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-28 16:55:51,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:55:51,799 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=81826.66666666667, ans=0.0 2023-09-28 16:55:53,606 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-28 16:55:55,287 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:55:55,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:55:55,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:55:55,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:55:56,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-28 16:55:58,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-28 16:56:02,945 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 16:56:04,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-28 16:56:14,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-28 16:56:16,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:56:17,083 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=21.01 vs. limit=22.5 2023-09-28 16:56:17,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-28 16:56:19,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=81960.0, ans=0.0 2023-09-28 16:56:21,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:56:24,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:56:24,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:56:24,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:56:26,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:56:26,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:56:30,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:56:30,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:56:31,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:56:31,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:56:32,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:56:32,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 16:56:37,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:56:39,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-28 16:56:41,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:56:42,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-28 16:56:42,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-28 16:56:42,838 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-28 16:56:42,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:56:44,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:56:45,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:56:48,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:56:48,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-28 16:56:51,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:56:52,961 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:56:53,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:56:56,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-28 16:56:59,512 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.012e+02 2.428e+02 2.816e+02 3.293e+02 5.315e+02, threshold=5.632e+02, percent-clipped=0.0 2023-09-28 16:57:01,804 INFO [train.py:1039] (2/4) Epoch 3, batch 1700, loss[loss=0.2943, simple_loss=0.3081, pruned_loss=0.1402, over 19143.00 frames. ], tot_loss[loss=0.294, simple_loss=0.3394, pruned_loss=0.1244, over 4693841.04 frames. ], batch size: 388, lr: 3.02e-02, grad_scale: 16.0 2023-09-28 16:57:01,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:57:01,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:57:01,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-28 16:57:02,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 16:57:02,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:57:02,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:57:05,234 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=82160.0, ans=0.0 2023-09-28 16:57:06,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:57:06,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:57:06,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-28 16:57:08,338 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 16:57:16,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:57:18,206 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 16:57:20,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:57:27,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-28 16:57:27,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:57:27,604 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 16:57:28,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:57:32,485 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-28 16:57:34,053 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:57:35,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:57:37,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-28 16:57:37,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-28 16:57:39,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-28 16:57:40,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-28 16:57:42,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=82293.33333333333, ans=0.125 2023-09-28 16:57:43,810 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:57:45,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-28 16:57:45,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:57:47,137 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=82293.33333333333, ans=0.07 2023-09-28 16:57:53,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:57:53,957 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=82360.0, ans=0.125 2023-09-28 16:57:57,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:57:57,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:57:59,421 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=14.35 vs. limit=15.0 2023-09-28 16:58:00,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-28 16:58:00,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-28 16:58:00,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:58:01,987 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:58:01,988 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-28 16:58:03,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:58:03,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:58:03,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:58:03,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:58:05,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:58:05,943 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:58:07,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:58:09,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:58:09,356 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=82426.66666666667, ans=0.1 2023-09-28 16:58:11,016 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:58:14,174 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=82426.66666666667, ans=0.2 2023-09-28 16:58:15,496 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:58:15,625 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-28 16:58:18,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:58:19,127 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=82426.66666666667, ans=0.125 2023-09-28 16:58:20,263 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:58:22,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-28 16:58:23,981 INFO [train.py:1039] (2/4) Epoch 3, batch 1750, loss[loss=0.2632, simple_loss=0.2807, pruned_loss=0.1229, over 19568.00 frames. ], tot_loss[loss=0.2908, simple_loss=0.3362, pruned_loss=0.1227, over 4691354.38 frames. ], batch size: 388, lr: 3.02e-02, grad_scale: 16.0 2023-09-28 16:58:25,070 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=14.09 vs. limit=22.5 2023-09-28 16:58:28,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:58:32,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:58:32,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-28 16:58:32,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-28 16:58:32,581 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:58:35,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:58:37,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:58:42,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-28 16:58:43,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:58:45,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-28 16:58:45,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:58:48,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:58:51,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 16:58:53,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-28 16:58:55,038 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:58:56,424 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-28 16:59:05,164 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:59:07,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:59:07,223 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:59:09,017 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=82626.66666666667, ans=0.025 2023-09-28 16:59:10,366 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:59:11,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:59:13,299 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:59:14,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:59:17,411 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:59:18,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:59:20,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-28 16:59:23,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:59:25,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-28 16:59:26,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:59:28,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:59:28,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:59:32,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 16:59:33,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-28 16:59:35,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:59:36,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:59:41,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:59:43,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:59:44,788 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.671e+02 2.581e+02 2.939e+02 3.799e+02 7.676e+02, threshold=5.877e+02, percent-clipped=7.0 2023-09-28 16:59:46,387 INFO [train.py:1039] (2/4) Epoch 3, batch 1800, loss[loss=0.253, simple_loss=0.3104, pruned_loss=0.09781, over 24435.00 frames. ], tot_loss[loss=0.2885, simple_loss=0.3347, pruned_loss=0.1212, over 4705085.43 frames. ], batch size: 58, lr: 3.01e-02, grad_scale: 16.0 2023-09-28 16:59:46,445 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:59:46,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-28 16:59:46,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:59:48,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-28 16:59:48,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:59:48,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-28 16:59:48,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:59:49,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:59:51,258 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:59:53,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:59:55,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 16:59:56,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:59:59,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 17:00:01,510 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:00:05,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:00:05,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=82893.33333333333, ans=0.125 2023-09-28 17:00:08,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:00:08,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:00:08,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:00:10,964 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:00:12,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-28 17:00:12,446 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:00:16,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:00:17,220 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=82893.33333333333, ans=0.125 2023-09-28 17:00:21,507 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-28 17:00:23,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-28 17:00:23,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-28 17:00:23,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:00:24,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:00:24,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:00:26,689 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:00:32,896 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-28 17:00:33,186 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=82960.0, ans=0.1 2023-09-28 17:00:34,461 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-28 17:00:36,152 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=83026.66666666667, ans=0.1 2023-09-28 17:00:37,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:00:37,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-28 17:00:39,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-28 17:00:39,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:00:39,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:00:41,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 17:00:46,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-28 17:00:51,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:00:52,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-28 17:00:52,809 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:00:52,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:00:52,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:00:53,516 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.26 vs. limit=22.5 2023-09-28 17:00:54,346 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-28 17:00:58,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:00:58,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:00:59,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-28 17:00:59,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:01:01,351 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=83093.33333333333, ans=0.125 2023-09-28 17:01:02,659 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:01:02,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-28 17:01:02,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:01:02,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:01:02,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 17:01:06,013 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:01:06,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:01:09,536 INFO [train.py:1039] (2/4) Epoch 3, batch 1850, loss[loss=0.2902, simple_loss=0.3276, pruned_loss=0.1264, over 23677.00 frames. ], tot_loss[loss=0.2885, simple_loss=0.335, pruned_loss=0.1211, over 4710717.28 frames. ], batch size: 164, lr: 3.01e-02, grad_scale: 16.0 2023-09-28 17:01:09,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:01:11,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:01:17,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:01:17,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-28 17:01:22,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-28 17:01:22,727 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 17:01:25,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-28 17:01:28,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:01:28,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-28 17:01:28,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 17:01:32,445 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=83226.66666666667, ans=0.2 2023-09-28 17:01:40,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:01:42,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-28 17:01:42,905 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=83293.33333333333, ans=0.125 2023-09-28 17:01:45,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:01:45,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:01:49,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-28 17:01:49,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:01:49,899 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 17:01:51,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:01:53,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:01:56,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:02:00,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:02:00,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:02:00,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 17:02:00,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:02:02,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:02:04,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:02:08,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-28 17:02:08,145 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:02:12,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-28 17:02:14,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 17:02:14,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-28 17:02:14,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-28 17:02:17,697 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-28 17:02:19,201 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-28 17:02:20,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 17:02:20,838 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:02:20,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:02:22,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:02:24,518 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-28 17:02:24,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:02:24,606 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:02:26,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-28 17:02:27,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 17:02:30,307 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.984e+02 2.645e+02 2.967e+02 3.523e+02 5.465e+02, threshold=5.934e+02, percent-clipped=0.0 2023-09-28 17:02:30,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:02:30,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-28 17:02:31,883 INFO [train.py:1039] (2/4) Epoch 3, batch 1900, loss[loss=0.2731, simple_loss=0.324, pruned_loss=0.1111, over 24282.00 frames. ], tot_loss[loss=0.2904, simple_loss=0.3368, pruned_loss=0.122, over 4711650.98 frames. ], batch size: 56, lr: 3.01e-02, grad_scale: 16.0 2023-09-28 17:02:32,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:02:32,160 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-28 17:02:32,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:02:32,501 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=83493.33333333333, ans=0.0 2023-09-28 17:02:33,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:02:39,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:02:41,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:02:41,801 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=83493.33333333333, ans=0.125 2023-09-28 17:02:43,719 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-28 17:02:43,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-28 17:02:47,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:02:47,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:02:47,455 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-28 17:02:48,863 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-28 17:02:49,186 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=83560.0, ans=0.125 2023-09-28 17:02:51,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-28 17:02:52,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:02:55,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-28 17:02:59,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-28 17:03:10,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-28 17:03:13,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-28 17:03:13,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:03:13,279 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-28 17:03:13,286 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-28 17:03:13,340 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-28 17:03:14,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-28 17:03:14,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:03:19,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-28 17:03:22,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:03:23,858 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=83693.33333333333, ans=0.125 2023-09-28 17:03:26,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:03:26,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-28 17:03:27,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:03:27,430 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=83693.33333333333, ans=0.125 2023-09-28 17:03:27,681 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=83693.33333333333, ans=0.2 2023-09-28 17:03:31,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-28 17:03:33,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-28 17:03:35,863 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=83693.33333333333, ans=0.1 2023-09-28 17:03:40,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:03:41,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:03:41,592 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:03:43,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:03:44,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 17:03:44,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-28 17:03:46,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:03:47,843 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:03:47,846 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:03:51,327 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:03:51,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:03:52,765 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-28 17:03:54,191 INFO [train.py:1039] (2/4) Epoch 3, batch 1950, loss[loss=0.2768, simple_loss=0.3374, pruned_loss=0.1081, over 23894.00 frames. ], tot_loss[loss=0.2898, simple_loss=0.3364, pruned_loss=0.1215, over 4719115.19 frames. ], batch size: 86, lr: 3.00e-02, grad_scale: 16.0 2023-09-28 17:03:54,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:03:59,326 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=83826.66666666667, ans=0.0 2023-09-28 17:04:00,557 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:04:02,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:04:02,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:04:02,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 17:04:05,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-28 17:04:05,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 17:04:06,437 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.89 vs. limit=22.5 2023-09-28 17:04:07,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:04:07,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:04:10,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:04:10,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:04:10,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:04:14,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:04:17,159 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:04:17,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 17:04:17,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:04:17,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:04:21,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:04:24,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:04:24,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:04:24,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-28 17:04:24,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-28 17:04:26,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 17:04:27,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:04:27,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:04:31,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:04:31,824 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=83960.0, ans=0.2 2023-09-28 17:04:35,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:04:40,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 17:04:44,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:04:44,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-28 17:04:44,140 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-28 17:04:45,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:04:49,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:04:50,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:04:52,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-28 17:04:57,291 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=84026.66666666667, ans=0.09899494936611666 2023-09-28 17:04:59,940 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:05:00,047 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:05:03,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:05:06,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:05:08,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:05:09,856 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:05:09,962 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-28 17:05:09,970 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 17:05:11,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:05:11,725 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=84093.33333333333, ans=0.125 2023-09-28 17:05:12,933 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-28 17:05:14,259 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.050e+02 2.608e+02 2.981e+02 3.638e+02 7.272e+02, threshold=5.963e+02, percent-clipped=1.0 2023-09-28 17:05:14,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:05:16,365 INFO [train.py:1039] (2/4) Epoch 3, batch 2000, loss[loss=0.4116, simple_loss=0.4134, pruned_loss=0.2049, over 19552.00 frames. ], tot_loss[loss=0.2903, simple_loss=0.3369, pruned_loss=0.1218, over 4717410.83 frames. ], batch size: 388, lr: 3.00e-02, grad_scale: 32.0 2023-09-28 17:05:18,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-28 17:05:19,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:05:19,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:05:21,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:05:23,181 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:05:26,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-28 17:05:26,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-28 17:05:29,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:05:30,976 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-28 17:05:31,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 17:05:31,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:05:34,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:05:36,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-28 17:05:38,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:05:40,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:05:40,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:05:41,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-28 17:05:41,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 17:05:44,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-28 17:05:44,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:05:48,007 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:05:49,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-28 17:05:49,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:05:51,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:05:51,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:05:53,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-28 17:05:55,374 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=84293.33333333333, ans=0.2 2023-09-28 17:05:58,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-28 17:05:58,072 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:05:58,095 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:06:00,505 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=19.14 vs. limit=22.5 2023-09-28 17:06:02,153 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.18 vs. limit=15.0 2023-09-28 17:06:04,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:06:05,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:06:05,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:06:05,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:06:08,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:06:09,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:06:09,610 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:06:09,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:06:11,869 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:06:12,466 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.01 vs. limit=15.0 2023-09-28 17:06:15,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:06:15,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-28 17:06:18,423 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=84360.0, ans=0.0 2023-09-28 17:06:19,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 17:06:21,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:06:26,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:06:27,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:06:30,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:06:33,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:06:33,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:06:34,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 17:06:35,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 17:06:38,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:06:39,524 INFO [train.py:1039] (2/4) Epoch 3, batch 2050, loss[loss=0.2884, simple_loss=0.3401, pruned_loss=0.1184, over 24439.00 frames. ], tot_loss[loss=0.2893, simple_loss=0.3361, pruned_loss=0.1213, over 4729944.55 frames. ], batch size: 63, lr: 2.99e-02, grad_scale: 32.0 2023-09-28 17:06:39,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:06:43,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:06:43,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:06:50,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:06:53,357 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:06:53,439 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:06:53,914 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=10.43 vs. limit=15.0 2023-09-28 17:06:54,935 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:06:55,824 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.04 vs. limit=15.0 2023-09-28 17:06:56,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-28 17:06:56,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:06:58,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:06:58,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-28 17:07:10,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-28 17:07:10,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:07:13,376 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-28 17:07:15,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:07:15,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-28 17:07:17,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-28 17:07:18,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:07:21,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:07:22,106 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-28 17:07:22,180 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:07:22,396 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=84626.66666666667, ans=0.0 2023-09-28 17:07:23,716 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:07:23,885 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=84626.66666666667, ans=0.0 2023-09-28 17:07:25,620 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:07:25,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:07:28,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:07:30,558 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 17:07:32,089 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:07:35,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:07:36,113 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=84693.33333333333, ans=0.2 2023-09-28 17:07:37,708 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=84693.33333333333, ans=0.2 2023-09-28 17:07:39,227 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=84693.33333333333, ans=0.125 2023-09-28 17:07:40,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:07:45,336 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:07:46,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-28 17:07:51,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:07:53,105 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:07:56,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:07:59,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-28 17:08:01,426 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-28 17:08:01,427 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:08:02,769 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.062e+02 2.817e+02 3.171e+02 3.803e+02 7.947e+02, threshold=6.342e+02, percent-clipped=1.0 2023-09-28 17:08:02,811 INFO [train.py:1039] (2/4) Epoch 3, batch 2100, loss[loss=0.2428, simple_loss=0.2995, pruned_loss=0.0931, over 24322.00 frames. ], tot_loss[loss=0.2874, simple_loss=0.3343, pruned_loss=0.1202, over 4723445.78 frames. ], batch size: 56, lr: 2.99e-02, grad_scale: 16.0 2023-09-28 17:08:02,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:08:03,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 17:08:04,619 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:08:04,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-28 17:08:04,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-28 17:08:07,748 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:08:09,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:08:11,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:08:15,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:08:16,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:08:16,734 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-28 17:08:16,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:08:16,937 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-28 17:08:16,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-28 17:08:19,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:08:19,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:08:19,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-28 17:08:19,268 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=84893.33333333333, ans=0.1 2023-09-28 17:08:20,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 17:08:28,008 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-28 17:08:28,009 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 17:08:28,201 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=84893.33333333333, ans=0.2 2023-09-28 17:08:28,272 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=84893.33333333333, ans=0.125 2023-09-28 17:08:31,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:08:31,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:08:34,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:08:36,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-28 17:08:36,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:08:36,494 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 17:08:36,709 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=84960.0, ans=0.1 2023-09-28 17:08:39,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-28 17:08:39,507 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:08:39,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-28 17:08:41,034 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-28 17:08:41,118 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-28 17:08:42,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-28 17:08:44,329 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:08:48,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 17:08:50,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 17:08:53,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:08:54,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:08:54,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-28 17:08:54,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:08:54,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:08:55,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:08:55,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-28 17:08:57,338 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-28 17:08:58,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-28 17:09:03,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:09:08,050 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:09:09,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-28 17:09:11,482 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=85093.33333333333, ans=0.125 2023-09-28 17:09:15,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:09:16,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:09:18,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:09:18,242 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:09:18,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-28 17:09:18,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 17:09:19,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:09:19,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-28 17:09:21,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:09:23,560 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:09:23,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-28 17:09:25,537 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.88 vs. limit=8.0 2023-09-28 17:09:25,930 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-28 17:09:25,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:09:27,905 INFO [train.py:1039] (2/4) Epoch 3, batch 2150, loss[loss=0.2833, simple_loss=0.3239, pruned_loss=0.1213, over 23613.00 frames. ], tot_loss[loss=0.2868, simple_loss=0.3344, pruned_loss=0.1196, over 4725980.49 frames. ], batch size: 256, lr: 2.98e-02, grad_scale: 16.0 2023-09-28 17:09:31,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:09:31,143 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:09:31,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:09:32,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:09:37,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 17:09:40,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:09:40,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:09:42,815 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.57 vs. limit=10.0 2023-09-28 17:09:43,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:09:43,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:09:43,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:09:47,881 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:09:47,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:09:47,965 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:09:52,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:09:52,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-28 17:09:57,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:09:59,766 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:10:01,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:10:01,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:10:01,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:10:01,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-28 17:10:04,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:10:04,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:10:04,385 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=85293.33333333333, ans=0.0 2023-09-28 17:10:05,513 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:10:06,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-28 17:10:07,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=85293.33333333333, ans=0.125 2023-09-28 17:10:08,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-28 17:10:10,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:10:10,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:10:11,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 17:10:13,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:10:16,676 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:10:16,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:10:18,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:10:18,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-28 17:10:18,167 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-28 17:10:21,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:10:21,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:10:21,701 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=85360.0, ans=0.0 2023-09-28 17:10:22,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:10:23,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 17:10:24,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:10:24,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:10:24,997 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=85360.0, ans=0.2 2023-09-28 17:10:26,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-28 17:10:28,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-28 17:10:28,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:10:29,669 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-28 17:10:29,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:10:29,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:10:31,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-28 17:10:31,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:10:31,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-28 17:10:31,318 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-28 17:10:31,318 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-28 17:10:33,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-28 17:10:35,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:10:35,660 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:10:35,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:10:37,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:10:38,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 17:10:40,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:10:40,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:10:42,891 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 17:10:48,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:10:50,163 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.981e+02 2.450e+02 2.912e+02 3.382e+02 5.716e+02, threshold=5.824e+02, percent-clipped=0.0 2023-09-28 17:10:50,207 INFO [train.py:1039] (2/4) Epoch 3, batch 2200, loss[loss=0.2384, simple_loss=0.2989, pruned_loss=0.08895, over 24606.00 frames. ], tot_loss[loss=0.2871, simple_loss=0.3345, pruned_loss=0.1199, over 4713983.10 frames. ], batch size: 60, lr: 2.98e-02, grad_scale: 16.0 2023-09-28 17:10:50,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-28 17:10:52,431 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=85493.33333333333, ans=0.125 2023-09-28 17:10:53,496 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:10:56,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:10:58,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:10:59,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:11:01,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-28 17:11:05,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:11:05,589 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=85560.0, ans=0.125 2023-09-28 17:11:06,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:11:06,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-28 17:11:12,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-28 17:11:14,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 17:11:20,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-28 17:11:21,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:11:23,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:11:23,617 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:11:26,830 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=85626.66666666667, ans=0.0 2023-09-28 17:11:27,996 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:11:28,031 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-28 17:11:31,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-28 17:11:32,775 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:11:34,249 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-28 17:11:38,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:11:39,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:11:41,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:11:42,148 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.43 vs. limit=22.5 2023-09-28 17:11:42,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:11:45,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-28 17:11:46,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:11:48,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-28 17:11:50,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:11:50,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-28 17:11:52,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:11:53,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:11:55,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:11:55,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:11:55,078 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:11:58,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-28 17:11:58,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:11:59,698 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 17:12:01,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=85760.0, ans=0.07 2023-09-28 17:12:02,827 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 17:12:02,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:12:06,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:12:07,677 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-28 17:12:09,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 17:12:11,172 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-28 17:12:13,190 INFO [train.py:1039] (2/4) Epoch 3, batch 2250, loss[loss=0.2948, simple_loss=0.3408, pruned_loss=0.1244, over 23707.00 frames. ], tot_loss[loss=0.2874, simple_loss=0.335, pruned_loss=0.1199, over 4716268.97 frames. ], batch size: 232, lr: 2.97e-02, grad_scale: 16.0 2023-09-28 17:12:13,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-28 17:12:13,370 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-28 17:12:14,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:12:15,016 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-28 17:12:16,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:12:18,535 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-28 17:12:21,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:12:23,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:12:28,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:12:29,838 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:12:34,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:12:34,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 17:12:34,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:12:36,645 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.87 vs. limit=12.0 2023-09-28 17:12:37,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-28 17:12:37,356 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:12:37,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:12:40,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-28 17:12:40,599 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:12:40,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:12:43,883 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 17:12:47,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:12:49,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 17:12:49,286 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-28 17:12:51,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-28 17:12:52,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:12:55,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:13:01,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:13:04,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:13:04,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:13:04,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:13:07,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:13:09,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:13:11,220 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=86026.66666666667, ans=0.125 2023-09-28 17:13:11,229 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=86026.66666666667, ans=0.125 2023-09-28 17:13:12,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:13:15,643 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-28 17:13:22,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 17:13:22,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:13:22,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:13:29,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 17:13:29,670 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.14 vs. limit=22.5 2023-09-28 17:13:32,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-28 17:13:32,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-28 17:13:32,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:13:32,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:13:34,492 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 17:13:35,621 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.002e+02 2.481e+02 2.992e+02 3.507e+02 5.214e+02, threshold=5.985e+02, percent-clipped=0.0 2023-09-28 17:13:35,666 INFO [train.py:1039] (2/4) Epoch 3, batch 2300, loss[loss=0.2642, simple_loss=0.3183, pruned_loss=0.1051, over 24444.00 frames. ], tot_loss[loss=0.2898, simple_loss=0.3369, pruned_loss=0.1214, over 4713032.92 frames. ], batch size: 63, lr: 2.97e-02, grad_scale: 16.0 2023-09-28 17:13:35,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-28 17:13:38,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:13:38,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:13:41,118 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=86160.0, ans=0.1 2023-09-28 17:13:43,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:13:43,996 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:13:45,556 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-28 17:13:47,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:13:55,543 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:13:55,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-28 17:13:55,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:13:57,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:13:57,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-28 17:13:59,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:14:02,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:14:02,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:14:07,372 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:14:09,089 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=86293.33333333333, ans=0.1 2023-09-28 17:14:10,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:14:13,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:14:15,443 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=86293.33333333333, ans=0.0 2023-09-28 17:14:21,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:14:21,280 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:14:24,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:14:26,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:14:31,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:14:32,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 17:14:33,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:14:33,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-28 17:14:38,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 17:14:38,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:14:38,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:14:38,896 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:14:38,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:14:40,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 17:14:40,516 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-28 17:14:40,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-28 17:14:40,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:14:40,640 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:14:43,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-28 17:14:49,315 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:14:52,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:14:57,062 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:14:57,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:14:57,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-28 17:14:58,584 INFO [train.py:1039] (2/4) Epoch 3, batch 2350, loss[loss=0.2921, simple_loss=0.3424, pruned_loss=0.1209, over 23293.00 frames. ], tot_loss[loss=0.2899, simple_loss=0.3372, pruned_loss=0.1213, over 4711944.85 frames. ], batch size: 93, lr: 2.97e-02, grad_scale: 16.0 2023-09-28 17:14:58,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 17:14:58,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:15:00,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 17:15:00,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-28 17:15:02,868 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=86493.33333333333, ans=0.1 2023-09-28 17:15:04,331 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=86493.33333333333, ans=0.125 2023-09-28 17:15:07,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:15:07,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-28 17:15:09,969 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.69 vs. limit=15.0 2023-09-28 17:15:12,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-28 17:15:16,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:15:19,584 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:15:19,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:15:19,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:15:19,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:15:21,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-28 17:15:24,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:15:30,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-28 17:15:30,541 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.81 vs. limit=22.5 2023-09-28 17:15:33,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:15:34,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 17:15:34,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:15:38,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:15:39,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-28 17:15:42,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:15:45,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:15:45,309 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:15:45,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:15:48,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:15:50,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-28 17:15:52,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:15:53,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:15:55,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:15:56,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-28 17:15:56,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:16:01,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-28 17:16:01,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:16:06,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-28 17:16:07,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-28 17:16:09,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:16:09,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-28 17:16:10,615 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-28 17:16:10,665 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-28 17:16:13,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-28 17:16:16,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:16:16,362 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=86760.0, ans=0.2 2023-09-28 17:16:20,374 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.814e+02 2.689e+02 3.044e+02 3.623e+02 6.836e+02, threshold=6.088e+02, percent-clipped=1.0 2023-09-28 17:16:20,417 INFO [train.py:1039] (2/4) Epoch 3, batch 2400, loss[loss=0.3173, simple_loss=0.3405, pruned_loss=0.147, over 23785.00 frames. ], tot_loss[loss=0.29, simple_loss=0.3369, pruned_loss=0.1216, over 4697087.83 frames. ], batch size: 212, lr: 2.96e-02, grad_scale: 32.0 2023-09-28 17:16:20,789 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=86826.66666666667, ans=0.125 2023-09-28 17:16:21,962 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:16:25,068 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.16 vs. limit=15.0 2023-09-28 17:16:27,104 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:16:28,744 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:16:28,824 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-28 17:16:29,131 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=86826.66666666667, ans=0.125 2023-09-28 17:16:30,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-28 17:16:35,434 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=86893.33333333333, ans=0.2 2023-09-28 17:16:36,739 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 17:16:36,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:16:39,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-28 17:16:39,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:16:39,949 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:16:40,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-28 17:16:44,863 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:16:49,658 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-28 17:16:56,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-28 17:17:00,715 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-28 17:17:05,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:17:05,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:17:09,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:17:09,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-28 17:17:09,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 17:17:16,335 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:17:19,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:17:21,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:17:23,905 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:17:23,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-28 17:17:23,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:17:23,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:17:24,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:17:24,061 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 17:17:29,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:17:31,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 17:17:31,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-28 17:17:34,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-28 17:17:35,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:17:36,081 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=87093.33333333333, ans=0.1 2023-09-28 17:17:37,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:17:37,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-28 17:17:38,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-28 17:17:38,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-28 17:17:38,823 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-28 17:17:38,962 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-28 17:17:40,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:17:40,620 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:17:40,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:17:42,251 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-28 17:17:43,660 INFO [train.py:1039] (2/4) Epoch 3, batch 2450, loss[loss=0.2486, simple_loss=0.3062, pruned_loss=0.09545, over 24327.00 frames. ], tot_loss[loss=0.2874, simple_loss=0.3342, pruned_loss=0.1203, over 4686288.67 frames. ], batch size: 56, lr: 2.96e-02, grad_scale: 32.0 2023-09-28 17:17:43,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:17:43,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-28 17:17:45,702 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=87160.0, ans=0.125 2023-09-28 17:17:48,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:17:48,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:17:53,708 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:17:53,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:17:53,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-28 17:17:57,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:17:57,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:18:03,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:18:03,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:18:03,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:18:03,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-28 17:18:09,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:18:11,190 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 17:18:12,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:18:15,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-28 17:18:15,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:18:15,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:18:17,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:18:18,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-28 17:18:20,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:18:29,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:18:29,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:18:31,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:18:31,258 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:18:31,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=87293.33333333333, ans=0.0 2023-09-28 17:18:33,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:18:35,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:18:36,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-28 17:18:40,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:18:40,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:18:42,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:18:43,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:18:49,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:18:49,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-28 17:18:51,045 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:18:52,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:18:52,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-28 17:18:54,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:18:54,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:18:54,921 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=14.62 vs. limit=15.0 2023-09-28 17:18:57,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:19:00,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:19:00,564 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:19:04,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-28 17:19:05,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:19:07,574 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.996e+02 2.571e+02 3.066e+02 3.811e+02 5.963e+02, threshold=6.132e+02, percent-clipped=0.0 2023-09-28 17:19:07,617 INFO [train.py:1039] (2/4) Epoch 3, batch 2500, loss[loss=0.2602, simple_loss=0.3175, pruned_loss=0.1015, over 24632.00 frames. ], tot_loss[loss=0.2867, simple_loss=0.3338, pruned_loss=0.1197, over 4698804.45 frames. ], batch size: 60, lr: 2.95e-02, grad_scale: 32.0 2023-09-28 17:19:12,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:19:22,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:19:22,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:19:23,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:19:23,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-28 17:19:28,769 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=87560.0, ans=0.125 2023-09-28 17:19:31,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 17:19:33,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:19:33,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-28 17:19:35,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 17:19:37,106 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-28 17:19:37,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:19:37,436 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=87560.0, ans=0.1 2023-09-28 17:19:38,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:19:38,823 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-28 17:19:38,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:19:39,631 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-28 17:19:40,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:19:44,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:19:44,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:19:48,440 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 17:19:49,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-28 17:19:51,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:19:51,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:19:56,315 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:19:59,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:20:02,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:20:08,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-28 17:20:13,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-28 17:20:13,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:20:13,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-28 17:20:14,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:20:14,885 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 17:20:15,031 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-28 17:20:15,031 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-28 17:20:15,040 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-28 17:20:19,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:20:22,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-28 17:20:22,009 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-28 17:20:23,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:20:24,802 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-28 17:20:27,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-28 17:20:30,842 INFO [train.py:1039] (2/4) Epoch 3, batch 2550, loss[loss=0.247, simple_loss=0.3085, pruned_loss=0.09278, over 24627.00 frames. ], tot_loss[loss=0.2874, simple_loss=0.3343, pruned_loss=0.1202, over 4690864.07 frames. ], batch size: 60, lr: 2.95e-02, grad_scale: 32.0 2023-09-28 17:20:31,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:20:33,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:20:35,402 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:20:35,638 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:20:37,171 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-28 17:20:37,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-28 17:20:41,859 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-28 17:20:43,416 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:20:46,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:20:47,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:20:47,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 17:20:49,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 17:20:49,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:20:49,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:20:53,159 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:20:53,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-28 17:20:53,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-28 17:20:53,267 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:20:53,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-28 17:20:56,444 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=87893.33333333333, ans=0.125 2023-09-28 17:21:08,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:21:12,247 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=87960.0, ans=0.0 2023-09-28 17:21:13,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:21:13,687 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:21:13,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:21:15,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 17:21:22,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:21:25,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 17:21:25,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 17:21:25,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:21:25,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-28 17:21:25,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-28 17:21:29,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:21:29,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:21:34,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:21:36,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-28 17:21:36,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:21:37,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:21:37,705 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-28 17:21:39,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 17:21:40,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:21:49,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:21:51,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:21:53,271 INFO [train.py:1039] (2/4) Epoch 3, batch 2600, loss[loss=0.2851, simple_loss=0.3387, pruned_loss=0.1158, over 23160.00 frames. ], tot_loss[loss=0.2893, simple_loss=0.3363, pruned_loss=0.1211, over 4699843.27 frames. ], batch size: 93, lr: 2.95e-02, grad_scale: 16.0 2023-09-28 17:21:53,661 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=88160.0, ans=0.2 2023-09-28 17:21:53,717 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=88160.0, ans=0.95 2023-09-28 17:21:54,708 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.952e+02 2.618e+02 3.140e+02 3.668e+02 6.690e+02, threshold=6.281e+02, percent-clipped=1.0 2023-09-28 17:21:55,567 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-28 17:21:58,537 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-28 17:21:58,573 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:22:00,051 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-28 17:22:00,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-28 17:22:00,197 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-28 17:22:03,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:22:03,332 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-28 17:22:05,527 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-28 17:22:07,042 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-28 17:22:09,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:22:10,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-28 17:22:12,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-28 17:22:13,791 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-28 17:22:13,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-28 17:22:16,819 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-28 17:22:16,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-28 17:22:24,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:22:24,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:22:24,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:22:24,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-28 17:22:27,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:22:35,226 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-28 17:22:37,485 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.20 vs. limit=22.5 2023-09-28 17:22:38,480 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=88293.33333333333, ans=0.125 2023-09-28 17:22:39,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:22:42,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:22:42,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-28 17:22:42,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:22:42,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:22:44,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-28 17:22:46,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-28 17:22:46,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:22:47,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:22:52,171 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-28 17:22:52,478 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=88360.0, ans=0.125 2023-09-28 17:22:53,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:22:53,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:23:00,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:23:00,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:23:00,457 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-28 17:23:00,633 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=88426.66666666667, ans=0.0 2023-09-28 17:23:01,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:23:03,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:23:04,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:23:11,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-28 17:23:11,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:23:14,621 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 17:23:16,460 INFO [train.py:1039] (2/4) Epoch 3, batch 2650, loss[loss=0.3073, simple_loss=0.3405, pruned_loss=0.1371, over 23823.00 frames. ], tot_loss[loss=0.2889, simple_loss=0.3364, pruned_loss=0.1207, over 4720489.51 frames. ], batch size: 164, lr: 2.94e-02, grad_scale: 16.0 2023-09-28 17:23:20,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-28 17:23:21,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:23:21,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 17:23:23,339 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-28 17:23:23,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:23:24,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:23:28,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 17:23:29,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:23:29,709 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=88493.33333333333, ans=0.2 2023-09-28 17:23:32,506 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:23:34,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-28 17:23:34,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 17:23:34,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:23:37,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-28 17:23:39,579 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-28 17:23:43,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:23:46,291 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-28 17:23:46,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:23:46,414 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-28 17:23:50,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:23:50,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-28 17:23:51,011 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:23:51,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:23:56,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-28 17:23:58,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-28 17:23:59,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-28 17:24:02,797 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-28 17:24:02,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:24:02,944 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:24:02,986 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-28 17:24:04,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:24:04,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:24:06,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:24:08,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:24:09,566 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:24:09,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:24:11,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:24:13,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:24:14,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 17:24:14,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:24:16,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:24:16,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-28 17:24:20,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:24:22,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:24:24,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:24:24,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-28 17:24:27,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:24:29,215 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:24:32,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:24:34,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:24:35,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-28 17:24:35,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:24:37,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:24:37,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-28 17:24:38,876 INFO [train.py:1039] (2/4) Epoch 3, batch 2700, loss[loss=0.2823, simple_loss=0.3223, pruned_loss=0.1212, over 23796.00 frames. ], tot_loss[loss=0.2903, simple_loss=0.3371, pruned_loss=0.1217, over 4722267.31 frames. ], batch size: 164, lr: 2.94e-02, grad_scale: 16.0 2023-09-28 17:24:40,991 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.036e+02 2.674e+02 3.068e+02 3.788e+02 5.664e+02, threshold=6.136e+02, percent-clipped=0.0 2023-09-28 17:24:41,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:24:42,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 17:24:44,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:24:45,156 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=7.06 vs. limit=12.0 2023-09-28 17:24:46,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:24:46,057 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:24:46,523 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.88 vs. limit=22.5 2023-09-28 17:24:49,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:24:49,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:24:49,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:24:49,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-28 17:24:50,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-28 17:24:52,486 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:24:52,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-28 17:24:54,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 17:24:54,212 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:24:58,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:25:00,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-28 17:25:00,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:25:05,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:25:05,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:25:12,767 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:25:12,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:25:12,991 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=88960.0, ans=0.0 2023-09-28 17:25:14,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:25:14,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:25:17,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:25:21,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:25:22,518 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-28 17:25:22,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:25:27,512 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.20 vs. limit=15.0 2023-09-28 17:25:27,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:25:27,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:25:34,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:25:36,484 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:25:38,716 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.06 vs. limit=15.0 2023-09-28 17:25:39,756 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:25:39,759 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:25:44,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:25:44,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:25:46,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:25:48,065 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:25:49,593 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:25:49,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:25:53,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:25:54,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:25:54,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:25:57,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-28 17:25:59,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:26:00,316 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=89093.33333333333, ans=0.0 2023-09-28 17:26:02,190 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=8.50 vs. limit=15.0 2023-09-28 17:26:02,665 INFO [train.py:1039] (2/4) Epoch 3, batch 2750, loss[loss=0.2462, simple_loss=0.3046, pruned_loss=0.09388, over 24249.00 frames. ], tot_loss[loss=0.288, simple_loss=0.3356, pruned_loss=0.1202, over 4729753.67 frames. ], batch size: 56, lr: 2.93e-02, grad_scale: 16.0 2023-09-28 17:26:02,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:26:02,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-28 17:26:04,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-28 17:26:04,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:26:07,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:26:07,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:26:10,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:26:10,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-28 17:26:10,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:26:15,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:26:17,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 17:26:17,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:26:17,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:26:17,609 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-28 17:26:17,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:26:17,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:26:23,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-28 17:26:27,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:26:27,176 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:26:29,440 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:26:29,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-28 17:26:30,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:26:32,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:26:32,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:26:33,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:26:34,462 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=89293.33333333333, ans=0.125 2023-09-28 17:26:37,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 17:26:37,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 17:26:39,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 17:26:39,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:26:42,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 17:26:47,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:26:49,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 17:26:49,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:26:54,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:26:54,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:26:54,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:26:54,823 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=89360.0, ans=0.1 2023-09-28 17:27:01,046 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:27:03,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:27:03,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-28 17:27:07,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:27:09,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-28 17:27:14,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-28 17:27:17,532 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:27:17,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-28 17:27:19,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:27:23,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:27:23,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-28 17:27:23,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:27:26,168 INFO [train.py:1039] (2/4) Epoch 3, batch 2800, loss[loss=0.2663, simple_loss=0.3299, pruned_loss=0.1014, over 24480.00 frames. ], tot_loss[loss=0.2861, simple_loss=0.3337, pruned_loss=0.1192, over 4725538.61 frames. ], batch size: 66, lr: 2.93e-02, grad_scale: 32.0 2023-09-28 17:27:26,551 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=89493.33333333333, ans=0.035 2023-09-28 17:27:27,576 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.002e+02 2.563e+02 3.005e+02 3.573e+02 5.260e+02, threshold=6.010e+02, percent-clipped=0.0 2023-09-28 17:27:27,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-28 17:27:27,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:27:27,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:27:29,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-28 17:27:29,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:27:29,454 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:27:31,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:27:32,586 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-28 17:27:32,587 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-28 17:27:35,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:27:37,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 17:27:37,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:27:42,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:27:44,022 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-28 17:27:47,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-28 17:27:49,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-28 17:27:50,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:27:50,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:27:50,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:27:54,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:27:54,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:27:54,121 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-28 17:27:56,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:28:04,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:28:05,023 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=89626.66666666667, ans=0.0 2023-09-28 17:28:07,760 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:28:08,151 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=89626.66666666667, ans=0.0 2023-09-28 17:28:10,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:28:10,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:28:11,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:28:17,252 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.27 vs. limit=15.0 2023-09-28 17:28:17,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:28:17,882 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-28 17:28:18,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:28:21,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:28:21,295 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:28:24,431 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:28:25,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:28:30,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:28:32,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:28:32,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:28:32,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 17:28:32,608 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 17:28:32,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 17:28:34,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:28:34,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-28 17:28:34,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:28:36,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:28:36,391 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:28:38,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-28 17:28:39,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:28:39,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:28:40,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:28:43,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-28 17:28:49,329 INFO [train.py:1039] (2/4) Epoch 3, batch 2850, loss[loss=0.2549, simple_loss=0.3176, pruned_loss=0.09612, over 24312.00 frames. ], tot_loss[loss=0.2846, simple_loss=0.3325, pruned_loss=0.1184, over 4706804.33 frames. ], batch size: 61, lr: 2.92e-02, grad_scale: 32.0 2023-09-28 17:28:49,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:28:49,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 17:28:51,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:28:53,440 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:28:56,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:28:56,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:28:56,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:29:01,114 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:29:01,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:29:02,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-28 17:29:02,761 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-28 17:29:10,037 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-28 17:29:10,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:29:12,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-28 17:29:13,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:29:14,975 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=89893.33333333333, ans=0.125 2023-09-28 17:29:16,375 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=89893.33333333333, ans=0.125 2023-09-28 17:29:17,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-28 17:29:17,708 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-28 17:29:19,846 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:29:31,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:29:32,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:29:33,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:29:34,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 17:29:34,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 17:29:34,640 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:29:37,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 17:29:37,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-28 17:29:41,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:29:41,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:29:41,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:29:41,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:29:44,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:29:46,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:29:46,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:29:48,340 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:29:51,954 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:29:52,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:29:52,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:29:53,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:29:58,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:30:00,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-28 17:30:00,811 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-28 17:30:03,744 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 17:30:05,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:30:05,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-28 17:30:05,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:30:06,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:30:06,873 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:30:06,919 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:30:06,920 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-28 17:30:08,373 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-28 17:30:08,379 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:30:08,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:30:13,031 INFO [train.py:1039] (2/4) Epoch 3, batch 2900, loss[loss=0.2708, simple_loss=0.325, pruned_loss=0.1083, over 24280.00 frames. ], tot_loss[loss=0.2852, simple_loss=0.3331, pruned_loss=0.1187, over 4693622.97 frames. ], batch size: 61, lr: 2.92e-02, grad_scale: 32.0 2023-09-28 17:30:13,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-28 17:30:15,033 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.860e+02 2.599e+02 2.941e+02 3.399e+02 5.344e+02, threshold=5.883e+02, percent-clipped=0.0 2023-09-28 17:30:15,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:30:15,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:30:17,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-28 17:30:22,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:30:22,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-28 17:30:22,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-28 17:30:24,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:30:24,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-28 17:30:26,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:30:27,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:30:31,990 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:30:32,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:30:37,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-28 17:30:37,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-28 17:30:38,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-28 17:30:40,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:30:40,773 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=90226.66666666667, ans=0.1 2023-09-28 17:30:43,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-28 17:30:43,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-28 17:30:48,135 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:30:48,139 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-28 17:30:48,164 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:30:49,912 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:30:51,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-28 17:30:51,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:30:51,881 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=90293.33333333333, ans=0.125 2023-09-28 17:30:53,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:30:55,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:30:58,974 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:31:00,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-28 17:31:00,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-28 17:31:00,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:31:04,470 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.51 vs. limit=22.5 2023-09-28 17:31:04,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 17:31:06,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-28 17:31:06,911 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:31:11,990 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:31:12,132 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=90360.0, ans=0.125 2023-09-28 17:31:12,220 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=90360.0, ans=0.0 2023-09-28 17:31:21,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:31:21,805 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-28 17:31:23,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-28 17:31:27,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:31:27,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-28 17:31:28,989 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:31:29,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-28 17:31:35,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:31:36,392 INFO [train.py:1039] (2/4) Epoch 3, batch 2950, loss[loss=0.2959, simple_loss=0.3325, pruned_loss=0.1296, over 23800.00 frames. ], tot_loss[loss=0.2851, simple_loss=0.3337, pruned_loss=0.1183, over 4703156.01 frames. ], batch size: 212, lr: 2.92e-02, grad_scale: 32.0 2023-09-28 17:31:36,596 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-28 17:31:38,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:31:38,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:31:39,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:31:41,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:31:43,295 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-28 17:31:44,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-28 17:31:46,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 17:31:46,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:31:48,138 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=90493.33333333333, ans=0.2 2023-09-28 17:31:52,370 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:31:55,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:31:57,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:31:57,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:31:59,279 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=90560.0, ans=0.125 2023-09-28 17:31:59,290 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=90560.0, ans=0.1 2023-09-28 17:32:02,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:32:02,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:32:04,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:32:06,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:32:06,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:32:07,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-28 17:32:10,376 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.97 vs. limit=15.0 2023-09-28 17:32:12,741 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-28 17:32:12,772 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-28 17:32:12,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:32:14,454 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-28 17:32:14,736 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=90626.66666666667, ans=0.125 2023-09-28 17:32:16,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-28 17:32:16,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:32:16,939 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=90626.66666666667, ans=0.125 2023-09-28 17:32:17,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:32:17,962 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-28 17:32:17,981 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-28 17:32:18,312 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=90626.66666666667, ans=0.2 2023-09-28 17:32:22,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-28 17:32:22,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:32:22,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:32:25,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:32:28,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:32:28,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:32:28,797 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-28 17:32:28,857 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:32:29,142 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=90693.33333333333, ans=0.0 2023-09-28 17:32:30,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-28 17:32:36,047 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:32:36,815 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.80 vs. limit=15.0 2023-09-28 17:32:38,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:32:38,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-28 17:32:38,359 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:32:40,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-28 17:32:43,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:32:43,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:32:45,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:32:46,864 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:32:46,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 17:32:47,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:32:48,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:32:48,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-28 17:32:49,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-28 17:32:49,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:32:52,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:32:54,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:32:54,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-28 17:32:55,374 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=90760.0, ans=0.125 2023-09-28 17:32:56,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:32:59,383 INFO [train.py:1039] (2/4) Epoch 3, batch 3000, loss[loss=0.2942, simple_loss=0.3353, pruned_loss=0.1266, over 23713.00 frames. ], tot_loss[loss=0.2848, simple_loss=0.3334, pruned_loss=0.118, over 4696368.07 frames. ], batch size: 212, lr: 2.91e-02, grad_scale: 32.0 2023-09-28 17:32:59,383 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-28 17:33:13,926 INFO [train.py:1071] (2/4) Epoch 3, validation: loss=0.3974, simple_loss=0.3326, pruned_loss=0.2311, over 1125622.00 frames. 2023-09-28 17:33:13,927 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21046MB 2023-09-28 17:33:15,404 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.839e+02 2.502e+02 2.937e+02 3.419e+02 4.607e+02, threshold=5.874e+02, percent-clipped=0.0 2023-09-28 17:33:15,578 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:33:16,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-28 17:33:18,700 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-28 17:33:20,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-28 17:33:23,191 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:33:23,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:33:24,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-28 17:33:24,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:33:32,947 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 17:33:42,198 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:33:46,204 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=90960.0, ans=0.125 2023-09-28 17:33:48,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-28 17:33:50,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:33:54,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 17:33:54,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:33:54,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:33:57,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:33:57,254 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-28 17:34:00,413 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-28 17:34:03,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:34:03,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 17:34:05,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:34:05,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:34:07,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:34:07,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:34:10,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 17:34:10,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:34:10,223 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:34:11,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:34:13,426 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-28 17:34:14,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:34:14,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:34:16,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:34:21,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:34:21,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:34:22,803 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-28 17:34:22,850 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-28 17:34:25,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:34:25,124 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-28 17:34:25,212 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 17:34:30,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-28 17:34:31,821 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-28 17:34:33,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 17:34:33,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-28 17:34:34,789 INFO [train.py:1039] (2/4) Epoch 3, batch 3050, loss[loss=0.2651, simple_loss=0.3371, pruned_loss=0.09659, over 24489.00 frames. ], tot_loss[loss=0.2857, simple_loss=0.3345, pruned_loss=0.1185, over 4714126.97 frames. ], batch size: 69, lr: 2.91e-02, grad_scale: 32.0 2023-09-28 17:34:34,924 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-28 17:34:34,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 17:34:35,185 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=91160.0, ans=0.05 2023-09-28 17:34:37,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:34:38,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:34:38,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-28 17:34:38,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:34:40,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:34:40,403 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=91160.0, ans=0.2 2023-09-28 17:34:41,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-28 17:34:43,341 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:34:46,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:34:47,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:34:50,961 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:34:54,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-28 17:35:02,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-28 17:35:02,543 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-28 17:35:02,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:35:07,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:35:09,677 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:35:09,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:35:09,931 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=91293.33333333333, ans=0.025 2023-09-28 17:35:11,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:35:14,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:35:16,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-28 17:35:16,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:35:16,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:35:16,327 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:35:17,701 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:35:20,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:35:22,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:35:22,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-28 17:35:23,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:35:23,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 17:35:24,346 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=91360.0, ans=0.125 2023-09-28 17:35:27,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:35:27,619 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:35:27,723 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:35:29,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:35:33,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:35:33,986 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:35:39,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:35:40,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:35:40,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:35:42,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:35:42,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 17:35:42,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:35:44,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-28 17:35:46,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:35:46,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:35:47,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-28 17:35:50,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:35:55,372 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:35:57,534 INFO [train.py:1039] (2/4) Epoch 3, batch 3100, loss[loss=0.2928, simple_loss=0.3392, pruned_loss=0.1232, over 23404.00 frames. ], tot_loss[loss=0.2855, simple_loss=0.3347, pruned_loss=0.1182, over 4715294.02 frames. ], batch size: 93, lr: 2.90e-02, grad_scale: 16.0 2023-09-28 17:35:57,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:35:59,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 17:36:00,683 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.012e+02 2.573e+02 3.095e+02 3.783e+02 7.787e+02, threshold=6.189e+02, percent-clipped=2.0 2023-09-28 17:36:00,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-28 17:36:01,088 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=91493.33333333333, ans=0.125 2023-09-28 17:36:03,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-28 17:36:05,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-28 17:36:07,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:36:10,155 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:36:12,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:36:13,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-28 17:36:19,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:36:25,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-28 17:36:29,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 17:36:31,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:36:32,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:36:33,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:36:33,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-28 17:36:35,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:36:35,159 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-28 17:36:35,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:36:36,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:36:39,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-28 17:36:39,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:36:43,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:36:43,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-28 17:36:45,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-28 17:36:47,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:36:47,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:36:50,436 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:36:50,453 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:36:50,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:36:54,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-28 17:36:54,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:36:54,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:36:55,625 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:36:55,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:36:55,648 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 17:37:00,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:37:01,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-28 17:37:05,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:37:05,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-28 17:37:06,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:37:07,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:37:08,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-28 17:37:19,672 INFO [train.py:1039] (2/4) Epoch 3, batch 3150, loss[loss=0.3094, simple_loss=0.3469, pruned_loss=0.1359, over 23823.00 frames. ], tot_loss[loss=0.2849, simple_loss=0.3334, pruned_loss=0.1182, over 4706154.03 frames. ], batch size: 164, lr: 2.90e-02, grad_scale: 16.0 2023-09-28 17:37:19,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-28 17:37:20,150 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=91826.66666666667, ans=0.0 2023-09-28 17:37:22,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:37:23,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:37:25,245 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:37:25,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:37:25,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-28 17:37:27,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:37:27,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-28 17:37:28,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-28 17:37:30,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:37:32,299 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-28 17:37:36,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-28 17:37:36,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:37:39,104 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-28 17:37:39,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-28 17:37:40,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-28 17:37:40,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-28 17:37:40,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-28 17:37:40,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:37:40,934 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:37:42,593 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:37:45,484 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-28 17:37:46,285 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.96 vs. limit=15.0 2023-09-28 17:37:46,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:37:47,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:37:48,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:37:50,122 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-28 17:37:54,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-28 17:37:54,189 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:37:57,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-28 17:37:57,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:37:59,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-28 17:38:00,268 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=18.58 vs. limit=22.5 2023-09-28 17:38:02,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-28 17:38:04,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:38:04,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 17:38:04,813 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 17:38:06,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:38:06,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 17:38:06,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-28 17:38:06,503 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=91960.0, ans=0.0 2023-09-28 17:38:07,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-28 17:38:09,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-28 17:38:09,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 17:38:09,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:38:11,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:38:11,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:38:13,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-28 17:38:13,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:38:14,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-28 17:38:16,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:38:17,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-28 17:38:18,025 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=92026.66666666667, ans=0.125 2023-09-28 17:38:19,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-28 17:38:20,811 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:38:20,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:38:20,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-28 17:38:22,521 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 17:38:22,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:38:25,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:38:26,099 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=92093.33333333333, ans=0.0 2023-09-28 17:38:27,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:38:27,870 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:38:34,156 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:38:34,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:38:37,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-28 17:38:43,052 INFO [train.py:1039] (2/4) Epoch 3, batch 3200, loss[loss=0.2488, simple_loss=0.3115, pruned_loss=0.09306, over 24651.00 frames. ], tot_loss[loss=0.2837, simple_loss=0.3322, pruned_loss=0.1176, over 4697871.07 frames. ], batch size: 65, lr: 2.90e-02, grad_scale: 32.0 2023-09-28 17:38:43,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:38:43,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-28 17:38:46,886 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.841e+02 2.531e+02 2.998e+02 3.452e+02 5.958e+02, threshold=5.995e+02, percent-clipped=0.0 2023-09-28 17:38:47,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:38:48,694 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:38:48,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-28 17:38:51,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:38:54,895 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-28 17:38:58,262 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=92226.66666666667, ans=0.2 2023-09-28 17:38:59,461 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:39:08,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:39:17,813 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=92293.33333333333, ans=0.1 2023-09-28 17:39:19,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-28 17:39:21,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:39:21,648 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=92293.33333333333, ans=0.0 2023-09-28 17:39:24,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-28 17:39:24,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 17:39:24,971 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=92293.33333333333, ans=0.0 2023-09-28 17:39:26,307 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=92293.33333333333, ans=0.2 2023-09-28 17:39:27,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:39:27,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 17:39:29,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:39:32,462 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-28 17:39:34,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-28 17:39:38,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-28 17:39:43,356 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-28 17:39:44,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-28 17:39:51,157 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:39:51,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:39:51,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:39:51,320 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-28 17:39:51,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 17:39:55,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:39:56,824 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-28 17:39:56,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-28 17:39:58,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-28 17:39:59,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-28 17:40:01,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:40:04,682 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=92493.33333333333, ans=0.2 2023-09-28 17:40:05,800 INFO [train.py:1039] (2/4) Epoch 3, batch 3250, loss[loss=0.2658, simple_loss=0.3183, pruned_loss=0.1066, over 24506.00 frames. ], tot_loss[loss=0.2842, simple_loss=0.3326, pruned_loss=0.1179, over 4702436.90 frames. ], batch size: 63, lr: 2.89e-02, grad_scale: 32.0 2023-09-28 17:40:05,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-28 17:40:05,897 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-28 17:40:05,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:40:05,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:40:07,489 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-28 17:40:09,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 17:40:10,290 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=92493.33333333333, ans=0.125 2023-09-28 17:40:14,527 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=92493.33333333333, ans=0.0 2023-09-28 17:40:15,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:40:15,982 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=92493.33333333333, ans=0.1 2023-09-28 17:40:22,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:40:22,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-28 17:40:23,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:40:23,625 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:40:25,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:40:27,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:40:27,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 17:40:30,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:40:30,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:40:30,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:40:30,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:40:30,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:40:30,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:40:32,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:40:33,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:40:35,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:40:35,759 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:40:37,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:40:37,304 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:40:37,319 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:40:40,683 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=92626.66666666667, ans=0.0 2023-09-28 17:40:42,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-28 17:40:43,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:40:43,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:40:46,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:40:46,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-28 17:40:46,526 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=92626.66666666667, ans=0.1 2023-09-28 17:40:47,975 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=92626.66666666667, ans=0.1 2023-09-28 17:40:52,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 17:41:00,501 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:41:00,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:41:00,552 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-28 17:41:00,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:41:02,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 17:41:02,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:41:02,297 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=92693.33333333333, ans=0.1 2023-09-28 17:41:05,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-28 17:41:05,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-28 17:41:06,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:41:06,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:41:08,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:41:08,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-28 17:41:09,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:41:12,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:41:12,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:41:15,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-28 17:41:15,164 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:41:18,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:41:18,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-28 17:41:23,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:41:23,760 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-28 17:41:25,382 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-28 17:41:25,544 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=92760.0, ans=0.125 2023-09-28 17:41:26,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-28 17:41:26,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:41:29,901 INFO [train.py:1039] (2/4) Epoch 3, batch 3300, loss[loss=0.2841, simple_loss=0.3416, pruned_loss=0.1133, over 24501.00 frames. ], tot_loss[loss=0.2861, simple_loss=0.3339, pruned_loss=0.1191, over 4700202.67 frames. ], batch size: 66, lr: 2.89e-02, grad_scale: 32.0 2023-09-28 17:41:30,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:41:31,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:41:31,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:41:33,735 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.895e+02 2.576e+02 3.097e+02 3.556e+02 6.978e+02, threshold=6.193e+02, percent-clipped=2.0 2023-09-28 17:41:34,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 17:41:35,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 17:41:37,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:41:40,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:41:40,561 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=92826.66666666667, ans=0.0 2023-09-28 17:41:43,224 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-28 17:41:43,519 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=92826.66666666667, ans=0.2 2023-09-28 17:41:44,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:41:44,680 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:41:47,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:41:47,688 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-28 17:41:49,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:41:49,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 17:41:49,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=92893.33333333333, ans=0.125 2023-09-28 17:41:49,543 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 17:41:52,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 17:41:52,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:41:52,148 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-28 17:41:58,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:41:58,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-28 17:42:01,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:42:01,602 WARNING [train.py:1197] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-28 17:42:03,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-28 17:42:03,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:42:04,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:42:06,477 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-28 17:42:08,105 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=92960.0, ans=0.0 2023-09-28 17:42:09,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-28 17:42:09,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:42:13,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-28 17:42:17,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:42:19,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-28 17:42:20,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:42:22,925 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.91 vs. limit=15.0 2023-09-28 17:42:23,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:42:23,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:42:23,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:42:23,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:42:23,848 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=93026.66666666667, ans=0.035 2023-09-28 17:42:26,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:42:26,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:42:26,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:42:28,294 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-28 17:42:31,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-28 17:42:32,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-28 17:42:32,589 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:42:32,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:42:34,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:42:34,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:42:36,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 17:42:37,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:42:37,857 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-28 17:42:37,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:42:39,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 17:42:42,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-28 17:42:44,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:42:44,206 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:42:47,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 17:42:47,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:42:50,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:42:52,057 INFO [train.py:1039] (2/4) Epoch 3, batch 3350, loss[loss=0.3372, simple_loss=0.3539, pruned_loss=0.1602, over 23560.00 frames. ], tot_loss[loss=0.2856, simple_loss=0.334, pruned_loss=0.1186, over 4712755.64 frames. ], batch size: 256, lr: 2.88e-02, grad_scale: 32.0 2023-09-28 17:42:52,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:42:52,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:42:53,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:42:55,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:42:56,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:42:59,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:43:02,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:43:05,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:43:05,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:43:06,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-28 17:43:08,354 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-28 17:43:08,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:43:14,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-28 17:43:14,770 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-28 17:43:14,911 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:43:16,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:43:16,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:43:17,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-28 17:43:17,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:43:17,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:43:18,755 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=11.65 vs. limit=15.0 2023-09-28 17:43:20,907 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:43:22,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:43:22,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:43:24,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:43:27,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:43:29,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:43:30,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:43:33,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:43:35,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:43:39,120 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:43:39,134 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:43:42,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:43:45,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-28 17:43:45,511 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 17:43:45,563 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-28 17:43:45,612 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:43:47,652 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-28 17:43:49,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:43:49,513 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=93360.0, ans=0.0 2023-09-28 17:43:50,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:43:50,969 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=93360.0, ans=0.125 2023-09-28 17:43:57,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:43:57,853 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-28 17:43:57,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 17:43:59,410 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:44:00,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:44:02,586 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=93426.66666666667, ans=0.0 2023-09-28 17:44:05,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:44:08,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-28 17:44:08,944 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=93426.66666666667, ans=0.07 2023-09-28 17:44:10,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 17:44:10,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:44:11,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:44:12,979 INFO [train.py:1039] (2/4) Epoch 3, batch 3400, loss[loss=0.3359, simple_loss=0.3703, pruned_loss=0.1508, over 23940.00 frames. ], tot_loss[loss=0.2865, simple_loss=0.335, pruned_loss=0.1189, over 4719995.86 frames. ], batch size: 196, lr: 2.88e-02, grad_scale: 32.0 2023-09-28 17:44:13,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-28 17:44:13,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:44:13,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-28 17:44:16,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:44:16,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:44:16,203 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-28 17:44:17,385 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.863e+02 2.557e+02 2.981e+02 3.725e+02 6.496e+02, threshold=5.961e+02, percent-clipped=1.0 2023-09-28 17:44:17,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:44:18,986 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-28 17:44:22,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-28 17:44:22,685 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-28 17:44:22,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:44:28,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:44:28,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 17:44:28,107 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:44:29,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:44:35,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:44:37,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-28 17:44:43,502 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:44:45,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:44:46,449 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:44:46,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-28 17:44:55,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:45:00,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-28 17:45:04,394 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:45:04,617 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=93693.33333333333, ans=0.125 2023-09-28 17:45:05,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:45:05,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-28 17:45:07,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:45:07,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:45:07,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:45:08,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:45:11,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:45:16,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 17:45:16,609 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:45:22,667 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:45:24,205 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-28 17:45:31,757 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=93760.0, ans=0.2 2023-09-28 17:45:33,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 17:45:36,559 INFO [train.py:1039] (2/4) Epoch 3, batch 3450, loss[loss=0.2627, simple_loss=0.3127, pruned_loss=0.1064, over 24429.00 frames. ], tot_loss[loss=0.2855, simple_loss=0.335, pruned_loss=0.118, over 4729191.58 frames. ], batch size: 58, lr: 2.88e-02, grad_scale: 32.0 2023-09-28 17:45:38,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-28 17:45:42,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-28 17:45:43,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:45:45,111 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:45:45,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-28 17:45:46,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:45:49,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-28 17:45:55,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:45:55,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:45:55,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-28 17:45:55,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:45:59,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:46:04,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-28 17:46:12,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-28 17:46:12,133 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 17:46:12,192 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:46:13,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:46:20,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-28 17:46:21,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 17:46:24,343 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=94026.66666666667, ans=0.0 2023-09-28 17:46:25,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:46:25,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:46:25,827 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=94026.66666666667, ans=0.125 2023-09-28 17:46:27,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-28 17:46:28,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:46:30,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-28 17:46:30,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:46:30,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:46:33,477 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.25 vs. limit=15.0 2023-09-28 17:46:34,340 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=94026.66666666667, ans=0.2 2023-09-28 17:46:35,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:46:39,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-28 17:46:42,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:46:49,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:46:49,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:46:52,757 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:46:57,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:46:57,506 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:46:57,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:46:58,895 INFO [train.py:1039] (2/4) Epoch 3, batch 3500, loss[loss=0.2981, simple_loss=0.3069, pruned_loss=0.1446, over 19470.00 frames. ], tot_loss[loss=0.2834, simple_loss=0.3329, pruned_loss=0.117, over 4719995.82 frames. ], batch size: 388, lr: 2.87e-02, grad_scale: 16.0 2023-09-28 17:46:58,962 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:47:03,554 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.085e+02 2.532e+02 3.066e+02 3.931e+02 6.870e+02, threshold=6.132e+02, percent-clipped=2.0 2023-09-28 17:47:03,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:47:07,442 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:47:07,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-28 17:47:09,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 17:47:14,238 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-28 17:47:15,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:47:15,946 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-28 17:47:23,406 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:47:23,559 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:47:25,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:47:25,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:47:25,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-28 17:47:25,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:47:26,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:47:26,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-28 17:47:28,483 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=94226.66666666667, ans=0.125 2023-09-28 17:47:29,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:47:29,787 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-28 17:47:29,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:47:33,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:47:34,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-28 17:47:36,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:47:39,651 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:47:41,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:47:43,456 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:47:45,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:47:45,054 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:47:45,267 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-28 17:47:46,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-28 17:47:48,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-28 17:47:49,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:47:49,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:47:50,249 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=94360.0, ans=0.2 2023-09-28 17:47:52,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:47:53,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 17:47:56,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 17:47:56,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:48:02,892 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:48:04,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-28 17:48:04,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-28 17:48:04,499 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:48:06,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:48:07,699 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:48:09,254 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:48:12,783 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-28 17:48:12,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:48:14,459 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:48:16,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-28 17:48:18,124 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-28 17:48:21,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:48:22,668 INFO [train.py:1039] (2/4) Epoch 3, batch 3550, loss[loss=0.2555, simple_loss=0.2701, pruned_loss=0.1205, over 19152.00 frames. ], tot_loss[loss=0.2826, simple_loss=0.3318, pruned_loss=0.1167, over 4710333.86 frames. ], batch size: 388, lr: 2.87e-02, grad_scale: 16.0 2023-09-28 17:48:22,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:48:22,826 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:48:24,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:48:27,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:48:39,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:48:42,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 17:48:43,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:48:45,324 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:48:46,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:48:47,266 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=94560.0, ans=0.1 2023-09-28 17:48:49,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:48:49,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 17:48:52,815 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:48:52,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:48:52,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:48:52,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-28 17:48:54,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 17:48:59,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:48:59,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:49:02,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:49:02,159 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:49:04,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:49:04,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-28 17:49:04,277 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:49:04,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:49:06,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 17:49:12,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:49:14,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:49:14,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:49:16,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-28 17:49:17,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:49:19,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-28 17:49:21,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:49:23,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:49:24,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:49:26,122 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-28 17:49:27,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:49:34,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:49:35,967 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-28 17:49:36,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:49:43,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:49:44,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-28 17:49:46,232 INFO [train.py:1039] (2/4) Epoch 3, batch 3600, loss[loss=0.2926, simple_loss=0.3354, pruned_loss=0.1249, over 23430.00 frames. ], tot_loss[loss=0.2821, simple_loss=0.3316, pruned_loss=0.1163, over 4710868.81 frames. ], batch size: 119, lr: 2.86e-02, grad_scale: 32.0 2023-09-28 17:49:50,960 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.901e+02 2.527e+02 2.760e+02 3.413e+02 5.643e+02, threshold=5.521e+02, percent-clipped=0.0 2023-09-28 17:49:51,149 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-28 17:49:52,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:49:54,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:49:55,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:49:55,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:49:57,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:50:00,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:50:01,287 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=94893.33333333333, ans=0.125 2023-09-28 17:50:02,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:50:02,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:50:04,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:50:04,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:50:04,181 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-28 17:50:09,150 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 17:50:10,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:50:12,501 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=94893.33333333333, ans=0.125 2023-09-28 17:50:14,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:50:17,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:50:19,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 17:50:19,506 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:50:19,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-28 17:50:21,008 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:50:21,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:50:22,816 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:50:25,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:50:27,399 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:50:29,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:50:29,828 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=94960.0, ans=0.1 2023-09-28 17:50:30,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-28 17:50:35,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:50:37,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 17:50:37,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-28 17:50:42,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:50:48,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:50:53,269 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:50:59,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-28 17:50:59,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 17:50:59,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-28 17:51:01,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-28 17:51:01,201 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-28 17:51:03,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:51:03,731 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=95093.33333333333, ans=0.125 2023-09-28 17:51:04,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:51:06,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-28 17:51:06,487 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:51:07,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:51:07,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:51:08,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-28 17:51:09,411 INFO [train.py:1039] (2/4) Epoch 3, batch 3650, loss[loss=0.354, simple_loss=0.373, pruned_loss=0.1675, over 19545.00 frames. ], tot_loss[loss=0.2814, simple_loss=0.3311, pruned_loss=0.1158, over 4706980.01 frames. ], batch size: 388, lr: 2.86e-02, grad_scale: 32.0 2023-09-28 17:51:09,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-28 17:51:12,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:51:14,120 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-28 17:51:19,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-28 17:51:21,494 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:51:24,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-28 17:51:24,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-28 17:51:29,749 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:51:29,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:51:29,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 17:51:32,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-28 17:51:34,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:51:34,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-28 17:51:36,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-28 17:51:37,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:51:38,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-28 17:51:38,244 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=95226.66666666667, ans=0.125 2023-09-28 17:51:39,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 17:51:41,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:51:41,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:51:41,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:51:44,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-28 17:51:44,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-28 17:51:45,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:51:48,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-28 17:51:49,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:51:49,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-28 17:51:56,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:51:57,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:51:57,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-28 17:51:59,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:52:00,070 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=95360.0, ans=0.0 2023-09-28 17:52:01,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:52:03,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:52:06,132 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:52:08,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:52:08,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:52:10,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 17:52:12,438 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:52:12,533 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:52:18,685 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-28 17:52:23,606 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:52:23,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:52:24,050 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=95426.66666666667, ans=0.125 2023-09-28 17:52:25,189 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-28 17:52:25,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:52:26,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:52:28,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:52:30,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-28 17:52:30,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:52:31,693 INFO [train.py:1039] (2/4) Epoch 3, batch 3700, loss[loss=0.2851, simple_loss=0.322, pruned_loss=0.1242, over 23905.00 frames. ], tot_loss[loss=0.2835, simple_loss=0.3328, pruned_loss=0.1171, over 4705607.15 frames. ], batch size: 195, lr: 2.86e-02, grad_scale: 32.0 2023-09-28 17:52:33,395 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 17:52:35,067 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:52:37,004 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.688e+02 2.521e+02 2.916e+02 3.663e+02 5.180e+02, threshold=5.833e+02, percent-clipped=0.0 2023-09-28 17:52:37,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:52:38,758 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:52:38,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-28 17:52:38,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:52:39,059 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=95493.33333333333, ans=0.125 2023-09-28 17:52:40,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 17:52:40,277 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 17:52:43,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 17:52:46,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:52:48,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:52:49,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:52:49,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:52:51,243 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 17:52:52,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:52:55,027 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-28 17:53:01,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:53:01,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 17:53:03,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 17:53:04,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-28 17:53:04,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:53:08,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:53:09,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-28 17:53:13,223 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:53:13,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:53:16,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:53:18,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 17:53:21,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 17:53:26,291 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:53:26,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-28 17:53:27,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:53:27,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-28 17:53:29,490 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=95693.33333333333, ans=0.125 2023-09-28 17:53:31,676 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=95693.33333333333, ans=0.0 2023-09-28 17:53:33,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:53:33,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-28 17:53:36,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:53:36,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-28 17:53:39,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:53:39,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-28 17:53:39,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:53:39,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:53:45,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:53:45,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-28 17:53:46,311 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=95760.0, ans=0.2 2023-09-28 17:53:47,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-28 17:53:47,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:53:47,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:53:49,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:53:51,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:53:54,565 INFO [train.py:1039] (2/4) Epoch 3, batch 3750, loss[loss=0.2931, simple_loss=0.3305, pruned_loss=0.1278, over 23594.00 frames. ], tot_loss[loss=0.2844, simple_loss=0.3338, pruned_loss=0.1175, over 4701072.08 frames. ], batch size: 256, lr: 2.85e-02, grad_scale: 32.0 2023-09-28 17:53:54,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:53:54,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:53:57,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:53:57,960 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=95826.66666666667, ans=0.035 2023-09-28 17:53:59,467 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=95826.66666666667, ans=0.2 2023-09-28 17:54:00,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-28 17:54:00,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 17:54:04,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-28 17:54:04,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-28 17:54:05,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:54:06,140 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=95826.66666666667, ans=0.0 2023-09-28 17:54:06,172 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=95826.66666666667, ans=0.125 2023-09-28 17:54:07,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:54:08,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:54:10,541 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=95893.33333333333, ans=0.125 2023-09-28 17:54:11,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:54:14,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:54:18,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:54:18,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 17:54:20,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:54:20,741 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=95893.33333333333, ans=0.125 2023-09-28 17:54:23,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:54:23,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-28 17:54:25,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:54:27,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:54:27,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:54:30,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-28 17:54:35,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-28 17:54:36,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=95960.0, ans=0.125 2023-09-28 17:54:37,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:54:37,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:54:37,775 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=95960.0, ans=0.125 2023-09-28 17:54:40,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:54:45,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:54:45,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-28 17:54:50,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-28 17:54:52,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:54:52,668 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=96026.66666666667, ans=0.04949747468305833 2023-09-28 17:54:57,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:54:57,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:55:00,804 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:55:04,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 17:55:06,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-28 17:55:06,457 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=96093.33333333333, ans=0.0 2023-09-28 17:55:09,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 17:55:09,925 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.79 vs. limit=6.0 2023-09-28 17:55:11,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:55:14,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:55:17,121 INFO [train.py:1039] (2/4) Epoch 3, batch 3800, loss[loss=0.2584, simple_loss=0.3318, pruned_loss=0.09252, over 24708.00 frames. ], tot_loss[loss=0.2834, simple_loss=0.3331, pruned_loss=0.1168, over 4705714.80 frames. ], batch size: 73, lr: 2.85e-02, grad_scale: 16.0 2023-09-28 17:55:23,803 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.013e+02 2.428e+02 2.901e+02 3.496e+02 5.183e+02, threshold=5.803e+02, percent-clipped=0.0 2023-09-28 17:55:23,971 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-28 17:55:26,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:55:27,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 17:55:27,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-28 17:55:28,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:55:30,305 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:55:32,287 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-28 17:55:33,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 17:55:33,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:55:36,019 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 17:55:37,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:55:37,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:55:37,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:55:39,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-28 17:55:41,029 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=96226.66666666667, ans=0.125 2023-09-28 17:55:43,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-28 17:55:43,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:55:46,132 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=96226.66666666667, ans=0.125 2023-09-28 17:55:47,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:55:49,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:55:49,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 17:55:52,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-28 17:55:52,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:55:56,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:55:56,387 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=96293.33333333333, ans=0.125 2023-09-28 17:55:57,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:56:03,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 17:56:03,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-28 17:56:05,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:56:09,602 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=96360.0, ans=0.09899494936611666 2023-09-28 17:56:12,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:56:16,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:56:20,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-28 17:56:22,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-28 17:56:23,607 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:56:25,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:56:25,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:56:26,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-28 17:56:30,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-28 17:56:31,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-28 17:56:31,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:56:33,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:56:36,495 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=96426.66666666667, ans=0.125 2023-09-28 17:56:39,348 INFO [train.py:1039] (2/4) Epoch 3, batch 3850, loss[loss=0.2472, simple_loss=0.3093, pruned_loss=0.0926, over 24503.00 frames. ], tot_loss[loss=0.2829, simple_loss=0.3318, pruned_loss=0.117, over 4710498.97 frames. ], batch size: 66, lr: 2.84e-02, grad_scale: 16.0 2023-09-28 17:56:39,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:56:41,577 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 17:56:46,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:56:47,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-28 17:56:48,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 17:56:48,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:56:53,241 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 17:56:58,217 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:56:59,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-28 17:57:01,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-28 17:57:08,349 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:57:09,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:57:11,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:57:13,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:57:16,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:57:18,013 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:57:20,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:57:20,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:57:20,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:57:21,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:57:21,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:57:21,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:57:22,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-28 17:57:23,365 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-28 17:57:23,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:57:23,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:57:25,333 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=96626.66666666667, ans=0.125 2023-09-28 17:57:25,397 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=96626.66666666667, ans=0.0 2023-09-28 17:57:26,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:57:26,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:57:26,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-28 17:57:30,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-28 17:57:31,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:57:33,457 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-28 17:57:36,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-28 17:57:42,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:57:43,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:57:48,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:57:49,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-28 17:57:51,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-28 17:57:53,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:57:54,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:57:58,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 17:57:58,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:57:59,555 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:58:01,026 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:58:01,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:58:01,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-28 17:58:01,430 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=96826.66666666667, ans=0.09899494936611666 2023-09-28 17:58:02,457 INFO [train.py:1039] (2/4) Epoch 3, batch 3900, loss[loss=0.3039, simple_loss=0.3369, pruned_loss=0.1355, over 23821.00 frames. ], tot_loss[loss=0.2812, simple_loss=0.3308, pruned_loss=0.1158, over 4716165.95 frames. ], batch size: 164, lr: 2.84e-02, grad_scale: 16.0 2023-09-28 17:58:02,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:58:04,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-28 17:58:04,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:58:04,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:58:07,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:58:07,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:58:09,131 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.036e+02 2.471e+02 2.886e+02 3.509e+02 5.748e+02, threshold=5.772e+02, percent-clipped=0.0 2023-09-28 17:58:09,306 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:58:10,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:58:10,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:58:10,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:58:10,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-28 17:58:12,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:58:14,326 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=96826.66666666667, ans=0.125 2023-09-28 17:58:15,490 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:58:15,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 17:58:15,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:58:17,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:58:20,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 17:58:21,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:58:25,061 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-28 17:58:26,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-28 17:58:26,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:58:28,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-28 17:58:28,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:58:29,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-28 17:58:31,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-28 17:58:33,164 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=96893.33333333333, ans=0.1 2023-09-28 17:58:34,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:58:36,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:58:36,114 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 17:58:37,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:58:41,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:58:43,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:58:45,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:58:45,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:58:46,156 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=96960.0, ans=0.125 2023-09-28 17:58:47,552 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:58:51,502 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=97026.66666666667, ans=0.0 2023-09-28 17:58:54,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:58:55,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:59:03,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 17:59:05,273 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:59:15,312 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:59:18,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:59:19,057 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-28 17:59:20,211 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-28 17:59:20,231 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:59:20,570 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=97093.33333333333, ans=0.2 2023-09-28 17:59:21,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-28 17:59:23,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:59:25,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-28 17:59:27,206 INFO [train.py:1039] (2/4) Epoch 3, batch 3950, loss[loss=0.2548, simple_loss=0.3185, pruned_loss=0.09553, over 24432.00 frames. ], tot_loss[loss=0.2801, simple_loss=0.3299, pruned_loss=0.1152, over 4703498.36 frames. ], batch size: 63, lr: 2.84e-02, grad_scale: 16.0 2023-09-28 17:59:33,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:59:34,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-28 17:59:35,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:59:36,057 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=97160.0, ans=0.125 2023-09-28 17:59:38,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-28 17:59:39,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:59:40,025 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=97160.0, ans=0.0 2023-09-28 17:59:44,600 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-28 17:59:45,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 17:59:46,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-28 17:59:47,495 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-28 17:59:47,544 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:59:51,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:59:52,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-28 17:59:52,007 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:59:55,585 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-28 17:59:55,886 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=97226.66666666667, ans=0.0 2023-09-28 17:59:57,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:59:57,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 17:59:57,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 17:59:58,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 17:59:58,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-28 18:00:12,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:00:12,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:00:17,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-28 18:00:22,300 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=97360.0, ans=0.125 2023-09-28 18:00:23,642 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-28 18:00:23,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-28 18:00:23,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:00:25,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:00:30,787 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=97360.0, ans=0.125 2023-09-28 18:00:33,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-28 18:00:33,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-28 18:00:33,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:00:33,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:00:35,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-28 18:00:41,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:00:42,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:00:46,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-28 18:00:50,719 INFO [train.py:1039] (2/4) Epoch 3, batch 4000, loss[loss=0.2911, simple_loss=0.3273, pruned_loss=0.1274, over 23582.00 frames. ], tot_loss[loss=0.2812, simple_loss=0.3307, pruned_loss=0.1158, over 4711669.07 frames. ], batch size: 232, lr: 2.83e-02, grad_scale: 32.0 2023-09-28 18:00:55,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:00:56,961 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.085e+02 2.653e+02 3.032e+02 3.720e+02 5.555e+02, threshold=6.065e+02, percent-clipped=0.0 2023-09-28 18:01:02,748 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=97493.33333333333, ans=0.1 2023-09-28 18:01:03,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:01:09,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:01:09,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:01:09,565 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=97560.0, ans=0.0 2023-09-28 18:01:10,949 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:01:10,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-28 18:01:12,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-28 18:01:12,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-28 18:01:12,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:01:12,777 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=97560.0, ans=0.125 2023-09-28 18:01:14,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-28 18:01:16,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:01:19,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:01:20,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:01:20,603 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:01:20,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:01:20,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-28 18:01:22,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:01:24,496 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-28 18:01:25,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:01:27,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:01:30,360 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-28 18:01:31,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 18:01:31,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:01:38,763 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-28 18:01:38,837 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:01:40,381 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 18:01:41,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:01:41,649 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-28 18:01:42,265 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.65 vs. limit=22.5 2023-09-28 18:01:43,132 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:01:43,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-28 18:01:43,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:01:45,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:01:47,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:01:49,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:01:49,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-28 18:01:49,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:01:49,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-28 18:01:50,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:01:52,578 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-28 18:01:58,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:02:03,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 18:02:05,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 18:02:06,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:02:06,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:02:08,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:02:12,359 INFO [train.py:1039] (2/4) Epoch 3, batch 4050, loss[loss=0.2638, simple_loss=0.3292, pruned_loss=0.09919, over 24361.00 frames. ], tot_loss[loss=0.2803, simple_loss=0.33, pruned_loss=0.1153, over 4719238.10 frames. ], batch size: 77, lr: 2.83e-02, grad_scale: 32.0 2023-09-28 18:02:16,154 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:02:19,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-28 18:02:19,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-28 18:02:20,896 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 18:02:22,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:02:24,486 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:02:24,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-28 18:02:27,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:02:30,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:02:31,049 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:02:32,447 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 18:02:34,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:02:35,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:02:39,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:02:40,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-28 18:02:43,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 18:02:45,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-28 18:02:45,679 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-28 18:02:50,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:02:57,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-28 18:02:59,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:03:01,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:03:04,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:03:05,585 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:03:05,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:03:08,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:03:12,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-28 18:03:12,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 18:03:13,710 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:03:13,974 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=98026.66666666667, ans=0.2 2023-09-28 18:03:15,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-28 18:03:17,154 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=98093.33333333333, ans=0.1 2023-09-28 18:03:18,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:03:25,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-28 18:03:27,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:03:27,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 18:03:27,604 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=98093.33333333333, ans=0.125 2023-09-28 18:03:28,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-28 18:03:30,261 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-28 18:03:30,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:03:32,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:03:32,801 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=98093.33333333333, ans=0.0 2023-09-28 18:03:33,954 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:03:33,989 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:03:35,372 INFO [train.py:1039] (2/4) Epoch 3, batch 4100, loss[loss=0.2931, simple_loss=0.3381, pruned_loss=0.124, over 23523.00 frames. ], tot_loss[loss=0.2806, simple_loss=0.3304, pruned_loss=0.1153, over 4729150.57 frames. ], batch size: 134, lr: 2.82e-02, grad_scale: 32.0 2023-09-28 18:03:42,059 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.855e+02 2.385e+02 2.703e+02 3.359e+02 5.329e+02, threshold=5.406e+02, percent-clipped=0.0 2023-09-28 18:03:43,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-28 18:03:45,189 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-28 18:03:47,300 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.74 vs. limit=15.0 2023-09-28 18:03:48,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-28 18:03:49,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-28 18:03:49,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:03:49,811 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:03:49,853 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:03:51,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:03:52,741 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-28 18:03:55,964 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:03:56,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:03:58,085 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:03:58,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:03:59,044 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.03 vs. limit=6.0 2023-09-28 18:04:02,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 18:04:02,893 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:04:02,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:04:04,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-28 18:04:05,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:04:05,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:04:05,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:04:05,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:04:06,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-28 18:04:09,860 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:04:11,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-28 18:04:11,569 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=98293.33333333333, ans=0.0 2023-09-28 18:04:12,850 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:04:13,059 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=98293.33333333333, ans=0.0 2023-09-28 18:04:16,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:04:16,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-28 18:04:18,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:04:18,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:04:18,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:04:19,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-28 18:04:22,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:04:24,238 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:04:25,900 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-28 18:04:27,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:04:27,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-28 18:04:29,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:04:31,478 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=98360.0, ans=0.05 2023-09-28 18:04:35,815 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:04:39,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:04:39,710 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:04:40,391 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.77 vs. limit=12.0 2023-09-28 18:04:45,446 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=98426.66666666667, ans=0.125 2023-09-28 18:04:48,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:04:48,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:04:51,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:04:53,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:04:57,940 INFO [train.py:1039] (2/4) Epoch 3, batch 4150, loss[loss=0.2791, simple_loss=0.3376, pruned_loss=0.1103, over 24649.00 frames. ], tot_loss[loss=0.28, simple_loss=0.33, pruned_loss=0.115, over 4724212.72 frames. ], batch size: 68, lr: 2.82e-02, grad_scale: 32.0 2023-09-28 18:04:58,081 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-28 18:04:59,638 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 18:04:59,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:04:59,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:05:04,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-28 18:05:04,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:05:05,561 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.72 vs. limit=6.0 2023-09-28 18:05:06,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-28 18:05:07,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-28 18:05:08,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-28 18:05:09,219 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=98493.33333333333, ans=0.95 2023-09-28 18:05:10,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:05:14,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:05:15,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:05:15,717 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.27 vs. limit=15.0 2023-09-28 18:05:18,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:05:19,554 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:05:19,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-28 18:05:21,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 18:05:21,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:05:23,269 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-28 18:05:25,278 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.52 vs. limit=15.0 2023-09-28 18:05:27,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:05:29,443 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=98626.66666666667, ans=0.1 2023-09-28 18:05:29,850 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.35 vs. limit=15.0 2023-09-28 18:05:32,298 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:05:33,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-28 18:05:35,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-28 18:05:36,151 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.09 vs. limit=22.5 2023-09-28 18:05:36,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:05:36,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-28 18:05:36,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:05:36,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:05:42,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:05:42,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:05:46,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-28 18:05:49,176 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=98693.33333333333, ans=0.1 2023-09-28 18:05:50,287 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-28 18:05:51,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:05:53,989 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-28 18:05:54,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:05:57,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-28 18:05:57,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:05:58,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:06:00,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:06:01,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-28 18:06:01,635 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:06:01,638 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-28 18:06:03,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 18:06:06,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-28 18:06:06,465 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:06:06,471 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 18:06:06,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 18:06:08,002 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-28 18:06:08,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:06:08,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 18:06:09,494 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:06:11,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:06:11,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-28 18:06:13,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-28 18:06:17,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:06:19,348 INFO [train.py:1039] (2/4) Epoch 3, batch 4200, loss[loss=0.2563, simple_loss=0.3105, pruned_loss=0.101, over 23476.00 frames. ], tot_loss[loss=0.2789, simple_loss=0.3289, pruned_loss=0.1145, over 4728831.72 frames. ], batch size: 119, lr: 2.82e-02, grad_scale: 32.0 2023-09-28 18:06:19,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-28 18:06:19,744 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 18:06:22,014 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:06:24,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:06:24,938 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:06:24,944 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:06:26,715 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.941e+02 2.537e+02 2.926e+02 3.391e+02 4.648e+02, threshold=5.852e+02, percent-clipped=0.0 2023-09-28 18:06:26,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-28 18:06:28,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-28 18:06:30,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:06:33,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:06:35,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:06:36,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=98893.33333333333, ans=0.125 2023-09-28 18:06:37,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-28 18:06:38,365 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.22 vs. limit=6.0 2023-09-28 18:06:39,320 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:06:39,529 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=98893.33333333333, ans=0.125 2023-09-28 18:06:40,699 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:06:42,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-28 18:06:42,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:06:44,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:06:44,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:06:44,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 18:06:45,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:06:46,448 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.95 vs. limit=10.0 2023-09-28 18:06:48,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-28 18:06:48,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:06:49,073 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=98893.33333333333, ans=0.0 2023-09-28 18:06:56,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-28 18:06:57,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:06:59,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:07:02,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:07:04,436 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=98960.0, ans=0.125 2023-09-28 18:07:05,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:07:05,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-28 18:07:05,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:07:07,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:07:07,678 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=99026.66666666667, ans=0.0 2023-09-28 18:07:12,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-28 18:07:13,555 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:07:20,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-28 18:07:21,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-28 18:07:25,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:07:25,668 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=99093.33333333333, ans=0.1 2023-09-28 18:07:30,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 18:07:31,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:07:33,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-28 18:07:40,344 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-28 18:07:41,857 INFO [train.py:1039] (2/4) Epoch 3, batch 4250, loss[loss=0.2616, simple_loss=0.3313, pruned_loss=0.09598, over 24450.00 frames. ], tot_loss[loss=0.2773, simple_loss=0.327, pruned_loss=0.1138, over 4719592.53 frames. ], batch size: 69, lr: 2.81e-02, grad_scale: 16.0 2023-09-28 18:07:45,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:07:45,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-28 18:07:46,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:07:51,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-28 18:07:52,862 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-28 18:07:52,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:07:56,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:08:00,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:08:02,720 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=99226.66666666667, ans=0.0 2023-09-28 18:08:03,175 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=6.82 vs. limit=12.0 2023-09-28 18:08:05,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:08:05,512 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:08:07,320 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=99226.66666666667, ans=0.1 2023-09-28 18:08:08,420 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:08:08,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:08:08,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:08:10,161 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:08:12,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:08:15,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:08:16,185 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.99 vs. limit=12.0 2023-09-28 18:08:16,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:08:18,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-28 18:08:18,504 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=99293.33333333333, ans=0.125 2023-09-28 18:08:21,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-28 18:08:21,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:08:22,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:08:22,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:08:24,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:08:24,473 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:08:24,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:08:28,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-28 18:08:28,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:08:33,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:08:35,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:08:35,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-28 18:08:35,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:08:37,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-28 18:08:39,431 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.01 vs. limit=15.0 2023-09-28 18:08:40,003 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-28 18:08:41,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:08:43,772 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.38 vs. limit=15.0 2023-09-28 18:08:44,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:08:44,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:08:46,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-28 18:08:46,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 18:08:48,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-28 18:08:52,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:08:55,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:08:57,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:08:58,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:09:02,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:09:03,675 INFO [train.py:1039] (2/4) Epoch 3, batch 4300, loss[loss=0.2825, simple_loss=0.3447, pruned_loss=0.1101, over 24639.00 frames. ], tot_loss[loss=0.2791, simple_loss=0.3287, pruned_loss=0.1147, over 4721655.13 frames. ], batch size: 73, lr: 2.81e-02, grad_scale: 16.0 2023-09-28 18:09:03,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:09:05,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:09:05,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-28 18:09:06,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:09:12,365 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.893e+02 2.623e+02 3.036e+02 3.611e+02 5.200e+02, threshold=6.071e+02, percent-clipped=0.0 2023-09-28 18:09:12,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:09:12,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:09:17,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:09:19,459 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.78 vs. limit=10.0 2023-09-28 18:09:23,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:09:23,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-28 18:09:25,455 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:09:28,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:09:28,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:09:28,513 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-28 18:09:31,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 18:09:33,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 18:09:36,666 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-28 18:09:36,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 18:09:36,742 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-28 18:09:40,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 18:09:41,969 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:09:45,225 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 18:09:47,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:09:47,090 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:09:47,232 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:09:48,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:09:50,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:09:50,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-28 18:09:51,299 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=14.57 vs. limit=15.0 2023-09-28 18:09:51,922 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-28 18:09:53,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:09:55,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:09:55,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 18:09:55,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:09:56,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:09:56,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-28 18:09:56,768 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-28 18:09:56,867 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-28 18:09:57,081 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=99693.33333333333, ans=0.1 2023-09-28 18:09:58,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:09:58,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-28 18:10:00,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-28 18:10:03,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:10:03,541 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-28 18:10:05,035 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:10:05,398 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=99693.33333333333, ans=0.0 2023-09-28 18:10:07,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:10:07,335 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:10:10,307 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-28 18:10:10,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 18:10:10,413 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:10:10,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:10:10,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:10:10,621 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:10:12,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:10:14,119 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=99760.0, ans=0.125 2023-09-28 18:10:15,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:10:16,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:10:16,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:10:23,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-28 18:10:23,530 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-28 18:10:26,388 INFO [train.py:1039] (2/4) Epoch 3, batch 4350, loss[loss=0.2657, simple_loss=0.3328, pruned_loss=0.09928, over 24646.00 frames. ], tot_loss[loss=0.2815, simple_loss=0.3307, pruned_loss=0.1162, over 4709240.65 frames. ], batch size: 68, lr: 2.81e-02, grad_scale: 16.0 2023-09-28 18:10:26,847 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=99826.66666666667, ans=0.0 2023-09-28 18:10:29,473 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:10:31,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:10:34,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:10:34,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:10:40,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:10:42,891 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=99893.33333333333, ans=0.1 2023-09-28 18:10:44,139 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:10:47,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:10:47,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:10:48,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:10:53,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:10:54,290 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=99893.33333333333, ans=0.125 2023-09-28 18:10:54,406 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=99893.33333333333, ans=0.0 2023-09-28 18:10:55,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:11:01,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-28 18:11:02,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:11:02,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:11:08,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:11:10,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-28 18:11:15,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:11:18,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 18:11:20,793 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-28 18:11:20,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:11:22,475 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-28 18:11:24,070 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-28 18:11:24,179 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-28 18:11:24,187 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:11:24,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:11:25,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:11:27,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:11:29,257 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:11:29,350 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:11:32,376 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-28 18:11:32,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:11:32,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:11:33,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:11:33,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-28 18:11:35,382 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-28 18:11:35,388 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-28 18:11:35,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-28 18:11:38,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:11:38,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:11:39,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:11:39,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:11:41,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-28 18:11:43,906 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=100093.33333333333, ans=0.2 2023-09-28 18:11:45,058 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-28 18:11:45,081 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:11:49,528 INFO [train.py:1039] (2/4) Epoch 3, batch 4400, loss[loss=0.2883, simple_loss=0.3294, pruned_loss=0.1235, over 23431.00 frames. ], tot_loss[loss=0.2814, simple_loss=0.3309, pruned_loss=0.116, over 4705921.61 frames. ], batch size: 285, lr: 2.80e-02, grad_scale: 32.0 2023-09-28 18:11:49,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:11:50,304 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:11:51,939 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:11:56,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-28 18:11:56,223 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-28 18:11:56,286 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-28 18:11:56,328 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-28 18:11:57,550 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.868e+02 2.556e+02 3.170e+02 3.495e+02 5.491e+02, threshold=6.340e+02, percent-clipped=0.0 2023-09-28 18:11:57,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 18:11:57,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:12:01,343 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-28 18:12:01,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:12:04,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:12:04,465 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-28 18:12:07,587 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:12:07,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-28 18:12:07,660 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-28 18:12:10,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-28 18:12:12,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-28 18:12:12,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-28 18:12:12,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:12:12,923 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.03 vs. limit=15.0 2023-09-28 18:12:13,700 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:12:15,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:12:15,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:12:16,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-28 18:12:16,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-28 18:12:16,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:12:20,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:12:20,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:12:20,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:12:21,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:12:21,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-28 18:12:23,890 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-28 18:12:27,920 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=100293.33333333333, ans=0.2 2023-09-28 18:12:29,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:12:35,397 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=100293.33333333333, ans=0.125 2023-09-28 18:12:37,145 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:12:38,891 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=100360.0, ans=0.125 2023-09-28 18:12:40,128 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-28 18:12:44,728 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:12:46,481 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=100360.0, ans=0.2 2023-09-28 18:12:47,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:12:49,480 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:12:50,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-28 18:12:50,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:12:51,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:12:51,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:12:52,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-28 18:12:57,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-28 18:13:02,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-28 18:13:02,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-28 18:13:02,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:13:04,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-28 18:13:04,736 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:13:07,834 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:13:09,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-28 18:13:10,892 INFO [train.py:1039] (2/4) Epoch 3, batch 4450, loss[loss=0.2666, simple_loss=0.3316, pruned_loss=0.1008, over 24095.00 frames. ], tot_loss[loss=0.2819, simple_loss=0.3318, pruned_loss=0.116, over 4717442.14 frames. ], batch size: 80, lr: 2.80e-02, grad_scale: 32.0 2023-09-28 18:13:12,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:13:16,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:13:16,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:13:23,752 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:13:23,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:13:26,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:13:28,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:13:28,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:13:28,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:13:30,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-28 18:13:30,774 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:13:32,782 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:13:32,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:13:32,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:13:35,122 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 18:13:42,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:13:43,867 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:13:45,860 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:13:45,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:13:47,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:13:52,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 18:13:53,878 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-28 18:13:53,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-28 18:13:53,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:13:55,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:13:57,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-28 18:14:00,393 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=100693.33333333333, ans=0.0 2023-09-28 18:14:01,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-28 18:14:04,740 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:14:04,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-28 18:14:04,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:14:04,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:14:05,563 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:14:05,574 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:14:07,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:14:09,017 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=100693.33333333333, ans=0.125 2023-09-28 18:14:10,219 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-28 18:14:10,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-28 18:14:13,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 18:14:14,781 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:14:16,380 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=100760.0, ans=0.125 2023-09-28 18:14:17,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:14:19,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:14:19,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 18:14:21,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-28 18:14:26,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-28 18:14:27,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:14:31,866 INFO [train.py:1039] (2/4) Epoch 3, batch 4500, loss[loss=0.2474, simple_loss=0.3032, pruned_loss=0.09576, over 24463.00 frames. ], tot_loss[loss=0.2827, simple_loss=0.3323, pruned_loss=0.1166, over 4719963.30 frames. ], batch size: 58, lr: 2.79e-02, grad_scale: 32.0 2023-09-28 18:14:33,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:14:33,804 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=100826.66666666667, ans=0.04949747468305833 2023-09-28 18:14:34,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-28 18:14:34,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-28 18:14:35,329 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=100826.66666666667, ans=0.2 2023-09-28 18:14:36,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:14:40,305 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.868e+02 2.564e+02 2.888e+02 3.333e+02 4.958e+02, threshold=5.777e+02, percent-clipped=0.0 2023-09-28 18:14:40,631 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:14:42,476 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:14:42,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 18:14:44,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:14:44,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:14:45,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:14:54,658 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=100893.33333333333, ans=0.2 2023-09-28 18:14:54,786 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=100893.33333333333, ans=0.125 2023-09-28 18:14:59,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:14:59,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:15:01,757 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=100893.33333333333, ans=0.0 2023-09-28 18:15:03,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:15:04,477 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:15:05,971 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:15:12,015 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 18:15:17,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:15:21,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 18:15:26,034 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:15:26,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-28 18:15:26,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:15:27,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:15:29,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:15:29,246 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:15:29,434 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=101026.66666666667, ans=0.125 2023-09-28 18:15:32,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:15:32,932 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-28 18:15:32,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 18:15:34,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:15:34,710 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=101026.66666666667, ans=0.125 2023-09-28 18:15:37,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:15:38,950 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:15:40,690 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:15:43,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:15:43,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:15:45,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-28 18:15:48,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-28 18:15:48,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-28 18:15:50,373 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=101093.33333333333, ans=0.025 2023-09-28 18:15:53,571 INFO [train.py:1039] (2/4) Epoch 3, batch 4550, loss[loss=0.2757, simple_loss=0.3434, pruned_loss=0.1041, over 24458.00 frames. ], tot_loss[loss=0.2809, simple_loss=0.33, pruned_loss=0.1159, over 4704001.42 frames. ], batch size: 69, lr: 2.79e-02, grad_scale: 16.0 2023-09-28 18:15:53,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-28 18:15:55,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-28 18:15:56,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:15:58,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:15:59,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:16:03,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:16:08,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:16:11,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:16:12,843 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:16:12,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:16:12,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:16:15,858 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:16:15,938 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:16:19,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:16:22,655 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-28 18:16:22,753 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-28 18:16:25,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:16:26,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-28 18:16:30,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-28 18:16:30,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:16:33,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-28 18:16:36,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 18:16:39,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:16:39,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:16:39,198 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-28 18:16:42,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-28 18:16:44,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:16:47,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:16:47,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:16:48,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:16:50,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-28 18:16:52,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-28 18:16:52,096 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:16:53,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-28 18:16:57,092 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-28 18:16:57,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:16:57,316 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:16:59,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:16:59,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:16:59,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:17:01,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 18:17:01,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-28 18:17:03,551 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=16.55 vs. limit=15.0 2023-09-28 18:17:04,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:17:04,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 18:17:05,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-28 18:17:05,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:17:05,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-28 18:17:07,638 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=101426.66666666667, ans=0.0 2023-09-28 18:17:08,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 18:17:08,811 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:17:10,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:17:10,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:17:10,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-28 18:17:12,137 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:17:15,554 INFO [train.py:1039] (2/4) Epoch 3, batch 4600, loss[loss=0.2928, simple_loss=0.3507, pruned_loss=0.1175, over 23647.00 frames. ], tot_loss[loss=0.2795, simple_loss=0.329, pruned_loss=0.115, over 4692763.50 frames. ], batch size: 85, lr: 2.79e-02, grad_scale: 16.0 2023-09-28 18:17:15,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-28 18:17:17,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:17:20,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:17:23,191 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:17:23,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:17:23,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:17:24,710 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.921e+02 2.433e+02 2.837e+02 3.221e+02 4.908e+02, threshold=5.674e+02, percent-clipped=0.0 2023-09-28 18:17:24,912 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-28 18:17:27,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:17:32,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:17:34,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:17:37,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:17:42,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-28 18:17:43,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:17:44,007 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=101560.0, ans=0.0 2023-09-28 18:17:45,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:17:47,504 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=15.89 vs. limit=15.0 2023-09-28 18:17:49,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:17:49,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:17:49,185 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=101626.66666666667, ans=0.2 2023-09-28 18:17:55,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-28 18:17:55,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 18:17:55,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:18:02,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:18:03,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:18:05,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:18:09,332 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-28 18:18:10,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-28 18:18:15,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:18:16,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:18:18,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:18:18,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 18:18:18,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:18:19,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-28 18:18:20,033 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:18:20,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:18:21,587 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:18:23,782 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:18:25,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:18:25,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-28 18:18:25,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-28 18:18:26,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-28 18:18:26,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:18:28,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:18:29,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:18:29,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:18:38,787 INFO [train.py:1039] (2/4) Epoch 3, batch 4650, loss[loss=0.2771, simple_loss=0.3271, pruned_loss=0.1136, over 23327.00 frames. ], tot_loss[loss=0.2782, simple_loss=0.3277, pruned_loss=0.1144, over 4700766.45 frames. ], batch size: 93, lr: 2.78e-02, grad_scale: 16.0 2023-09-28 18:18:41,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:18:45,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:18:45,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:18:46,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:18:47,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:18:47,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:18:47,223 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=101826.66666666667, ans=0.0 2023-09-28 18:18:48,593 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:18:52,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-28 18:18:56,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:18:58,505 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-28 18:18:58,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:19:00,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-28 18:19:00,046 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:19:01,455 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-28 18:19:01,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-28 18:19:02,855 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:19:02,959 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:19:05,869 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 18:19:07,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:19:07,464 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-28 18:19:07,650 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=101893.33333333333, ans=0.125 2023-09-28 18:19:10,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:19:12,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-28 18:19:16,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:19:16,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:19:16,323 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=101960.0, ans=0.1 2023-09-28 18:19:17,540 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-28 18:19:19,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:19:22,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:19:23,440 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.82 vs. limit=6.0 2023-09-28 18:19:25,978 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:19:30,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:19:34,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:19:34,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:19:35,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:19:38,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-28 18:19:40,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-28 18:19:41,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 18:19:41,651 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-28 18:19:43,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:19:47,297 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=102093.33333333333, ans=0.0 2023-09-28 18:19:52,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-28 18:19:52,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:19:52,319 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-28 18:19:52,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:19:53,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:19:53,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:19:56,042 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:19:57,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:19:57,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:19:57,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:20:00,508 INFO [train.py:1039] (2/4) Epoch 3, batch 4700, loss[loss=0.3414, simple_loss=0.3652, pruned_loss=0.1589, over 22804.00 frames. ], tot_loss[loss=0.2787, simple_loss=0.3286, pruned_loss=0.1144, over 4704432.17 frames. ], batch size: 322, lr: 2.78e-02, grad_scale: 16.0 2023-09-28 18:20:03,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:20:05,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:20:05,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 18:20:05,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-28 18:20:06,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-28 18:20:06,885 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-28 18:20:10,641 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.892e+02 2.707e+02 3.161e+02 3.958e+02 7.246e+02, threshold=6.322e+02, percent-clipped=4.0 2023-09-28 18:20:14,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:20:14,198 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:20:14,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:20:15,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:20:17,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 18:20:24,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-28 18:20:24,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-28 18:20:25,901 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:20:29,622 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:20:29,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:20:32,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:20:39,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 18:20:41,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 18:20:42,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:20:48,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-28 18:20:50,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:20:53,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:20:57,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-28 18:20:59,377 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:21:04,371 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:21:04,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-28 18:21:06,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:21:06,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:21:09,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:21:10,539 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:21:10,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-28 18:21:12,045 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-28 18:21:13,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:21:15,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:21:15,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:21:15,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-28 18:21:18,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:21:21,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-28 18:21:23,336 INFO [train.py:1039] (2/4) Epoch 3, batch 4750, loss[loss=0.2896, simple_loss=0.3355, pruned_loss=0.1218, over 23778.00 frames. ], tot_loss[loss=0.2793, simple_loss=0.3294, pruned_loss=0.1146, over 4715365.35 frames. ], batch size: 179, lr: 2.78e-02, grad_scale: 16.0 2023-09-28 18:21:23,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:21:24,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:21:30,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:21:30,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:21:32,564 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=102493.33333333333, ans=0.0 2023-09-28 18:21:33,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-28 18:21:33,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:21:35,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-28 18:21:38,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:21:38,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:21:39,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:21:39,411 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=102560.0, ans=0.2 2023-09-28 18:21:45,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-28 18:21:50,454 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:21:52,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-28 18:21:52,375 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=102560.0, ans=0.125 2023-09-28 18:21:53,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:21:56,744 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=102626.66666666667, ans=0.1 2023-09-28 18:21:59,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:21:59,496 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:21:59,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:21:59,643 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-28 18:21:59,648 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-28 18:22:04,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-28 18:22:08,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:22:10,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:22:12,146 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=102693.33333333333, ans=0.125 2023-09-28 18:22:13,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 18:22:13,909 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-28 18:22:13,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:22:15,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-28 18:22:18,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:22:20,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-28 18:22:20,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-28 18:22:20,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:22:21,817 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:22:21,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:22:23,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 18:22:23,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-28 18:22:25,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-28 18:22:28,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:22:31,786 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:22:31,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-28 18:22:33,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:22:33,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:22:34,918 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:22:36,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:22:37,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 18:22:41,617 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:22:41,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-28 18:22:43,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-28 18:22:44,739 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-28 18:22:46,132 INFO [train.py:1039] (2/4) Epoch 3, batch 4800, loss[loss=0.2894, simple_loss=0.3278, pruned_loss=0.1255, over 23506.00 frames. ], tot_loss[loss=0.2816, simple_loss=0.3309, pruned_loss=0.1161, over 4712372.05 frames. ], batch size: 134, lr: 2.77e-02, grad_scale: 32.0 2023-09-28 18:22:48,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-28 18:22:48,475 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:22:49,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-28 18:22:54,595 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=102826.66666666667, ans=0.04949747468305833 2023-09-28 18:22:55,852 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.876e+02 2.499e+02 2.983e+02 3.709e+02 7.262e+02, threshold=5.966e+02, percent-clipped=1.0 2023-09-28 18:22:55,991 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:22:56,070 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:23:02,690 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:23:04,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:23:04,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:23:04,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-28 18:23:05,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:23:07,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:23:07,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:23:13,527 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:23:13,649 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=102893.33333333333, ans=0.125 2023-09-28 18:23:15,872 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=102893.33333333333, ans=0.0 2023-09-28 18:23:17,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:23:17,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:23:17,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:23:17,237 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 18:23:17,258 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:23:19,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:23:22,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:23:22,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:23:24,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:23:24,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-28 18:23:26,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 18:23:27,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:23:29,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-28 18:23:29,316 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-28 18:23:32,008 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:23:32,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:23:32,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:23:32,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:23:34,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-28 18:23:34,787 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.03 vs. limit=22.5 2023-09-28 18:23:35,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:23:35,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:23:37,560 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-28 18:23:40,275 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:23:41,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:23:44,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:23:48,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-28 18:23:49,793 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:23:49,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:23:49,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 18:23:52,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:23:55,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:23:57,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:23:57,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:23:57,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:23:58,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:24:00,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:24:03,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:24:03,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:24:03,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:24:05,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-28 18:24:08,479 INFO [train.py:1039] (2/4) Epoch 3, batch 4850, loss[loss=0.2738, simple_loss=0.3371, pruned_loss=0.1053, over 24353.00 frames. ], tot_loss[loss=0.2814, simple_loss=0.331, pruned_loss=0.1159, over 4721908.25 frames. ], batch size: 74, lr: 2.77e-02, grad_scale: 32.0 2023-09-28 18:24:08,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-28 18:24:08,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:24:08,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:24:10,364 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:24:10,376 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:24:12,312 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=103160.0, ans=0.125 2023-09-28 18:24:13,573 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:24:21,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-28 18:24:21,252 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:24:21,497 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=103160.0, ans=0.0 2023-09-28 18:24:26,586 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:24:28,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 18:24:28,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:24:31,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:24:33,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 18:24:35,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:24:35,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-28 18:24:39,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:24:42,010 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:24:42,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 18:24:42,147 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 18:24:42,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-28 18:24:42,510 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=103293.33333333333, ans=0.025 2023-09-28 18:24:46,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:24:46,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:24:49,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:24:49,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-28 18:24:51,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-28 18:24:51,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 18:24:52,159 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.60 vs. limit=6.0 2023-09-28 18:24:59,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:25:00,693 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-28 18:25:00,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:25:00,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:25:03,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-28 18:25:05,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-28 18:25:05,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:25:08,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-28 18:25:08,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:25:09,632 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:25:09,994 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=103360.0, ans=0.0 2023-09-28 18:25:11,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-28 18:25:21,211 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=103426.66666666667, ans=0.125 2023-09-28 18:25:22,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:25:28,531 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:25:28,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:25:29,921 INFO [train.py:1039] (2/4) Epoch 3, batch 4900, loss[loss=0.2773, simple_loss=0.3079, pruned_loss=0.1234, over 22628.00 frames. ], tot_loss[loss=0.2793, simple_loss=0.3296, pruned_loss=0.1145, over 4735873.86 frames. ], batch size: 322, lr: 2.77e-02, grad_scale: 32.0 2023-09-28 18:25:35,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-28 18:25:35,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:25:40,618 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.035e+02 2.465e+02 2.992e+02 4.302e+02 8.236e+02, threshold=5.984e+02, percent-clipped=6.0 2023-09-28 18:25:40,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:25:42,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:25:42,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:25:45,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-28 18:25:46,218 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=103560.0, ans=0.1 2023-09-28 18:25:50,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-28 18:25:54,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-28 18:25:55,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-28 18:25:55,676 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-28 18:25:57,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:25:57,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:25:57,114 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:25:57,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:25:57,230 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-28 18:26:00,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-28 18:26:01,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 18:26:03,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:26:04,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-28 18:26:05,308 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.97 vs. limit=15.0 2023-09-28 18:26:06,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:26:08,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:26:08,616 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:26:08,631 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-28 18:26:10,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:26:13,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:26:13,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-28 18:26:13,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-28 18:26:18,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-28 18:26:20,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-28 18:26:21,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:26:23,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:26:23,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:26:23,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 18:26:23,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:26:24,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-28 18:26:26,586 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 18:26:27,823 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:26:29,861 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.87 vs. limit=15.0 2023-09-28 18:26:30,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-28 18:26:30,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:26:34,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-28 18:26:34,877 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.82 vs. limit=22.5 2023-09-28 18:26:35,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:26:35,732 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-28 18:26:35,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-28 18:26:44,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:26:45,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:26:47,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-28 18:26:47,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 18:26:47,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:26:51,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:26:53,061 INFO [train.py:1039] (2/4) Epoch 3, batch 4950, loss[loss=0.3205, simple_loss=0.3538, pruned_loss=0.1437, over 23804.00 frames. ], tot_loss[loss=0.2786, simple_loss=0.329, pruned_loss=0.1141, over 4738896.11 frames. ], batch size: 164, lr: 2.76e-02, grad_scale: 32.0 2023-09-28 18:26:54,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:26:54,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:26:56,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:26:56,188 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-28 18:26:57,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 18:27:00,870 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:27:00,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 18:27:02,647 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=103826.66666666667, ans=0.1 2023-09-28 18:27:04,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-28 18:27:04,083 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-28 18:27:05,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-28 18:27:05,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-28 18:27:05,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:27:06,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:27:06,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-28 18:27:07,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:27:08,641 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:27:10,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:27:10,362 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=103893.33333333333, ans=0.0 2023-09-28 18:27:11,599 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:27:13,075 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=103893.33333333333, ans=0.125 2023-09-28 18:27:14,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:27:14,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:27:14,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:27:19,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 18:27:24,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:27:25,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:27:27,151 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:27:27,220 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:27:28,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:27:30,272 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-28 18:27:30,562 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=103960.0, ans=0.125 2023-09-28 18:27:31,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-28 18:27:34,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:27:35,093 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer_ff2.min_abs, batch_count=103960.0, ans=0.1 2023-09-28 18:27:36,480 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=103960.0, ans=0.0 2023-09-28 18:27:36,633 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=103960.0, ans=0.125 2023-09-28 18:27:36,669 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=103960.0, ans=0.125 2023-09-28 18:27:37,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:27:37,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:27:39,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:27:39,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:27:40,826 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-28 18:27:42,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:27:44,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-28 18:27:45,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:27:50,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:27:50,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:27:52,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-28 18:27:52,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:27:53,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 18:27:57,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:27:58,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:27:59,018 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:28:01,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:28:01,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:28:02,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:28:04,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:28:04,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 18:28:04,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:28:05,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-28 18:28:06,846 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=15.19 vs. limit=22.5 2023-09-28 18:28:09,141 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:28:09,403 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=104093.33333333333, ans=0.125 2023-09-28 18:28:13,303 INFO [train.py:1039] (2/4) Epoch 3, batch 5000, loss[loss=0.2819, simple_loss=0.3216, pruned_loss=0.1211, over 23741.00 frames. ], tot_loss[loss=0.2775, simple_loss=0.3281, pruned_loss=0.1134, over 4737885.01 frames. ], batch size: 232, lr: 2.76e-02, grad_scale: 32.0 2023-09-28 18:28:15,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-28 18:28:15,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-28 18:28:22,209 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:28:23,505 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.857e+02 2.486e+02 2.809e+02 3.764e+02 5.780e+02, threshold=5.617e+02, percent-clipped=0.0 2023-09-28 18:28:23,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-28 18:28:25,061 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-28 18:28:25,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-28 18:28:26,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:28:29,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-28 18:28:29,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:28:29,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 18:28:31,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-28 18:28:31,425 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:28:33,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:28:33,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-28 18:28:33,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:28:34,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:28:35,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-28 18:28:35,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-28 18:28:36,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:28:38,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-28 18:28:38,148 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 18:28:38,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:28:39,610 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 18:28:39,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-28 18:28:39,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-28 18:28:41,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-28 18:28:41,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:28:41,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:28:44,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-28 18:28:44,376 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-28 18:28:45,908 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:28:46,050 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:28:48,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-28 18:28:50,537 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-28 18:28:50,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:28:52,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:28:57,389 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-28 18:28:59,296 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=104293.33333333333, ans=0.125 2023-09-28 18:29:00,504 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:29:02,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:29:02,054 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:29:04,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-28 18:29:04,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:29:04,580 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:29:06,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:29:07,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-28 18:29:09,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:29:11,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:29:13,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:29:19,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-28 18:29:23,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:29:33,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:29:34,922 INFO [train.py:1039] (2/4) Epoch 3, batch 5050, loss[loss=0.2582, simple_loss=0.3205, pruned_loss=0.09796, over 24459.00 frames. ], tot_loss[loss=0.2801, simple_loss=0.3297, pruned_loss=0.1152, over 4706152.68 frames. ], batch size: 66, lr: 2.75e-02, grad_scale: 32.0 2023-09-28 18:29:35,698 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:29:35,710 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:29:35,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:29:35,957 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=104493.33333333333, ans=0.125 2023-09-28 18:29:36,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 18:29:36,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-28 18:29:37,007 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:29:42,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:29:42,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-28 18:29:42,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:29:42,607 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=104493.33333333333, ans=0.0 2023-09-28 18:29:45,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:29:47,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:29:48,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-28 18:29:48,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:29:50,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:29:50,628 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=104560.0, ans=0.125 2023-09-28 18:29:53,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 18:29:53,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:29:54,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-28 18:30:04,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-28 18:30:04,854 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-28 18:30:06,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:30:06,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-28 18:30:06,478 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:30:08,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:30:09,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:30:10,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:30:10,019 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-28 18:30:11,465 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-28 18:30:13,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:30:16,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:30:19,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:30:20,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-28 18:30:21,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:30:24,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-28 18:30:24,936 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=104693.33333333333, ans=0.1 2023-09-28 18:30:27,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:30:27,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:30:27,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:30:27,976 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 18:30:29,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:30:32,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:30:33,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:30:35,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:30:35,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:30:37,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:30:37,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-28 18:30:38,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:30:40,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:30:42,213 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=104760.0, ans=0.0 2023-09-28 18:30:42,258 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=104760.0, ans=0.125 2023-09-28 18:30:44,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:30:44,295 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-28 18:30:44,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-28 18:30:45,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:30:45,804 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:30:45,843 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-28 18:30:48,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:30:48,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-28 18:30:48,945 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:30:52,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:30:52,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:30:54,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-28 18:30:55,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-28 18:30:57,714 INFO [train.py:1039] (2/4) Epoch 3, batch 5100, loss[loss=0.3012, simple_loss=0.3385, pruned_loss=0.132, over 23876.00 frames. ], tot_loss[loss=0.2804, simple_loss=0.3304, pruned_loss=0.1152, over 4712494.14 frames. ], batch size: 195, lr: 2.75e-02, grad_scale: 32.0 2023-09-28 18:30:57,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:30:57,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:30:58,092 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=104826.66666666667, ans=0.125 2023-09-28 18:30:59,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:31:02,338 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-28 18:31:03,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:31:06,670 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.935e+02 2.697e+02 3.242e+02 4.082e+02 8.790e+02, threshold=6.484e+02, percent-clipped=7.0 2023-09-28 18:31:06,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-28 18:31:08,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-28 18:31:09,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:31:12,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:31:15,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:31:16,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-28 18:31:16,660 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-28 18:31:20,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:31:21,998 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:31:25,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:31:27,385 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=104893.33333333333, ans=0.0 2023-09-28 18:31:28,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-28 18:31:28,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:31:31,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:31:31,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-28 18:31:32,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:31:34,110 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:31:34,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-28 18:31:37,096 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-28 18:31:37,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:31:37,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-28 18:31:37,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-28 18:31:41,029 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=8.72 vs. limit=15.0 2023-09-28 18:31:41,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:31:52,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:31:53,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-28 18:31:53,834 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-28 18:31:55,166 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-28 18:31:56,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-28 18:31:56,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:32:00,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-28 18:32:02,322 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=105093.33333333333, ans=0.125 2023-09-28 18:32:06,880 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-28 18:32:09,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 18:32:11,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:32:12,892 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-28 18:32:14,557 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-28 18:32:14,628 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-28 18:32:18,933 INFO [train.py:1039] (2/4) Epoch 3, batch 5150, loss[loss=0.2614, simple_loss=0.3066, pruned_loss=0.1082, over 23856.00 frames. ], tot_loss[loss=0.2807, simple_loss=0.3312, pruned_loss=0.1151, over 4715839.18 frames. ], batch size: 150, lr: 2.75e-02, grad_scale: 32.0 2023-09-28 18:32:22,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:32:22,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:32:22,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:32:24,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:32:24,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 18:32:24,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:32:25,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-28 18:32:25,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-28 18:32:27,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-28 18:32:27,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:32:28,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-28 18:32:29,293 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:32:29,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 18:32:30,949 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:32:31,190 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=105160.0, ans=0.1 2023-09-28 18:32:32,445 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:32:37,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 18:32:37,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-28 18:32:40,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:32:40,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:32:41,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=105226.66666666667, ans=10.0 2023-09-28 18:32:42,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-28 18:32:42,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:32:42,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:32:43,143 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=105226.66666666667, ans=0.2 2023-09-28 18:32:44,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:32:44,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:32:44,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-28 18:32:46,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:32:46,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:32:49,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 18:32:51,854 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-28 18:32:53,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:32:59,789 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=16.30 vs. limit=22.5 2023-09-28 18:33:00,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-28 18:33:04,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-28 18:33:07,489 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:33:14,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:33:14,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:33:17,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:33:19,594 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:33:21,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-28 18:33:22,787 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=105360.0, ans=0.125 2023-09-28 18:33:25,578 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:33:25,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:33:27,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:33:30,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:33:30,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:33:32,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-28 18:33:36,399 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=105426.66666666667, ans=0.125 2023-09-28 18:33:37,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:33:39,367 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 18:33:42,268 INFO [train.py:1039] (2/4) Epoch 3, batch 5200, loss[loss=0.3889, simple_loss=0.4009, pruned_loss=0.1885, over 19796.00 frames. ], tot_loss[loss=0.2822, simple_loss=0.3318, pruned_loss=0.1163, over 4713873.01 frames. ], batch size: 389, lr: 2.74e-02, grad_scale: 32.0 2023-09-28 18:33:42,333 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:33:42,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:33:42,740 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=105493.33333333333, ans=0.0 2023-09-28 18:33:43,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-28 18:33:43,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-28 18:33:43,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:33:43,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:33:45,833 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=105493.33333333333, ans=0.125 2023-09-28 18:33:47,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:33:49,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:33:52,547 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.860e+02 2.485e+02 2.931e+02 3.472e+02 7.408e+02, threshold=5.863e+02, percent-clipped=1.0 2023-09-28 18:33:52,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:33:55,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-28 18:33:57,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:33:57,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:34:00,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:34:00,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:34:01,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:34:02,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-28 18:34:07,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 18:34:08,846 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.09 vs. limit=15.0 2023-09-28 18:34:09,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:34:12,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-28 18:34:13,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-28 18:34:13,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:34:15,447 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-28 18:34:15,522 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-28 18:34:19,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-28 18:34:21,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:34:21,692 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-28 18:34:21,702 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:34:21,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:34:23,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:34:23,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-28 18:34:24,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:34:25,301 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 18:34:27,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:34:30,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-28 18:34:30,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-28 18:34:30,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-28 18:34:35,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-28 18:34:36,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 18:34:41,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:34:43,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:34:43,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-28 18:34:43,957 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:34:43,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-28 18:34:43,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:34:45,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 18:34:48,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:34:50,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:34:54,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:34:55,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:34:55,828 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:35:02,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:35:02,631 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=105760.0, ans=0.125 2023-09-28 18:35:03,923 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-28 18:35:04,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:35:05,432 INFO [train.py:1039] (2/4) Epoch 3, batch 5250, loss[loss=0.2838, simple_loss=0.32, pruned_loss=0.1238, over 23653.00 frames. ], tot_loss[loss=0.2808, simple_loss=0.3302, pruned_loss=0.1157, over 4717636.00 frames. ], batch size: 149, lr: 2.74e-02, grad_scale: 32.0 2023-09-28 18:35:05,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:35:05,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:35:07,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-28 18:35:08,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:35:12,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:35:14,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:35:14,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:35:15,743 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:35:21,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:35:24,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:35:27,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:35:30,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 18:35:32,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-28 18:35:32,253 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:35:32,406 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:35:41,899 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.44 vs. limit=15.0 2023-09-28 18:35:49,927 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=105960.0, ans=0.0 2023-09-28 18:35:57,386 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.59 vs. limit=12.0 2023-09-28 18:35:59,719 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=106026.66666666667, ans=0.125 2023-09-28 18:36:20,359 INFO [train.py:1039] (2/4) Epoch 3, batch 5300, loss[loss=0.2342, simple_loss=0.2967, pruned_loss=0.08587, over 24637.00 frames. ], tot_loss[loss=0.2789, simple_loss=0.3285, pruned_loss=0.1146, over 4729484.09 frames. ], batch size: 60, lr: 2.74e-02, grad_scale: 32.0 2023-09-28 18:36:28,664 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.019e+02 2.515e+02 2.948e+02 3.617e+02 7.012e+02, threshold=5.895e+02, percent-clipped=2.0 2023-09-28 18:36:35,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:36:35,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-28 18:36:35,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 18:36:35,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:36:36,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:36:36,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:36:36,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:36:36,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:36:36,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:36:36,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:36:36,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 18:36:37,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:36:37,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 18:36:37,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-28 18:36:37,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 18:36:37,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-28 18:36:37,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 18:36:37,666 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 18:36:37,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:36:38,752 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:36:38,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:36:38,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:36:39,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:36:39,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:36:39,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:36:39,663 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:36:39,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:36:39,846 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:36:39,853 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:36:39,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:36:39,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:36:40,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 18:36:40,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:36:41,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:36:41,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 18:36:41,433 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 18:36:41,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:36:41,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:36:41,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 18:36:41,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-28 18:36:42,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 18:36:43,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:36:43,339 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:36:43,495 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 18:36:43,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 18:36:43,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 18:36:43,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:36:43,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 18:36:43,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 18:36:44,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-28 18:36:44,382 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-28 18:36:54,511 INFO [train.py:1039] (2/4) Epoch 4, batch 0, loss[loss=0.2705, simple_loss=0.3384, pruned_loss=0.1013, over 24290.00 frames. ], tot_loss[loss=0.2705, simple_loss=0.3384, pruned_loss=0.1013, over 24290.00 frames. ], batch size: 74, lr: 2.56e-02, grad_scale: 32.0 2023-09-28 18:36:54,512 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-28 18:37:09,542 INFO [train.py:1071] (2/4) Epoch 4, validation: loss=0.3856, simple_loss=0.3373, pruned_loss=0.217, over 1125622.00 frames. 2023-09-28 18:37:09,543 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21046MB 2023-09-28 18:37:12,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-28 18:37:14,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:37:15,797 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:37:21,128 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:37:21,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:37:22,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:37:24,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-28 18:37:25,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-28 18:37:27,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:37:27,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:37:33,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:37:33,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:37:34,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 18:37:34,604 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:37:36,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-28 18:37:39,215 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:37:46,112 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:37:46,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:37:48,418 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-28 18:37:51,715 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=106373.33333333333, ans=0.125 2023-09-28 18:37:51,837 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=106373.33333333333, ans=0.2 2023-09-28 18:37:52,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:37:52,978 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:37:54,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:37:57,100 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=106440.0, ans=0.125 2023-09-28 18:37:58,426 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:38:00,417 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=106440.0, ans=0.125 2023-09-28 18:38:01,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:38:08,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-28 18:38:10,857 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=106440.0, ans=0.0 2023-09-28 18:38:13,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-28 18:38:13,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:38:13,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:38:15,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:38:15,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:38:16,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-28 18:38:20,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:38:22,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:38:24,110 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 18:38:25,419 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:38:28,687 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-28 18:38:28,926 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=106573.33333333333, ans=0.125 2023-09-28 18:38:30,524 INFO [train.py:1039] (2/4) Epoch 4, batch 50, loss[loss=0.2894, simple_loss=0.3514, pruned_loss=0.1137, over 24537.00 frames. ], tot_loss[loss=0.2778, simple_loss=0.3303, pruned_loss=0.1126, over 1068160.18 frames. ], batch size: 71, lr: 2.56e-02, grad_scale: 32.0 2023-09-28 18:38:32,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:38:33,633 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.65 vs. limit=15.0 2023-09-28 18:38:34,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:38:37,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:38:37,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-28 18:38:39,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 18:38:39,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:38:40,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:38:42,054 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:38:44,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:38:48,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-28 18:38:48,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:38:57,049 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=106640.0, ans=0.0 2023-09-28 18:38:58,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-28 18:39:00,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-28 18:39:02,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-28 18:39:05,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:39:06,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:39:07,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:39:09,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:39:09,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-28 18:39:11,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 18:39:11,445 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:39:16,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:39:19,322 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:39:19,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 18:39:19,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-28 18:39:20,963 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:39:22,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 18:39:22,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-28 18:39:23,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:39:24,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-28 18:39:25,977 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=106773.33333333333, ans=0.125 2023-09-28 18:39:29,704 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.21 vs. limit=6.0 2023-09-28 18:39:30,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:39:30,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:39:33,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:39:35,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:39:35,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-28 18:39:37,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-28 18:39:37,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-28 18:39:37,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:39:39,502 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-28 18:39:41,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:39:41,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:39:43,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-28 18:39:43,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-28 18:39:44,749 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-28 18:39:46,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:39:46,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:39:47,464 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 2.007e+02 2.562e+02 2.907e+02 3.580e+02 6.238e+02, threshold=5.814e+02, percent-clipped=1.0 2023-09-28 18:39:47,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-28 18:39:47,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-28 18:39:49,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:39:49,312 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:39:50,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-28 18:39:50,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:39:55,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:39:56,646 INFO [train.py:1039] (2/4) Epoch 4, batch 100, loss[loss=0.2508, simple_loss=0.3151, pruned_loss=0.09326, over 24349.00 frames. ], tot_loss[loss=0.2774, simple_loss=0.33, pruned_loss=0.1125, over 1881062.86 frames. ], batch size: 61, lr: 2.55e-02, grad_scale: 32.0 2023-09-28 18:39:58,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:40:01,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:40:05,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-28 18:40:05,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:40:08,856 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:40:08,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:40:08,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:40:08,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:40:10,328 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:40:10,619 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=106906.66666666667, ans=0.125 2023-09-28 18:40:11,248 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.18 vs. limit=10.0 2023-09-28 18:40:11,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-28 18:40:15,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-28 18:40:15,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:40:15,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:40:17,071 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:40:21,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-28 18:40:21,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:40:22,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:40:22,816 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-28 18:40:24,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 18:40:29,217 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-28 18:40:29,241 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-28 18:40:30,892 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:40:30,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:40:35,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-28 18:40:35,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:40:37,673 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.67 vs. limit=22.5 2023-09-28 18:40:39,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:40:43,616 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.65 vs. limit=12.0 2023-09-28 18:40:44,854 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=107040.0, ans=0.1 2023-09-28 18:40:47,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:40:47,623 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-28 18:40:49,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-28 18:40:54,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-28 18:40:55,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:41:00,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:41:02,375 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:41:05,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:41:06,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:41:09,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:41:10,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:41:10,332 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=107173.33333333333, ans=0.05 2023-09-28 18:41:11,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:41:11,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:41:11,653 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:41:13,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-28 18:41:13,097 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-28 18:41:13,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:41:14,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:41:14,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:41:14,728 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:41:14,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 18:41:16,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 18:41:16,799 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-28 18:41:16,808 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:41:16,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:41:17,306 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=107173.33333333333, ans=0.1 2023-09-28 18:41:18,384 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:41:18,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:41:18,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:41:19,793 INFO [train.py:1039] (2/4) Epoch 4, batch 150, loss[loss=0.2691, simple_loss=0.3143, pruned_loss=0.112, over 23814.00 frames. ], tot_loss[loss=0.2777, simple_loss=0.3294, pruned_loss=0.1131, over 2500351.34 frames. ], batch size: 164, lr: 2.55e-02, grad_scale: 32.0 2023-09-28 18:41:21,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:41:22,299 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.39 vs. limit=15.0 2023-09-28 18:41:24,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:41:24,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:41:24,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:41:28,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:41:29,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:41:31,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-28 18:41:33,314 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:41:39,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-28 18:41:39,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-28 18:41:39,217 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-28 18:41:42,291 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:41:42,298 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 18:41:43,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:41:44,167 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=107306.66666666667, ans=0.125 2023-09-28 18:41:45,325 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:41:45,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:41:45,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:41:48,235 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:41:49,737 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-28 18:41:51,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:41:55,661 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=107373.33333333333, ans=0.2 2023-09-28 18:41:58,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:42:01,758 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=107373.33333333333, ans=0.1 2023-09-28 18:42:03,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:42:03,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-28 18:42:07,042 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=107440.0, ans=0.125 2023-09-28 18:42:09,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:42:09,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:42:09,684 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:42:11,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:42:13,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:42:16,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-28 18:42:17,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:42:18,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-28 18:42:22,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:42:24,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:42:24,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:42:24,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:42:27,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:42:29,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 18:42:31,205 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=6.05 vs. limit=15.0 2023-09-28 18:42:31,543 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.741e+02 2.491e+02 2.943e+02 3.333e+02 6.261e+02, threshold=5.886e+02, percent-clipped=1.0 2023-09-28 18:42:33,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:42:34,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:42:34,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:42:36,524 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-28 18:42:36,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-28 18:42:36,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:42:36,626 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-28 18:42:40,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:42:41,583 INFO [train.py:1039] (2/4) Epoch 4, batch 200, loss[loss=0.2341, simple_loss=0.2988, pruned_loss=0.08473, over 24579.00 frames. ], tot_loss[loss=0.2761, simple_loss=0.3281, pruned_loss=0.112, over 3005953.05 frames. ], batch size: 60, lr: 2.55e-02, grad_scale: 32.0 2023-09-28 18:42:44,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:42:44,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:42:48,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-28 18:42:49,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:42:50,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:42:53,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-28 18:42:53,207 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=107573.33333333333, ans=0.0 2023-09-28 18:42:56,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-28 18:42:56,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:42:57,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:42:58,057 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=107640.0, ans=0.0 2023-09-28 18:42:59,618 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=107640.0, ans=0.05 2023-09-28 18:43:00,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:43:02,901 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:43:02,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:43:19,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:43:19,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:43:21,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:43:22,161 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=107706.66666666667, ans=0.0 2023-09-28 18:43:23,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:43:23,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 18:43:23,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 18:43:24,280 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.27 vs. limit=15.0 2023-09-28 18:43:26,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:43:26,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:43:26,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:43:28,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:43:28,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-28 18:43:29,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 18:43:29,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:43:33,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:43:43,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:43:51,589 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:43:53,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:43:57,905 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:44:01,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-28 18:44:01,475 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:44:01,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-28 18:44:01,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:44:01,728 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=107840.0, ans=0.125 2023-09-28 18:44:02,962 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:44:04,260 INFO [train.py:1039] (2/4) Epoch 4, batch 250, loss[loss=0.2703, simple_loss=0.2973, pruned_loss=0.1217, over 22703.00 frames. ], tot_loss[loss=0.2753, simple_loss=0.3267, pruned_loss=0.1119, over 3388428.28 frames. ], batch size: 322, lr: 2.54e-02, grad_scale: 16.0 2023-09-28 18:44:04,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-28 18:44:04,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:44:04,546 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-28 18:44:07,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:44:11,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:44:13,072 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:44:13,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:44:14,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:44:14,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:44:16,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:44:20,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:44:34,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:44:36,668 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:44:38,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:44:39,880 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=108040.0, ans=0.125 2023-09-28 18:44:41,202 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=108040.0, ans=0.125 2023-09-28 18:44:45,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-28 18:44:45,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-28 18:44:47,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:44:47,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:44:47,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 18:44:47,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:44:47,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:44:51,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:44:52,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-28 18:44:52,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:44:54,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:44:56,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-28 18:44:56,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:44:57,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:44:57,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:44:59,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:45:00,462 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.57 vs. limit=22.5 2023-09-28 18:45:00,967 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:45:02,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:45:03,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:45:05,607 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 18:45:07,116 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 18:45:08,447 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:45:12,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:45:13,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:45:17,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:45:19,088 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.821e+02 2.378e+02 2.704e+02 3.177e+02 4.711e+02, threshold=5.407e+02, percent-clipped=0.0 2023-09-28 18:45:19,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:45:23,786 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-28 18:45:25,217 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:45:25,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:45:25,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-28 18:45:25,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-28 18:45:26,852 INFO [train.py:1039] (2/4) Epoch 4, batch 300, loss[loss=0.2838, simple_loss=0.3383, pruned_loss=0.1147, over 23754.00 frames. ], tot_loss[loss=0.2742, simple_loss=0.3256, pruned_loss=0.1114, over 3676325.65 frames. ], batch size: 85, lr: 2.54e-02, grad_scale: 16.0 2023-09-28 18:45:27,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:45:27,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-28 18:45:32,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:45:32,509 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:45:35,895 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=108240.0, ans=0.125 2023-09-28 18:45:37,950 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=26.06 vs. limit=22.5 2023-09-28 18:45:38,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:45:38,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-28 18:45:40,345 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:45:41,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 18:45:41,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-28 18:45:41,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:45:42,888 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=15.20 vs. limit=22.5 2023-09-28 18:45:47,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:45:52,445 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:45:52,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-28 18:45:54,498 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.06 vs. limit=15.0 2023-09-28 18:45:57,181 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-28 18:45:58,597 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:46:00,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:46:01,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:46:01,945 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-28 18:46:01,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:46:04,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:46:07,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:46:07,168 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:46:11,988 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-28 18:46:11,995 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-28 18:46:13,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:46:16,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:46:18,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-28 18:46:18,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:46:23,819 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:46:27,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:46:27,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-28 18:46:30,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:46:30,381 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 18:46:33,422 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:46:35,005 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-28 18:46:35,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-28 18:46:35,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 18:46:38,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:46:39,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-28 18:46:40,207 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.63 vs. limit=15.0 2023-09-28 18:46:41,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:46:41,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:46:43,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:46:44,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:46:44,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:46:49,321 INFO [train.py:1039] (2/4) Epoch 4, batch 350, loss[loss=0.308, simple_loss=0.3394, pruned_loss=0.1383, over 23789.00 frames. ], tot_loss[loss=0.2725, simple_loss=0.3232, pruned_loss=0.1109, over 3882057.51 frames. ], batch size: 164, lr: 2.54e-02, grad_scale: 16.0 2023-09-28 18:46:49,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:46:49,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 18:46:53,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:47:01,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:47:04,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:47:04,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:47:07,842 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-28 18:47:09,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:47:10,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-28 18:47:12,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:47:12,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-28 18:47:14,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:47:17,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-28 18:47:18,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:47:21,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:47:22,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:47:24,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:47:24,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:47:24,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:47:24,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:47:25,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:47:25,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:47:25,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:47:27,941 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=108706.66666666667, ans=0.0 2023-09-28 18:47:35,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:47:36,615 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-28 18:47:36,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:47:36,761 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:47:41,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-28 18:47:41,615 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:47:46,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:47:47,570 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:47:47,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:47:49,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-28 18:47:50,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:47:51,324 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.38 vs. limit=15.0 2023-09-28 18:47:52,376 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-28 18:47:53,809 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-28 18:47:53,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:47:57,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:47:57,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-28 18:48:00,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:48:03,921 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.921e+02 2.316e+02 2.681e+02 3.192e+02 4.934e+02, threshold=5.363e+02, percent-clipped=0.0 2023-09-28 18:48:04,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:48:06,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:48:07,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:48:07,589 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:48:09,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:48:12,984 INFO [train.py:1039] (2/4) Epoch 4, batch 400, loss[loss=0.2626, simple_loss=0.3155, pruned_loss=0.1048, over 23556.00 frames. ], tot_loss[loss=0.2714, simple_loss=0.3225, pruned_loss=0.1102, over 4064609.18 frames. ], batch size: 149, lr: 2.53e-02, grad_scale: 32.0 2023-09-28 18:48:13,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:48:16,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-28 18:48:16,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-28 18:48:17,793 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:48:17,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:48:20,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:48:21,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:48:22,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:48:25,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:48:28,745 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-28 18:48:30,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-28 18:48:30,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:48:32,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-28 18:48:32,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:48:37,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:48:37,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:48:39,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-28 18:48:39,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:48:40,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:48:40,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:48:40,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:48:43,237 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-28 18:48:44,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-28 18:48:50,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:48:51,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:48:51,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-28 18:48:53,206 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-28 18:48:56,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:48:59,214 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:49:05,465 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-28 18:49:09,447 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-28 18:49:10,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-28 18:49:13,032 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.42 vs. limit=15.0 2023-09-28 18:49:14,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:49:14,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:49:15,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-28 18:49:20,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:49:23,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 18:49:23,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:49:26,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:49:26,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-28 18:49:29,502 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-28 18:49:30,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-28 18:49:33,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:49:34,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:49:34,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-28 18:49:35,756 INFO [train.py:1039] (2/4) Epoch 4, batch 450, loss[loss=0.2895, simple_loss=0.3335, pruned_loss=0.1227, over 23595.00 frames. ], tot_loss[loss=0.2709, simple_loss=0.3227, pruned_loss=0.1095, over 4220695.54 frames. ], batch size: 149, lr: 2.53e-02, grad_scale: 32.0 2023-09-28 18:49:36,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 18:49:37,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:49:37,623 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-28 18:49:39,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-28 18:49:40,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:49:40,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:49:40,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-28 18:49:43,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-28 18:49:43,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:49:44,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 18:49:46,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:49:55,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:49:56,501 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.65 vs. limit=15.0 2023-09-28 18:49:57,178 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:49:58,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-28 18:49:58,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-28 18:49:59,532 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.16 vs. limit=15.0 2023-09-28 18:50:00,772 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=2.484e-02 2023-09-28 18:50:03,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-28 18:50:04,155 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.85 vs. limit=15.0 2023-09-28 18:50:06,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:50:08,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:50:11,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:50:12,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:50:16,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-28 18:50:18,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-28 18:50:20,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-28 18:50:22,001 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:50:23,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:50:23,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 18:50:23,969 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=109373.33333333333, ans=0.0 2023-09-28 18:50:25,626 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-28 18:50:25,640 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-28 18:50:25,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:50:27,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:50:27,301 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-28 18:50:31,083 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-28 18:50:31,124 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-28 18:50:31,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-28 18:50:32,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-28 18:50:35,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:50:37,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-28 18:50:37,283 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 18:50:38,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-28 18:50:42,229 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=109506.66666666667, ans=0.125 2023-09-28 18:50:43,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:50:45,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-28 18:50:45,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-28 18:50:46,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:50:51,107 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.901e+02 2.294e+02 2.615e+02 3.130e+02 6.732e+02, threshold=5.230e+02, percent-clipped=1.0 2023-09-28 18:50:53,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:50:55,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:50:56,493 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.04 vs. limit=6.0 2023-09-28 18:50:56,524 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=17.38 vs. limit=22.5 2023-09-28 18:50:58,530 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:50:58,565 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-28 18:50:59,901 INFO [train.py:1039] (2/4) Epoch 4, batch 500, loss[loss=0.2546, simple_loss=0.3216, pruned_loss=0.09381, over 24460.00 frames. ], tot_loss[loss=0.272, simple_loss=0.3242, pruned_loss=0.1099, over 4337093.41 frames. ], batch size: 69, lr: 2.53e-02, grad_scale: 32.0 2023-09-28 18:51:02,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:51:04,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:51:04,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:51:04,291 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-28 18:51:07,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-28 18:51:07,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:51:07,635 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=109573.33333333333, ans=0.04949747468305833 2023-09-28 18:51:09,004 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=109573.33333333333, ans=0.0 2023-09-28 18:51:10,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 18:51:14,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 18:51:17,868 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:51:19,544 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:51:19,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:51:19,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:51:31,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:51:31,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-28 18:51:31,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-28 18:51:33,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:51:33,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-28 18:51:33,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:51:36,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:51:36,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:51:38,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:51:38,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:51:40,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-28 18:51:43,339 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-28 18:51:45,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:51:45,270 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=109706.66666666667, ans=0.0 2023-09-28 18:51:47,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:51:48,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:51:48,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:51:49,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-28 18:51:51,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-28 18:51:55,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:51:55,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:52:01,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:52:03,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:52:10,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:52:12,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-28 18:52:14,501 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:52:14,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:52:17,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-28 18:52:17,949 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=109840.0, ans=0.07 2023-09-28 18:52:19,165 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-28 18:52:20,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:52:22,110 INFO [train.py:1039] (2/4) Epoch 4, batch 550, loss[loss=0.2584, simple_loss=0.3237, pruned_loss=0.09657, over 24471.00 frames. ], tot_loss[loss=0.2731, simple_loss=0.3259, pruned_loss=0.1101, over 4429115.43 frames. ], batch size: 66, lr: 2.52e-02, grad_scale: 32.0 2023-09-28 18:52:25,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-28 18:52:26,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-28 18:52:26,759 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:52:26,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-28 18:52:28,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:52:28,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:52:28,351 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:52:28,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:52:29,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:52:29,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:52:32,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:52:34,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-28 18:52:34,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:52:38,684 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:52:38,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:52:39,026 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=109973.33333333333, ans=0.125 2023-09-28 18:52:40,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:52:41,240 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=11.39 vs. limit=15.0 2023-09-28 18:52:43,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:52:46,167 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=109973.33333333333, ans=0.125 2023-09-28 18:52:47,457 WARNING [train.py:1197] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-28 18:52:48,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-28 18:52:51,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:52:58,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:52:58,151 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:52:59,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-28 18:53:01,555 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=110040.0, ans=0.1 2023-09-28 18:53:04,273 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:53:04,282 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-28 18:53:04,416 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:53:05,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 18:53:05,913 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=110040.0, ans=0.125 2023-09-28 18:53:07,567 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=110040.0, ans=0.09899494936611666 2023-09-28 18:53:10,937 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:53:11,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 18:53:11,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-28 18:53:11,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:53:14,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-28 18:53:15,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-28 18:53:15,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:53:15,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:53:15,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:53:15,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:53:21,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:53:22,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:53:25,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:53:25,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:53:28,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 18:53:28,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:53:28,385 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=110173.33333333333, ans=0.125 2023-09-28 18:53:29,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:53:31,137 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:53:31,220 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:53:32,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-28 18:53:32,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-28 18:53:35,851 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.907e+02 2.501e+02 3.093e+02 3.785e+02 7.626e+02, threshold=6.186e+02, percent-clipped=7.0 2023-09-28 18:53:37,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-28 18:53:41,433 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.98 vs. limit=15.0 2023-09-28 18:53:43,659 INFO [train.py:1039] (2/4) Epoch 4, batch 600, loss[loss=0.2782, simple_loss=0.3424, pruned_loss=0.107, over 24300.00 frames. ], tot_loss[loss=0.2738, simple_loss=0.3264, pruned_loss=0.1106, over 4487994.79 frames. ], batch size: 77, lr: 2.52e-02, grad_scale: 32.0 2023-09-28 18:53:43,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-28 18:53:43,897 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:53:45,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 18:53:45,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:53:53,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:53:55,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 18:53:57,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-28 18:53:58,649 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-28 18:53:59,176 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=8.33 vs. limit=12.0 2023-09-28 18:54:01,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:54:04,425 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:54:06,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-28 18:54:06,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:54:09,370 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=110306.66666666667, ans=0.125 2023-09-28 18:54:11,689 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.22 vs. limit=8.0 2023-09-28 18:54:12,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-28 18:54:18,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:54:18,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:54:18,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:54:21,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=110373.33333333333, ans=0.125 2023-09-28 18:54:25,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:54:25,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:54:26,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:54:29,759 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=110373.33333333333, ans=0.125 2023-09-28 18:54:31,144 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=110373.33333333333, ans=0.0 2023-09-28 18:54:33,824 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:54:35,962 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=110440.0, ans=0.125 2023-09-28 18:54:38,659 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:54:38,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:54:38,680 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:54:39,729 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=10.25 vs. limit=15.0 2023-09-28 18:54:44,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-28 18:54:50,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-28 18:54:50,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:54:51,167 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=110506.66666666667, ans=0.125 2023-09-28 18:54:54,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-28 18:54:56,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:55:00,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-28 18:55:00,141 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:55:00,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 18:55:01,014 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.41 vs. limit=10.0 2023-09-28 18:55:06,796 INFO [train.py:1039] (2/4) Epoch 4, batch 650, loss[loss=0.2644, simple_loss=0.3257, pruned_loss=0.1016, over 24648.00 frames. ], tot_loss[loss=0.2725, simple_loss=0.3248, pruned_loss=0.1101, over 4521160.88 frames. ], batch size: 65, lr: 2.52e-02, grad_scale: 32.0 2023-09-28 18:55:06,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 18:55:07,081 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:55:09,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:55:12,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:55:15,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:55:17,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-28 18:55:17,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:55:23,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:55:23,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:55:25,665 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=110640.0, ans=0.2 2023-09-28 18:55:29,111 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:55:30,839 WARNING [train.py:1197] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-28 18:55:34,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:55:34,497 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:55:38,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:55:38,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 18:55:41,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:55:42,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:55:42,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 18:55:43,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:55:44,549 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:55:46,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:55:46,237 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-28 18:55:46,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:55:46,272 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:55:49,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:55:50,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:55:52,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:55:52,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-28 18:55:54,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-28 18:55:54,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:55:55,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:55:57,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-28 18:55:57,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:55:59,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 18:56:02,072 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-28 18:56:03,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-28 18:56:05,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:56:05,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:56:05,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:56:05,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:56:08,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:56:15,478 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:56:15,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:56:15,668 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:56:18,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:56:18,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 18:56:20,136 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.763e+02 2.387e+02 2.700e+02 3.231e+02 6.128e+02, threshold=5.400e+02, percent-clipped=0.0 2023-09-28 18:56:20,289 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:56:26,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 18:56:26,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:56:27,734 INFO [train.py:1039] (2/4) Epoch 4, batch 700, loss[loss=0.3014, simple_loss=0.3586, pruned_loss=0.1221, over 23946.00 frames. ], tot_loss[loss=0.2703, simple_loss=0.3232, pruned_loss=0.1087, over 4558444.00 frames. ], batch size: 86, lr: 2.51e-02, grad_scale: 32.0 2023-09-28 18:56:27,826 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:56:27,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:56:32,541 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-28 18:56:34,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-28 18:56:36,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-28 18:56:36,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:56:40,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:56:41,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-28 18:56:45,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:56:45,871 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=110973.33333333333, ans=0.1 2023-09-28 18:56:48,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:56:48,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:56:49,050 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=110973.33333333333, ans=0.1 2023-09-28 18:56:50,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-28 18:56:51,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:56:53,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:56:56,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 18:56:56,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:56:58,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-28 18:57:00,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-28 18:57:05,073 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-28 18:57:05,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:57:08,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-28 18:57:11,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:57:13,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-28 18:57:18,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:57:19,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:57:19,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-28 18:57:22,223 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.74 vs. limit=22.5 2023-09-28 18:57:24,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:57:26,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:57:28,421 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.52 vs. limit=22.5 2023-09-28 18:57:30,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:57:31,292 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=16.74 vs. limit=15.0 2023-09-28 18:57:36,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:57:36,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-28 18:57:40,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-28 18:57:40,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-28 18:57:43,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:57:45,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:57:46,882 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:57:50,923 INFO [train.py:1039] (2/4) Epoch 4, batch 750, loss[loss=0.2942, simple_loss=0.3384, pruned_loss=0.125, over 23356.00 frames. ], tot_loss[loss=0.2693, simple_loss=0.3222, pruned_loss=0.1082, over 4583691.10 frames. ], batch size: 105, lr: 2.51e-02, grad_scale: 32.0 2023-09-28 18:57:51,047 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:57:51,056 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-28 18:57:54,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-28 18:57:54,904 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-28 18:57:55,134 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=111240.0, ans=0.2 2023-09-28 18:57:55,213 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=111240.0, ans=0.1 2023-09-28 18:57:56,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-28 18:57:57,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-28 18:57:57,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-28 18:57:57,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:57:59,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-28 18:57:59,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:58:01,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-28 18:58:02,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:58:04,209 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:58:05,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-28 18:58:05,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:58:07,280 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:58:08,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:58:10,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:58:12,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:58:14,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:58:14,127 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-28 18:58:15,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:58:17,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:58:18,708 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:58:20,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-28 18:58:21,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-28 18:58:21,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:58:22,179 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=111373.33333333333, ans=0.0 2023-09-28 18:58:22,200 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=111373.33333333333, ans=0.0 2023-09-28 18:58:25,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-28 18:58:25,641 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-28 18:58:26,397 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten.whitening_limit, batch_count=111373.33333333333, ans=15.0 2023-09-28 18:58:27,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-28 18:58:27,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:58:27,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 18:58:28,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:58:34,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-28 18:58:36,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:58:36,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 18:58:38,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:58:39,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:58:39,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-28 18:58:41,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 18:58:44,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-28 18:58:44,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:58:47,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:58:47,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-28 18:58:47,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:58:54,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:58:55,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:58:55,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:58:59,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:59:03,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-28 18:59:04,535 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.995e+02 2.483e+02 2.790e+02 3.186e+02 5.320e+02, threshold=5.579e+02, percent-clipped=0.0 2023-09-28 18:59:04,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:59:04,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:59:09,199 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:59:09,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:59:11,046 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=111573.33333333333, ans=0.1 2023-09-28 18:59:12,122 INFO [train.py:1039] (2/4) Epoch 4, batch 800, loss[loss=0.2883, simple_loss=0.3336, pruned_loss=0.1215, over 23553.00 frames. ], tot_loss[loss=0.2699, simple_loss=0.3227, pruned_loss=0.1086, over 4607881.37 frames. ], batch size: 135, lr: 2.51e-02, grad_scale: 32.0 2023-09-28 18:59:12,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:59:12,414 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-28 18:59:20,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:59:20,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:59:23,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:59:23,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:59:23,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:59:24,046 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:59:25,802 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=111573.33333333333, ans=0.125 2023-09-28 18:59:26,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:59:30,388 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=111640.0, ans=0.2 2023-09-28 18:59:31,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:59:31,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 18:59:35,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-28 18:59:37,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:59:38,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:59:39,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:59:39,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:59:39,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-28 18:59:39,151 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:59:39,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-28 18:59:42,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:59:45,434 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:59:46,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:59:46,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:59:50,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:59:50,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:59:56,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:59:56,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 18:59:56,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-28 18:59:58,517 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-28 18:59:59,936 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-28 18:59:59,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:59:59,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:00:01,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:00:02,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:00:08,926 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-28 19:00:09,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-28 19:00:10,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:00:13,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:00:18,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:00:21,595 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:00:23,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-28 19:00:23,145 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:00:26,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-28 19:00:29,652 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=111840.0, ans=0.0 2023-09-28 19:00:34,293 INFO [train.py:1039] (2/4) Epoch 4, batch 850, loss[loss=0.2751, simple_loss=0.3375, pruned_loss=0.1063, over 24363.00 frames. ], tot_loss[loss=0.2697, simple_loss=0.3229, pruned_loss=0.1083, over 4639385.74 frames. ], batch size: 77, lr: 2.50e-02, grad_scale: 32.0 2023-09-28 19:00:34,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:00:36,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:00:36,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-28 19:00:36,413 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=111906.66666666667, ans=0.125 2023-09-28 19:00:37,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:00:37,691 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:00:38,450 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.81 vs. limit=15.0 2023-09-28 19:00:39,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-28 19:00:40,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:00:40,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:00:42,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:00:44,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 19:00:46,614 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:00:48,191 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-28 19:00:48,264 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-28 19:00:48,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-28 19:00:49,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:00:49,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:00:51,708 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=111973.33333333333, ans=0.2 2023-09-28 19:00:52,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:00:52,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:00:52,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 19:00:56,380 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=111973.33333333333, ans=0.1 2023-09-28 19:00:58,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:00:59,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:00:59,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-28 19:01:01,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-28 19:01:04,951 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:01:06,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-28 19:01:10,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-28 19:01:12,376 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-28 19:01:12,589 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=112040.0, ans=0.125 2023-09-28 19:01:14,066 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-28 19:01:16,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:01:16,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:01:16,138 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 19:01:17,865 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:01:21,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:01:21,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-28 19:01:24,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:01:24,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:01:25,916 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:01:25,979 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-28 19:01:28,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:01:30,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-28 19:01:30,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-28 19:01:35,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:01:35,727 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:01:35,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:01:35,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:01:37,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:01:38,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:01:42,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-28 19:01:43,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:01:45,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:01:46,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-28 19:01:47,932 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.846e+02 2.332e+02 2.611e+02 3.097e+02 5.192e+02, threshold=5.223e+02, percent-clipped=0.0 2023-09-28 19:01:52,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-28 19:01:54,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:01:56,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-28 19:01:56,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:01:56,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:01:57,610 INFO [train.py:1039] (2/4) Epoch 4, batch 900, loss[loss=0.3796, simple_loss=0.3986, pruned_loss=0.1803, over 19619.00 frames. ], tot_loss[loss=0.2707, simple_loss=0.324, pruned_loss=0.1087, over 4669456.21 frames. ], batch size: 389, lr: 2.50e-02, grad_scale: 32.0 2023-09-28 19:01:59,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-28 19:02:02,424 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=112240.0, ans=0.0 2023-09-28 19:02:06,624 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:02:08,478 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=112240.0, ans=0.125 2023-09-28 19:02:09,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:02:09,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-28 19:02:11,656 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=112306.66666666667, ans=0.1 2023-09-28 19:02:13,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:02:13,941 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:02:15,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-28 19:02:15,211 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-28 19:02:16,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:02:16,701 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:02:16,784 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 19:02:18,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:02:28,184 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=112373.33333333333, ans=0.125 2023-09-28 19:02:29,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:02:29,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:02:29,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 19:02:33,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:02:38,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-28 19:02:40,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:02:44,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-28 19:02:46,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:02:46,794 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-28 19:02:48,268 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-28 19:02:53,154 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=112440.0, ans=0.125 2023-09-28 19:02:54,442 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-28 19:02:54,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:02:55,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 19:03:01,419 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:03:01,437 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:03:05,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-28 19:03:05,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:03:08,795 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-28 19:03:10,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:03:10,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:03:11,973 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=112506.66666666667, ans=0.2 2023-09-28 19:03:13,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:03:13,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:03:16,607 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-28 19:03:16,674 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-28 19:03:18,162 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-28 19:03:18,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-28 19:03:20,103 INFO [train.py:1039] (2/4) Epoch 4, batch 950, loss[loss=0.2768, simple_loss=0.3369, pruned_loss=0.1084, over 24416.00 frames. ], tot_loss[loss=0.2709, simple_loss=0.3244, pruned_loss=0.1087, over 4676489.34 frames. ], batch size: 66, lr: 2.50e-02, grad_scale: 32.0 2023-09-28 19:03:21,688 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:03:24,977 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=112573.33333333333, ans=0.2 2023-09-28 19:03:26,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-28 19:03:26,533 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=112573.33333333333, ans=0.125 2023-09-28 19:03:29,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:03:33,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:03:33,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:03:34,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 19:03:37,201 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-28 19:03:41,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:03:41,673 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:03:41,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:03:43,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:03:43,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-28 19:03:45,355 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-28 19:03:46,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:03:47,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-28 19:03:48,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:03:53,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:03:53,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:03:53,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:03:55,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-28 19:03:56,910 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 19:03:59,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:04:00,297 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=112706.66666666667, ans=0.1 2023-09-28 19:04:01,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:04:05,448 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:04:06,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:04:09,273 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-28 19:04:10,661 WARNING [train.py:1197] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 19:04:10,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 19:04:12,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:04:12,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:04:12,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:04:17,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-28 19:04:18,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:04:21,815 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:04:23,259 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:04:23,298 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-28 19:04:24,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:04:24,727 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 19:04:26,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-28 19:04:30,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 19:04:33,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:04:34,680 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.828e+02 2.515e+02 2.858e+02 3.350e+02 4.786e+02, threshold=5.716e+02, percent-clipped=0.0 2023-09-28 19:04:37,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:04:38,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-28 19:04:38,596 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-28 19:04:42,998 INFO [train.py:1039] (2/4) Epoch 4, batch 1000, loss[loss=0.2678, simple_loss=0.3027, pruned_loss=0.1164, over 23407.00 frames. ], tot_loss[loss=0.2703, simple_loss=0.3232, pruned_loss=0.1087, over 4688092.22 frames. ], batch size: 285, lr: 2.50e-02, grad_scale: 32.0 2023-09-28 19:04:43,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:04:46,814 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-28 19:04:46,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:04:51,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:04:53,428 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-28 19:04:53,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-28 19:04:59,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:04:59,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:04:59,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:05:00,615 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys.whitening_limit, batch_count=112973.33333333333, ans=6.0 2023-09-28 19:05:02,182 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=112973.33333333333, ans=0.125 2023-09-28 19:05:04,924 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-28 19:05:08,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-28 19:05:10,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-28 19:05:10,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:05:11,680 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-28 19:05:13,331 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-28 19:05:13,663 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=112973.33333333333, ans=0.125 2023-09-28 19:05:14,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-28 19:05:16,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:05:18,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:05:25,125 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=15.55 vs. limit=15.0 2023-09-28 19:05:27,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:05:27,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:05:29,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:05:30,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:05:30,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-28 19:05:30,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:05:31,073 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:05:32,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:05:32,601 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-28 19:05:36,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-28 19:05:37,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-28 19:05:39,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-28 19:05:40,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:05:42,816 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=113106.66666666667, ans=0.125 2023-09-28 19:05:47,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:05:47,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:05:47,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:05:49,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:05:50,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-28 19:05:51,854 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=113173.33333333333, ans=0.0 2023-09-28 19:05:52,914 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:05:52,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-28 19:05:53,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-28 19:05:55,903 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:05:55,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:05:58,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:06:02,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 19:06:03,365 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.47 vs. limit=15.0 2023-09-28 19:06:04,970 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:06:06,321 INFO [train.py:1039] (2/4) Epoch 4, batch 1050, loss[loss=0.2521, simple_loss=0.3209, pruned_loss=0.09165, over 24666.00 frames. ], tot_loss[loss=0.2683, simple_loss=0.3214, pruned_loss=0.1076, over 4701063.15 frames. ], batch size: 65, lr: 2.49e-02, grad_scale: 32.0 2023-09-28 19:06:08,262 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=113240.0, ans=0.0 2023-09-28 19:06:09,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:06:11,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:06:12,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 19:06:13,658 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.74 vs. limit=15.0 2023-09-28 19:06:14,452 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:06:15,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:06:16,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:06:18,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:06:21,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:06:22,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:06:22,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-28 19:06:24,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:06:24,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-28 19:06:25,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:06:26,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-28 19:06:29,883 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:06:29,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-28 19:06:29,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-28 19:06:38,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:06:38,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:06:40,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:06:41,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-28 19:06:41,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-28 19:06:42,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:06:45,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-28 19:06:45,447 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=113373.33333333333, ans=0.2 2023-09-28 19:06:48,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-28 19:06:48,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:06:51,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 19:06:54,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-28 19:06:55,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:06:55,105 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:06:58,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:07:02,052 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-28 19:07:02,364 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=113440.0, ans=0.0 2023-09-28 19:07:03,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-28 19:07:03,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-28 19:07:05,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:07:05,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:07:07,221 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-28 19:07:13,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:07:15,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:07:15,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:07:16,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:07:16,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:07:20,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:07:20,202 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-28 19:07:21,390 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.899e+02 2.368e+02 2.685e+02 3.530e+02 6.169e+02, threshold=5.370e+02, percent-clipped=1.0 2023-09-28 19:07:21,759 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=113506.66666666667, ans=0.125 2023-09-28 19:07:23,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:07:23,038 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-28 19:07:23,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-28 19:07:24,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:07:28,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:07:29,809 INFO [train.py:1039] (2/4) Epoch 4, batch 1100, loss[loss=0.2708, simple_loss=0.3367, pruned_loss=0.1025, over 24545.00 frames. ], tot_loss[loss=0.2673, simple_loss=0.3207, pruned_loss=0.1069, over 4713345.78 frames. ], batch size: 71, lr: 2.49e-02, grad_scale: 32.0 2023-09-28 19:07:34,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:07:38,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 19:07:38,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:07:40,330 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:07:40,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-28 19:07:43,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:07:43,647 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=113573.33333333333, ans=0.125 2023-09-28 19:07:45,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-28 19:07:45,998 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.97 vs. limit=15.0 2023-09-28 19:07:47,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:07:51,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 19:07:51,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-28 19:07:53,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 19:07:54,677 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:07:56,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:07:57,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:07:57,973 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=113640.0, ans=0.125 2023-09-28 19:08:00,630 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:08:04,667 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:08:06,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-28 19:08:07,943 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-28 19:08:09,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:08:11,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:08:12,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-28 19:08:15,016 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:08:16,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-28 19:08:16,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:08:16,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:08:16,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:08:16,789 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:08:16,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-28 19:08:17,067 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=113706.66666666667, ans=0.0 2023-09-28 19:08:23,399 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:08:23,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-28 19:08:23,699 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=113773.33333333333, ans=0.125 2023-09-28 19:08:26,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 19:08:30,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 19:08:31,231 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=113773.33333333333, ans=0.125 2023-09-28 19:08:34,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-28 19:08:34,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-28 19:08:37,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:08:40,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:08:40,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:08:42,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-28 19:08:43,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:08:43,966 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:08:44,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-28 19:08:44,175 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:08:45,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-28 19:08:49,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:08:49,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:08:51,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:08:52,499 INFO [train.py:1039] (2/4) Epoch 4, batch 1150, loss[loss=0.3755, simple_loss=0.3843, pruned_loss=0.1833, over 19381.00 frames. ], tot_loss[loss=0.2675, simple_loss=0.3215, pruned_loss=0.1067, over 4706898.20 frames. ], batch size: 389, lr: 2.49e-02, grad_scale: 32.0 2023-09-28 19:08:52,976 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=113906.66666666667, ans=0.0 2023-09-28 19:08:57,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:08:59,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:08:59,917 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=113906.66666666667, ans=0.2 2023-09-28 19:09:01,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:09:01,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:09:02,504 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-28 19:09:02,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:09:04,349 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=113906.66666666667, ans=0.125 2023-09-28 19:09:05,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-28 19:09:05,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:09:05,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 19:09:11,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-28 19:09:14,225 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:09:15,854 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=113973.33333333333, ans=0.125 2023-09-28 19:09:20,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:09:20,189 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:09:20,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-28 19:09:20,271 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:09:20,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:09:23,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-28 19:09:26,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:09:27,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:09:38,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:09:47,060 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:09:47,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-28 19:09:47,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:09:48,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:09:53,381 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-28 19:09:56,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:10:03,241 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-28 19:10:03,588 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=114173.33333333333, ans=0.0 2023-09-28 19:10:03,629 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=114173.33333333333, ans=0.125 2023-09-28 19:10:06,254 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.775e+02 2.334e+02 2.773e+02 3.498e+02 6.141e+02, threshold=5.547e+02, percent-clipped=2.0 2023-09-28 19:10:06,488 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:10:08,595 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-28 19:10:08,629 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-28 19:10:08,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:10:11,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:10:13,806 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=114240.0, ans=0.0 2023-09-28 19:10:14,788 INFO [train.py:1039] (2/4) Epoch 4, batch 1200, loss[loss=0.289, simple_loss=0.3297, pruned_loss=0.1242, over 22802.00 frames. ], tot_loss[loss=0.2686, simple_loss=0.3224, pruned_loss=0.1075, over 4709760.53 frames. ], batch size: 322, lr: 2.48e-02, grad_scale: 32.0 2023-09-28 19:10:15,092 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=114240.0, ans=0.0 2023-09-28 19:10:17,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:10:17,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:10:19,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:10:19,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:10:19,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:10:23,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:10:24,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 19:10:26,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:10:26,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:10:29,337 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-28 19:10:32,312 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-28 19:10:37,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 19:10:39,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:10:42,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:10:44,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:10:44,566 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-28 19:10:44,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:10:53,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-28 19:10:53,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:10:53,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-28 19:10:55,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:10:59,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-28 19:11:02,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-28 19:11:02,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:11:05,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:11:05,365 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=114440.0, ans=0.1 2023-09-28 19:11:06,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:11:06,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-28 19:11:08,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:11:08,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:11:08,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:11:10,777 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-28 19:11:10,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 19:11:12,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:11:12,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 19:11:15,959 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:11:15,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:11:20,607 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-28 19:11:22,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:11:23,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-28 19:11:27,134 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-28 19:11:29,332 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:11:32,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:11:34,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:11:35,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:11:37,074 INFO [train.py:1039] (2/4) Epoch 4, batch 1250, loss[loss=0.2845, simple_loss=0.3323, pruned_loss=0.1183, over 23784.00 frames. ], tot_loss[loss=0.2699, simple_loss=0.3236, pruned_loss=0.1081, over 4709574.83 frames. ], batch size: 212, lr: 2.48e-02, grad_scale: 32.0 2023-09-28 19:11:38,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-28 19:11:39,743 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=114573.33333333333, ans=0.2 2023-09-28 19:11:42,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:11:44,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:11:44,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-28 19:11:48,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:11:49,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 19:11:54,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 19:11:54,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:11:55,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 19:11:55,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:11:58,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-28 19:12:04,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 19:12:04,439 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-28 19:12:04,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:12:06,056 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:12:06,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:12:09,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:12:10,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-28 19:12:14,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-28 19:12:15,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-28 19:12:19,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:12:19,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-28 19:12:21,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:12:21,475 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-28 19:12:21,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:12:21,532 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:12:23,325 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=114706.66666666667, ans=0.125 2023-09-28 19:12:23,360 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=114706.66666666667, ans=0.07 2023-09-28 19:12:24,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:12:27,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:12:29,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:12:30,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-28 19:12:30,864 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-28 19:12:32,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-28 19:12:34,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:12:36,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-28 19:12:37,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:12:40,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-28 19:12:40,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:12:42,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-28 19:12:42,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-28 19:12:42,702 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=114840.0, ans=0.125 2023-09-28 19:12:43,884 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:12:43,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-28 19:12:44,512 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.95 vs. limit=15.0 2023-09-28 19:12:45,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:12:48,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-28 19:12:49,965 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:12:51,346 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.852e+02 2.462e+02 2.704e+02 3.277e+02 4.911e+02, threshold=5.408e+02, percent-clipped=0.0 2023-09-28 19:12:52,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:12:53,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 19:12:57,480 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-28 19:13:00,819 INFO [train.py:1039] (2/4) Epoch 4, batch 1300, loss[loss=0.2721, simple_loss=0.3142, pruned_loss=0.115, over 22800.00 frames. ], tot_loss[loss=0.2701, simple_loss=0.3238, pruned_loss=0.1082, over 4716467.41 frames. ], batch size: 323, lr: 2.48e-02, grad_scale: 32.0 2023-09-28 19:13:01,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:13:02,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-28 19:13:05,641 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:13:07,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-28 19:13:08,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:13:12,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:13:13,647 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:13:13,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-28 19:13:14,062 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=114906.66666666667, ans=0.1 2023-09-28 19:13:19,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 19:13:20,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-28 19:13:21,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-28 19:13:26,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 19:13:28,720 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=114973.33333333333, ans=0.1 2023-09-28 19:13:30,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:13:30,675 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:13:32,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:13:34,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:13:35,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 19:13:36,067 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=115040.0, ans=0.2 2023-09-28 19:13:37,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-28 19:13:37,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-28 19:13:43,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:13:43,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 19:13:45,708 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-28 19:13:47,135 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 19:13:48,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:13:51,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:13:53,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-28 19:13:53,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:13:53,329 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-28 19:13:54,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:13:59,426 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:13:59,430 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:14:02,704 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=115106.66666666667, ans=0.0 2023-09-28 19:14:04,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-28 19:14:04,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-28 19:14:06,068 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-28 19:14:08,763 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=115173.33333333333, ans=0.125 2023-09-28 19:14:09,150 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.87 vs. limit=22.5 2023-09-28 19:14:10,218 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=115173.33333333333, ans=0.1 2023-09-28 19:14:11,369 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:14:13,087 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-28 19:14:14,635 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:14:16,694 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.46 vs. limit=22.5 2023-09-28 19:14:22,655 INFO [train.py:1039] (2/4) Epoch 4, batch 1350, loss[loss=0.2678, simple_loss=0.3362, pruned_loss=0.09974, over 24293.00 frames. ], tot_loss[loss=0.2683, simple_loss=0.3225, pruned_loss=0.107, over 4728552.09 frames. ], batch size: 74, lr: 2.47e-02, grad_scale: 32.0 2023-09-28 19:14:24,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-28 19:14:28,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:14:30,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:14:33,702 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:14:33,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:14:35,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:14:35,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:14:43,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:14:44,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-28 19:14:44,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-28 19:14:46,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:14:48,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-28 19:14:49,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:14:51,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:14:51,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-28 19:14:52,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-28 19:14:53,218 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=115306.66666666667, ans=0.1 2023-09-28 19:14:54,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-28 19:14:56,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:14:56,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-28 19:14:59,974 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=115373.33333333333, ans=0.125 2023-09-28 19:15:07,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:15:16,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:15:18,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:15:18,304 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-28 19:15:21,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:15:21,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-28 19:15:21,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-28 19:15:23,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:15:26,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:15:29,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-28 19:15:30,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:15:32,605 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=115506.66666666667, ans=0.125 2023-09-28 19:15:34,131 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=115506.66666666667, ans=0.2 2023-09-28 19:15:36,830 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.790e+02 2.262e+02 2.530e+02 3.010e+02 4.866e+02, threshold=5.060e+02, percent-clipped=0.0 2023-09-28 19:15:38,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-28 19:15:40,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-28 19:15:45,547 INFO [train.py:1039] (2/4) Epoch 4, batch 1400, loss[loss=0.2639, simple_loss=0.3147, pruned_loss=0.1065, over 23583.00 frames. ], tot_loss[loss=0.2667, simple_loss=0.3204, pruned_loss=0.1065, over 4713713.82 frames. ], batch size: 149, lr: 2.47e-02, grad_scale: 32.0 2023-09-28 19:15:45,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-28 19:15:45,960 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer_ff2.min_abs, batch_count=115573.33333333333, ans=0.1 2023-09-28 19:15:48,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:15:52,480 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:15:52,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:15:56,639 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.15 vs. limit=10.0 2023-09-28 19:15:57,362 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-28 19:16:00,225 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-28 19:16:04,356 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=115640.0, ans=0.125 2023-09-28 19:16:08,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 19:16:10,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:16:12,349 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:16:13,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:16:13,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-28 19:16:17,215 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:16:20,847 WARNING [train.py:1197] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 19:16:28,988 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:16:29,230 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=115706.66666666667, ans=0.125 2023-09-28 19:16:30,383 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:16:31,029 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.70 vs. limit=15.0 2023-09-28 19:16:35,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-28 19:16:37,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:16:38,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:16:38,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:16:38,752 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:16:40,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:16:40,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:16:40,299 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:16:43,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-28 19:16:43,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:16:49,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:16:52,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:17:00,197 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-28 19:17:01,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 19:17:03,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:17:04,887 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=115840.0, ans=0.125 2023-09-28 19:17:06,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 19:17:06,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:17:07,664 INFO [train.py:1039] (2/4) Epoch 4, batch 1450, loss[loss=0.2745, simple_loss=0.3191, pruned_loss=0.1149, over 23832.00 frames. ], tot_loss[loss=0.2673, simple_loss=0.3207, pruned_loss=0.107, over 4721561.73 frames. ], batch size: 212, lr: 2.47e-02, grad_scale: 32.0 2023-09-28 19:17:07,896 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:17:12,285 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.89 vs. limit=6.0 2023-09-28 19:17:12,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:17:15,957 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:17:15,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:17:15,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-28 19:17:20,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:17:20,854 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 19:17:22,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:17:22,456 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-28 19:17:22,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 19:17:23,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-28 19:17:24,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:17:25,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:17:25,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-28 19:17:27,709 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:17:29,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-28 19:17:29,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 19:17:31,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:17:31,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:17:33,236 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:17:34,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:17:38,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:17:38,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:17:39,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:17:41,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:17:42,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:17:42,983 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:17:43,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:17:45,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:17:49,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-28 19:17:51,374 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=116040.0, ans=0.125 2023-09-28 19:17:52,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:17:54,334 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-28 19:17:55,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:17:59,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:17:59,387 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:18:01,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-28 19:18:06,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:18:06,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-28 19:18:08,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-28 19:18:09,676 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:18:12,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:18:14,113 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:18:15,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-28 19:18:17,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-28 19:18:17,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-28 19:18:20,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:18:22,089 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.906e+02 2.300e+02 2.689e+02 3.268e+02 5.170e+02, threshold=5.379e+02, percent-clipped=2.0 2023-09-28 19:18:22,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 19:18:29,433 INFO [train.py:1039] (2/4) Epoch 4, batch 1500, loss[loss=0.257, simple_loss=0.3249, pruned_loss=0.09456, over 24075.00 frames. ], tot_loss[loss=0.2672, simple_loss=0.3213, pruned_loss=0.1065, over 4728527.56 frames. ], batch size: 80, lr: 2.46e-02, grad_scale: 32.0 2023-09-28 19:18:34,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-28 19:18:34,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:18:34,986 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:18:36,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:18:38,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:18:38,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:18:40,314 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-28 19:18:41,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 19:18:41,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-28 19:18:41,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:18:43,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:18:44,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:18:47,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:18:53,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:18:53,148 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-28 19:18:54,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-28 19:18:54,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:18:56,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:18:57,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-28 19:19:02,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-28 19:19:04,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:19:04,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-28 19:19:05,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-28 19:19:09,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 19:19:09,641 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:19:09,663 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:19:11,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-28 19:19:11,267 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:19:13,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:19:13,497 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-28 19:19:14,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:19:21,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:19:21,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-28 19:19:28,011 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 19:19:29,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 19:19:31,227 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=116440.0, ans=0.125 2023-09-28 19:19:34,029 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-28 19:19:35,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:19:35,490 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-28 19:19:37,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:19:37,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:19:38,729 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-28 19:19:40,248 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-28 19:19:41,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-28 19:19:44,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:19:47,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:19:47,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:19:49,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:19:49,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:19:51,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 19:19:51,367 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-28 19:19:52,782 INFO [train.py:1039] (2/4) Epoch 4, batch 1550, loss[loss=0.264, simple_loss=0.3253, pruned_loss=0.1013, over 24461.00 frames. ], tot_loss[loss=0.2678, simple_loss=0.3216, pruned_loss=0.107, over 4726953.69 frames. ], batch size: 63, lr: 2.46e-02, grad_scale: 32.0 2023-09-28 19:19:52,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-28 19:19:52,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:19:53,038 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-28 19:19:54,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-28 19:19:58,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:20:00,853 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:20:00,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:20:00,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:20:02,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:20:03,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:20:05,939 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=116573.33333333333, ans=0.0 2023-09-28 19:20:07,076 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-28 19:20:07,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:20:07,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 19:20:07,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 19:20:10,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:20:10,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-28 19:20:11,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:20:11,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-28 19:20:13,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-28 19:20:13,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-28 19:20:13,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:20:14,301 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.59 vs. limit=6.0 2023-09-28 19:20:15,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:20:21,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:20:22,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-28 19:20:22,657 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-28 19:20:32,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:20:36,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:20:36,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-28 19:20:36,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:20:36,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-28 19:20:36,857 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=116706.66666666667, ans=0.125 2023-09-28 19:20:41,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 19:20:42,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:20:45,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:20:46,085 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=116773.33333333333, ans=0.1 2023-09-28 19:20:48,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:20:48,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:20:48,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-28 19:20:50,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 19:20:51,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:20:52,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:20:53,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-28 19:20:53,978 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-28 19:20:56,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:21:02,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-28 19:21:06,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:21:07,906 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.992e+02 2.334e+02 2.588e+02 3.124e+02 7.530e+02, threshold=5.176e+02, percent-clipped=1.0 2023-09-28 19:21:08,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:21:09,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-28 19:21:11,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 19:21:12,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:21:12,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:21:12,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:21:14,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:21:15,416 INFO [train.py:1039] (2/4) Epoch 4, batch 1600, loss[loss=0.2361, simple_loss=0.3093, pruned_loss=0.08145, over 24671.00 frames. ], tot_loss[loss=0.2697, simple_loss=0.3229, pruned_loss=0.1083, over 4722789.94 frames. ], batch size: 68, lr: 2.46e-02, grad_scale: 32.0 2023-09-28 19:21:18,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:21:18,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-28 19:21:20,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-28 19:21:20,430 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=116906.66666666667, ans=0.0 2023-09-28 19:21:22,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-28 19:21:23,295 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=116906.66666666667, ans=0.125 2023-09-28 19:21:24,680 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:21:26,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-28 19:21:28,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:21:29,204 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.42 vs. limit=22.5 2023-09-28 19:21:30,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:21:34,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:21:40,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-28 19:21:43,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:21:43,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-28 19:21:43,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:21:45,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-28 19:21:49,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-28 19:21:57,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:22:01,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-28 19:22:02,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:22:03,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:22:03,362 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:22:03,642 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=117106.66666666667, ans=0.2 2023-09-28 19:22:05,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-28 19:22:09,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 19:22:11,280 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:22:12,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:22:12,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:22:12,858 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:22:16,456 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:22:18,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:22:19,559 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:22:24,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:22:25,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:22:27,711 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=117173.33333333333, ans=0.0 2023-09-28 19:22:27,744 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=117173.33333333333, ans=0.95 2023-09-28 19:22:29,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-28 19:22:29,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:22:29,187 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-28 19:22:29,720 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=117173.33333333333, ans=0.1 2023-09-28 19:22:34,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:22:36,176 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=117173.33333333333, ans=0.125 2023-09-28 19:22:37,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:22:38,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:22:39,566 INFO [train.py:1039] (2/4) Epoch 4, batch 1650, loss[loss=0.2636, simple_loss=0.3033, pruned_loss=0.112, over 23847.00 frames. ], tot_loss[loss=0.2691, simple_loss=0.3232, pruned_loss=0.1075, over 4727716.25 frames. ], batch size: 195, lr: 2.46e-02, grad_scale: 32.0 2023-09-28 19:22:39,643 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-28 19:22:39,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-28 19:22:39,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-28 19:22:39,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-28 19:22:44,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:22:45,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:22:45,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:22:47,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-28 19:22:50,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:22:53,928 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-28 19:22:55,625 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:22:55,955 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=117306.66666666667, ans=0.125 2023-09-28 19:22:57,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:22:57,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:22:57,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:22:57,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-28 19:22:57,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-28 19:23:04,385 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 19:23:07,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-28 19:23:09,139 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=117306.66666666667, ans=0.0 2023-09-28 19:23:18,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-28 19:23:19,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:23:20,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-28 19:23:23,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:23:26,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:23:26,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:23:26,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:23:26,695 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.37 vs. limit=15.0 2023-09-28 19:23:27,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:23:29,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:23:31,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:23:31,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:23:32,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:23:32,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:23:32,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:23:33,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 19:23:33,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=117440.0, ans=0.125 2023-09-28 19:23:33,694 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.48 vs. limit=15.0 2023-09-28 19:23:36,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:23:37,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-28 19:23:39,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:23:39,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-28 19:23:42,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-28 19:23:42,523 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-28 19:23:42,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:23:43,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:23:45,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:23:45,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:23:45,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-28 19:23:50,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:23:52,617 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=117506.66666666667, ans=0.0 2023-09-28 19:23:53,773 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.658e+02 2.422e+02 2.762e+02 3.244e+02 4.441e+02, threshold=5.524e+02, percent-clipped=0.0 2023-09-28 19:23:53,889 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:23:53,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:23:55,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-28 19:24:00,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:24:00,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:24:00,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-28 19:24:01,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:24:01,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:24:01,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:24:02,350 INFO [train.py:1039] (2/4) Epoch 4, batch 1700, loss[loss=0.2848, simple_loss=0.3336, pruned_loss=0.118, over 23439.00 frames. ], tot_loss[loss=0.2683, simple_loss=0.3222, pruned_loss=0.1072, over 4730049.38 frames. ], batch size: 105, lr: 2.45e-02, grad_scale: 32.0 2023-09-28 19:24:05,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:24:06,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:24:06,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-28 19:24:10,429 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 19:24:10,843 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=117573.33333333333, ans=0.125 2023-09-28 19:24:20,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:24:22,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:24:27,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-28 19:24:28,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-28 19:24:30,033 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:24:30,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:24:31,739 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-28 19:24:32,450 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=22.23 vs. limit=22.5 2023-09-28 19:24:33,557 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:24:34,641 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:24:34,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:24:36,998 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=117706.66666666667, ans=0.125 2023-09-28 19:24:38,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:24:39,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-28 19:24:41,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-28 19:24:43,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-28 19:24:43,620 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:24:45,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-28 19:24:46,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:24:56,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:24:58,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:24:59,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-28 19:25:00,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-28 19:25:00,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-28 19:25:01,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:25:03,157 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:25:03,158 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-28 19:25:04,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:25:04,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:25:04,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:25:04,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:25:07,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:25:07,651 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:25:09,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:25:09,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:25:09,284 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:25:14,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:25:16,072 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-28 19:25:18,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:25:18,301 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:25:19,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-28 19:25:24,954 INFO [train.py:1039] (2/4) Epoch 4, batch 1750, loss[loss=0.256, simple_loss=0.319, pruned_loss=0.09649, over 24308.00 frames. ], tot_loss[loss=0.2671, simple_loss=0.3206, pruned_loss=0.1068, over 4717091.54 frames. ], batch size: 61, lr: 2.45e-02, grad_scale: 32.0 2023-09-28 19:25:26,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:25:28,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:25:28,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-28 19:25:30,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-28 19:25:31,691 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:25:34,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:25:34,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:25:39,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-28 19:25:40,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:25:44,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-28 19:25:44,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:25:46,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:25:48,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 19:25:51,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-28 19:25:51,915 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:25:53,411 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-28 19:26:03,863 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:26:06,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:26:06,908 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:26:12,757 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:26:12,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:26:13,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=118106.66666666667, ans=0.0 2023-09-28 19:26:14,460 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:26:15,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:26:18,175 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:26:18,454 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=118106.66666666667, ans=0.125 2023-09-28 19:26:19,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:26:19,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-28 19:26:21,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:26:24,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-28 19:26:24,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:26:25,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:26:27,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:26:32,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 19:26:32,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-28 19:26:32,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:26:35,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:26:36,233 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=118173.33333333333, ans=0.5 2023-09-28 19:26:39,663 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.738e+02 2.369e+02 2.693e+02 3.225e+02 5.418e+02, threshold=5.386e+02, percent-clipped=0.0 2023-09-28 19:26:41,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:26:43,942 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.43 vs. limit=15.0 2023-09-28 19:26:44,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:26:44,583 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:26:44,851 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=118173.33333333333, ans=0.125 2023-09-28 19:26:47,474 INFO [train.py:1039] (2/4) Epoch 4, batch 1800, loss[loss=0.2416, simple_loss=0.2955, pruned_loss=0.09386, over 24426.00 frames. ], tot_loss[loss=0.266, simple_loss=0.3197, pruned_loss=0.1061, over 4720506.55 frames. ], batch size: 58, lr: 2.45e-02, grad_scale: 32.0 2023-09-28 19:26:47,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-28 19:26:47,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:26:48,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-28 19:26:48,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:26:49,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:26:49,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:26:50,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:26:52,826 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:26:54,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:26:55,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 19:26:58,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:27:00,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 19:27:04,687 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:27:07,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:27:10,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:27:10,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:27:12,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:27:14,174 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:27:14,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-28 19:27:16,291 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:27:19,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:27:19,665 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:27:21,069 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-28 19:27:24,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-28 19:27:24,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-28 19:27:24,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:27:26,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:27:26,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:27:27,725 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:27:33,994 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-28 19:27:34,167 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:27:36,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:27:38,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-28 19:27:38,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-28 19:27:40,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:27:41,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:27:41,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 19:27:46,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-28 19:27:51,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:27:51,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-28 19:27:51,889 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:27:53,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:27:53,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-28 19:27:53,431 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-28 19:27:58,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:27:58,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:27:58,727 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=118506.66666666667, ans=0.0 2023-09-28 19:28:01,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-28 19:28:01,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:28:03,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:28:04,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-28 19:28:04,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:28:06,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:28:06,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:28:10,326 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:28:10,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:28:11,768 INFO [train.py:1039] (2/4) Epoch 4, batch 1850, loss[loss=0.2818, simple_loss=0.3409, pruned_loss=0.1113, over 24039.00 frames. ], tot_loss[loss=0.2661, simple_loss=0.3201, pruned_loss=0.1061, over 4710624.83 frames. ], batch size: 80, lr: 2.44e-02, grad_scale: 32.0 2023-09-28 19:28:13,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:28:13,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:28:23,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:28:23,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-28 19:28:26,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-28 19:28:29,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-28 19:28:33,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:28:35,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-28 19:28:35,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 19:28:39,333 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=118640.0, ans=0.125 2023-09-28 19:28:44,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:28:46,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-28 19:28:49,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:28:50,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:28:54,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-28 19:28:54,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:28:54,272 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 19:28:57,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:28:59,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:29:01,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:29:04,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-28 19:29:04,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:29:05,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 19:29:05,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:29:09,360 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:29:09,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:29:14,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-28 19:29:14,095 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:29:19,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:29:19,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 19:29:19,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-28 19:29:19,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-28 19:29:20,950 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-28 19:29:21,065 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-28 19:29:24,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 19:29:24,622 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:29:24,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:29:24,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:29:24,787 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-28 19:29:24,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:29:26,178 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:29:27,141 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=118840.0, ans=0.0 2023-09-28 19:29:27,995 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.758e+02 2.551e+02 2.974e+02 3.413e+02 5.793e+02, threshold=5.947e+02, percent-clipped=2.0 2023-09-28 19:29:28,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-28 19:29:28,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 19:29:29,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:29:29,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-28 19:29:31,032 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.68 vs. limit=6.0 2023-09-28 19:29:33,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:29:33,090 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-28 19:29:33,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:29:34,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:29:36,064 INFO [train.py:1039] (2/4) Epoch 4, batch 1900, loss[loss=0.3624, simple_loss=0.374, pruned_loss=0.1754, over 19165.00 frames. ], tot_loss[loss=0.2694, simple_loss=0.3224, pruned_loss=0.1083, over 4699112.46 frames. ], batch size: 389, lr: 2.44e-02, grad_scale: 32.0 2023-09-28 19:29:39,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:29:41,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:29:43,555 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-28 19:29:43,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-28 19:29:45,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:29:46,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:29:46,615 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-28 19:29:46,681 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-28 19:29:51,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-28 19:29:52,938 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.51 vs. limit=5.0 2023-09-28 19:29:54,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:29:57,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-28 19:30:00,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-28 19:30:08,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-28 19:30:11,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-28 19:30:12,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:30:13,007 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-28 19:30:13,013 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-28 19:30:14,391 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-28 19:30:14,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-28 19:30:14,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:30:20,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-28 19:30:23,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:30:25,243 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=119106.66666666667, ans=0.1 2023-09-28 19:30:27,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:30:27,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-28 19:30:29,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:30:32,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-28 19:30:33,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:30:42,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 19:30:42,607 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:30:42,628 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:30:42,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:30:43,458 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.84 vs. limit=15.0 2023-09-28 19:30:44,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 19:30:45,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-28 19:30:45,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:30:48,828 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:30:48,830 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:30:52,007 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:30:52,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:30:54,132 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:30:54,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:30:56,127 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=119173.33333333333, ans=0.125 2023-09-28 19:30:58,741 INFO [train.py:1039] (2/4) Epoch 4, batch 1950, loss[loss=0.2855, simple_loss=0.3329, pruned_loss=0.119, over 23767.00 frames. ], tot_loss[loss=0.2705, simple_loss=0.3234, pruned_loss=0.1088, over 4704459.76 frames. ], batch size: 212, lr: 2.44e-02, grad_scale: 32.0 2023-09-28 19:30:58,885 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:31:01,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:31:03,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:31:03,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 19:31:05,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-28 19:31:06,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 19:31:07,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:31:10,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:31:10,798 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=119240.0, ans=0.125 2023-09-28 19:31:13,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:31:13,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:31:13,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:31:14,431 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=119306.66666666667, ans=0.125 2023-09-28 19:31:15,777 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=119306.66666666667, ans=0.125 2023-09-28 19:31:17,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:31:18,836 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:31:18,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 19:31:20,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:31:20,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:31:24,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:31:27,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:31:27,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:31:27,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-28 19:31:27,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-28 19:31:29,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 19:31:29,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:31:29,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:31:36,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:31:37,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:31:42,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 19:31:46,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:31:46,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:31:46,678 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-28 19:31:46,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:31:52,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:31:52,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-28 19:31:52,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-28 19:32:00,062 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:32:01,509 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:32:06,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:32:08,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:32:09,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:32:11,334 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:32:11,443 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-28 19:32:11,453 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 19:32:12,784 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.736e+02 2.624e+02 2.939e+02 3.496e+02 6.198e+02, threshold=5.878e+02, percent-clipped=1.0 2023-09-28 19:32:12,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:32:14,423 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-28 19:32:16,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:32:21,589 INFO [train.py:1039] (2/4) Epoch 4, batch 2000, loss[loss=0.2653, simple_loss=0.333, pruned_loss=0.0988, over 24386.00 frames. ], tot_loss[loss=0.2704, simple_loss=0.3232, pruned_loss=0.1088, over 4702632.38 frames. ], batch size: 69, lr: 2.44e-02, grad_scale: 32.0 2023-09-28 19:32:21,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-28 19:32:23,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:32:23,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:32:23,494 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=119573.33333333333, ans=0.125 2023-09-28 19:32:26,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:32:26,994 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:32:31,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-28 19:32:31,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:32:34,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:32:34,895 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=119573.33333333333, ans=0.125 2023-09-28 19:32:36,222 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-28 19:32:36,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 19:32:36,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:32:39,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:32:41,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-28 19:32:42,009 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=119640.0, ans=0.0 2023-09-28 19:32:43,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:32:44,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:32:44,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:32:47,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-28 19:32:47,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 19:32:49,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-28 19:32:50,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:32:53,667 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:32:53,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-28 19:32:53,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:32:55,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:32:55,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:32:56,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-28 19:33:00,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-28 19:33:00,454 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:33:00,477 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:33:03,664 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=119706.66666666667, ans=0.125 2023-09-28 19:33:06,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:33:07,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:33:07,994 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 19:33:09,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:33:11,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:33:11,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:33:13,386 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 19:33:13,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:33:16,313 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:33:19,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:33:19,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-28 19:33:24,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 19:33:27,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:33:29,328 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=119840.0, ans=0.0 2023-09-28 19:33:32,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:33:32,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:33:35,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:33:38,871 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=6.51 vs. limit=15.0 2023-09-28 19:33:39,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:33:39,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:33:39,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 19:33:39,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 19:33:42,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:33:44,037 INFO [train.py:1039] (2/4) Epoch 4, batch 2050, loss[loss=0.2805, simple_loss=0.3488, pruned_loss=0.1061, over 24315.00 frames. ], tot_loss[loss=0.2687, simple_loss=0.3216, pruned_loss=0.1079, over 4713906.14 frames. ], batch size: 74, lr: 2.43e-02, grad_scale: 32.0 2023-09-28 19:33:44,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:33:47,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:33:48,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:33:55,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:33:57,157 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:33:57,249 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:33:57,831 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.38 vs. limit=15.0 2023-09-28 19:33:58,775 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:34:02,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-28 19:34:02,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:34:02,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:34:02,821 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=119973.33333333333, ans=0.125 2023-09-28 19:34:04,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-28 19:34:14,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:34:14,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:34:17,797 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-28 19:34:19,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:34:20,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-28 19:34:20,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:34:24,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:34:27,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:34:29,238 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-28 19:34:29,318 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:34:30,884 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:34:31,018 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:34:31,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:34:36,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:34:37,827 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:34:39,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-28 19:34:39,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:34:44,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 19:34:49,224 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:34:50,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-28 19:34:55,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:34:57,724 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:34:59,046 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.851e+02 2.560e+02 3.020e+02 3.672e+02 5.923e+02, threshold=6.041e+02, percent-clipped=1.0 2023-09-28 19:35:00,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:35:02,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-28 19:35:06,670 INFO [train.py:1039] (2/4) Epoch 4, batch 2100, loss[loss=0.2262, simple_loss=0.2844, pruned_loss=0.08396, over 24634.00 frames. ], tot_loss[loss=0.2669, simple_loss=0.3195, pruned_loss=0.1072, over 4716527.77 frames. ], batch size: 60, lr: 2.43e-02, grad_scale: 32.0 2023-09-28 19:35:06,831 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-28 19:35:06,832 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:35:06,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:35:08,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 19:35:09,034 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:35:09,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-28 19:35:10,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-28 19:35:12,073 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 19:35:15,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:35:17,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:35:19,356 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=120240.0, ans=0.0 2023-09-28 19:35:20,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:35:22,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:35:22,056 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-28 19:35:22,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:35:22,510 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=120306.66666666667, ans=0.0 2023-09-28 19:35:23,614 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-28 19:35:23,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-28 19:35:25,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:35:26,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:35:26,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-28 19:35:26,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 19:35:31,271 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.35 vs. limit=15.0 2023-09-28 19:35:32,425 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-28 19:35:32,426 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 19:35:32,706 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=120306.66666666667, ans=0.1 2023-09-28 19:35:37,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:35:37,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:35:40,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:35:40,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-28 19:35:40,313 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=120373.33333333333, ans=0.125 2023-09-28 19:35:41,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:35:41,768 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 19:35:43,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-28 19:35:43,533 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:35:43,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-28 19:35:45,561 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-28 19:35:45,636 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-28 19:35:49,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-28 19:35:50,796 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:35:52,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 19:35:54,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 19:35:57,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:35:58,406 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.17 vs. limit=15.0 2023-09-28 19:35:59,083 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:35:59,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-28 19:35:59,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:35:59,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:36:00,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:36:00,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-28 19:36:02,116 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-28 19:36:02,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-28 19:36:06,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:36:09,145 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:36:10,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-28 19:36:13,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:36:17,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:36:18,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:36:18,856 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:36:18,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-28 19:36:20,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 19:36:23,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:36:23,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:36:23,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:36:23,918 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:36:27,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-28 19:36:30,149 INFO [train.py:1039] (2/4) Epoch 4, batch 2150, loss[loss=0.2498, simple_loss=0.3215, pruned_loss=0.08907, over 24671.00 frames. ], tot_loss[loss=0.2649, simple_loss=0.3182, pruned_loss=0.1058, over 4719903.84 frames. ], batch size: 73, lr: 2.43e-02, grad_scale: 32.0 2023-09-28 19:36:30,239 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-28 19:36:30,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:36:31,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:36:31,833 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:36:31,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:36:33,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:36:37,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 19:36:38,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:36:41,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:36:44,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:36:44,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:36:44,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:36:47,687 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:36:49,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:36:49,242 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:36:51,082 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=120640.0, ans=0.125 2023-09-28 19:36:52,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:36:52,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-28 19:36:55,620 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.38 vs. limit=10.0 2023-09-28 19:36:56,647 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=120640.0, ans=0.1 2023-09-28 19:36:57,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:37:00,085 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:37:00,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:37:02,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:37:02,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:37:02,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-28 19:37:02,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:37:02,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:37:04,033 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:37:04,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-28 19:37:06,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:37:08,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:37:08,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:37:09,175 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=120706.66666666667, ans=0.125 2023-09-28 19:37:10,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 19:37:12,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:37:15,066 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:37:16,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:37:16,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:37:16,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-28 19:37:18,210 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-28 19:37:21,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:37:21,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:37:22,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:37:24,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 19:37:24,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:37:26,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:37:26,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-28 19:37:27,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-28 19:37:27,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-28 19:37:27,859 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-28 19:37:29,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:37:30,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:37:31,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-28 19:37:31,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:37:32,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-28 19:37:32,909 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-28 19:37:32,909 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-28 19:37:32,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-28 19:37:35,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:37:36,688 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:37:36,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:37:36,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:37:38,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 19:37:38,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:37:40,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:37:45,483 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.874e+02 2.345e+02 2.868e+02 3.477e+02 5.291e+02, threshold=5.737e+02, percent-clipped=0.0 2023-09-28 19:37:45,906 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=120840.0, ans=0.1 2023-09-28 19:37:46,007 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=120840.0, ans=0.0 2023-09-28 19:37:47,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:37:48,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-28 19:37:53,285 INFO [train.py:1039] (2/4) Epoch 4, batch 2200, loss[loss=0.2682, simple_loss=0.3381, pruned_loss=0.09913, over 24533.00 frames. ], tot_loss[loss=0.2642, simple_loss=0.318, pruned_loss=0.1052, over 4725593.44 frames. ], batch size: 71, lr: 2.42e-02, grad_scale: 32.0 2023-09-28 19:37:53,362 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:37:58,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:37:58,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:37:59,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:38:01,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:38:04,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:38:04,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:38:04,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-28 19:38:11,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-28 19:38:11,930 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=120973.33333333333, ans=0.125 2023-09-28 19:38:13,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 19:38:20,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-28 19:38:23,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:38:25,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:38:25,247 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:38:28,632 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:38:30,054 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-28 19:38:31,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-28 19:38:31,936 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=121040.0, ans=0.125 2023-09-28 19:38:34,573 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:38:34,690 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-28 19:38:38,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:38:39,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:38:40,127 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=121040.0, ans=0.0 2023-09-28 19:38:43,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:38:44,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:38:48,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-28 19:38:48,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:38:49,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-28 19:38:51,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:38:51,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-28 19:38:51,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:38:55,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:38:56,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:38:56,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:38:56,544 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:38:57,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-28 19:38:58,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:38:58,967 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=21.06 vs. limit=22.5 2023-09-28 19:38:59,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 19:39:02,874 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 19:39:04,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:39:07,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-28 19:39:07,539 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-28 19:39:10,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 19:39:12,022 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-28 19:39:12,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-28 19:39:14,067 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-28 19:39:15,505 INFO [train.py:1039] (2/4) Epoch 4, batch 2250, loss[loss=0.2966, simple_loss=0.325, pruned_loss=0.1341, over 23473.00 frames. ], tot_loss[loss=0.2638, simple_loss=0.3183, pruned_loss=0.1047, over 4737705.32 frames. ], batch size: 285, lr: 2.42e-02, grad_scale: 64.0 2023-09-28 19:39:15,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:39:17,139 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-28 19:39:18,193 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=121240.0, ans=0.1 2023-09-28 19:39:19,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:39:20,742 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-28 19:39:20,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:39:24,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:39:30,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:39:32,351 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:39:38,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:39:38,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 19:39:38,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:39:40,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-28 19:39:40,444 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:39:41,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:39:42,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-28 19:39:42,240 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:39:42,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:39:45,281 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 19:39:50,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:39:52,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 19:39:52,761 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-28 19:39:54,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-28 19:39:57,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:39:58,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:40:02,030 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.41 vs. limit=15.0 2023-09-28 19:40:02,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:40:02,894 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=121373.33333333333, ans=0.1 2023-09-28 19:40:03,011 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=121373.33333333333, ans=0.07 2023-09-28 19:40:04,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:40:05,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:40:05,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:40:08,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:40:10,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:40:13,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:40:16,619 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-28 19:40:21,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 19:40:21,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-28 19:40:21,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:40:25,226 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.55 vs. limit=10.0 2023-09-28 19:40:27,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 19:40:30,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-28 19:40:30,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-28 19:40:32,080 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.851e+02 2.293e+02 2.607e+02 2.902e+02 3.937e+02, threshold=5.215e+02, percent-clipped=0.0 2023-09-28 19:40:32,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:40:32,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:40:37,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-28 19:40:38,984 INFO [train.py:1039] (2/4) Epoch 4, batch 2300, loss[loss=0.3678, simple_loss=0.3846, pruned_loss=0.1755, over 19608.00 frames. ], tot_loss[loss=0.2654, simple_loss=0.3194, pruned_loss=0.1057, over 4723814.13 frames. ], batch size: 388, lr: 2.42e-02, grad_scale: 32.0 2023-09-28 19:40:39,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:40:39,479 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=121573.33333333333, ans=0.0 2023-09-28 19:40:40,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:40:45,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:40:46,935 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:40:50,028 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-28 19:40:51,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:40:57,814 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:40:57,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-28 19:40:59,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:40:59,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:40:59,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-28 19:41:02,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:41:03,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:41:05,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:41:08,743 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:41:12,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:41:15,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:41:20,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 19:41:21,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:41:26,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:41:27,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:41:30,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:41:32,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 19:41:32,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:41:32,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-28 19:41:37,720 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 19:41:37,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:41:39,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:41:39,264 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:41:40,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:41:40,850 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=121773.33333333333, ans=0.125 2023-09-28 19:41:42,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 19:41:42,089 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-28 19:41:42,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-28 19:41:42,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:41:42,206 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:41:42,543 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=121840.0, ans=0.2 2023-09-28 19:41:44,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-28 19:41:48,534 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=121840.0, ans=0.125 2023-09-28 19:41:49,911 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:41:54,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:41:57,780 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=121840.0, ans=0.125 2023-09-28 19:41:59,073 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:41:59,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:41:59,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-28 19:42:00,446 INFO [train.py:1039] (2/4) Epoch 4, batch 2350, loss[loss=0.2511, simple_loss=0.313, pruned_loss=0.09459, over 23354.00 frames. ], tot_loss[loss=0.2649, simple_loss=0.3194, pruned_loss=0.1052, over 4726236.85 frames. ], batch size: 93, lr: 2.42e-02, grad_scale: 32.0 2023-09-28 19:42:00,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 19:42:00,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:42:02,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 19:42:02,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-28 19:42:10,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:42:10,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-28 19:42:17,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-28 19:42:20,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:42:24,857 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:42:24,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:42:24,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:42:24,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:42:26,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-28 19:42:26,655 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=121973.33333333333, ans=0.125 2023-09-28 19:42:29,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:42:34,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-28 19:42:35,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:42:37,513 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=122040.0, ans=0.2 2023-09-28 19:42:37,598 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=122040.0, ans=0.0 2023-09-28 19:42:38,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 19:42:40,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:42:40,529 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=122040.0, ans=0.0 2023-09-28 19:42:42,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:42:44,193 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-28 19:42:44,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:42:47,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:42:47,813 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:42:47,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:42:52,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:42:54,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-28 19:42:54,735 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=122106.66666666667, ans=0.0 2023-09-28 19:42:56,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:42:58,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:42:58,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:42:59,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-28 19:43:01,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:43:01,679 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=122106.66666666667, ans=0.125 2023-09-28 19:43:03,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-28 19:43:04,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-28 19:43:07,785 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=122173.33333333333, ans=0.125 2023-09-28 19:43:09,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-28 19:43:10,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-28 19:43:12,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:43:12,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-28 19:43:12,503 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-28 19:43:12,533 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-28 19:43:16,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-28 19:43:18,092 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.790e+02 2.367e+02 2.843e+02 3.356e+02 5.882e+02, threshold=5.686e+02, percent-clipped=1.0 2023-09-28 19:43:18,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:43:22,857 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:43:24,089 INFO [train.py:1039] (2/4) Epoch 4, batch 2400, loss[loss=0.2596, simple_loss=0.327, pruned_loss=0.0961, over 24547.00 frames. ], tot_loss[loss=0.266, simple_loss=0.32, pruned_loss=0.106, over 4712754.54 frames. ], batch size: 71, lr: 2.41e-02, grad_scale: 32.0 2023-09-28 19:43:24,526 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=122240.0, ans=0.0 2023-09-28 19:43:28,409 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:43:29,968 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:43:31,378 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-28 19:43:31,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-28 19:43:39,121 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 19:43:39,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:43:40,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-28 19:43:40,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:43:42,357 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:43:42,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-28 19:43:49,301 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:43:52,222 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-28 19:43:57,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-28 19:44:00,698 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=122373.33333333333, ans=0.125 2023-09-28 19:44:02,515 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-28 19:44:04,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:44:07,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:44:08,056 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=122373.33333333333, ans=0.0 2023-09-28 19:44:12,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:44:12,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-28 19:44:12,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 19:44:15,920 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=122440.0, ans=0.2 2023-09-28 19:44:18,744 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:44:22,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:44:24,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:44:25,788 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:44:25,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-28 19:44:27,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:44:27,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:44:27,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:44:29,190 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 19:44:32,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:44:34,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 19:44:34,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-28 19:44:36,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-28 19:44:36,541 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=122506.66666666667, ans=0.0 2023-09-28 19:44:39,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:44:39,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:44:39,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-28 19:44:41,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-28 19:44:41,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-28 19:44:41,804 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-28 19:44:43,391 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-28 19:44:44,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:44:45,052 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:44:45,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:44:46,593 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-28 19:44:46,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:44:48,026 INFO [train.py:1039] (2/4) Epoch 4, batch 2450, loss[loss=0.2515, simple_loss=0.3123, pruned_loss=0.09537, over 24658.00 frames. ], tot_loss[loss=0.2652, simple_loss=0.3192, pruned_loss=0.1056, over 4711872.79 frames. ], batch size: 65, lr: 2.41e-02, grad_scale: 32.0 2023-09-28 19:44:48,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-28 19:44:51,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:44:51,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:44:56,607 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:44:56,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:44:58,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-28 19:45:03,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:45:03,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:45:06,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 19:45:07,500 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=13.55 vs. limit=15.0 2023-09-28 19:45:08,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:45:08,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:45:08,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-28 19:45:10,192 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=122640.0, ans=0.0 2023-09-28 19:45:13,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:45:15,421 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 19:45:15,537 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=122640.0, ans=0.0 2023-09-28 19:45:16,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:45:20,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-28 19:45:20,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:45:23,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:45:23,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:45:24,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-28 19:45:26,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:45:29,848 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=122706.66666666667, ans=0.1 2023-09-28 19:45:34,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:45:35,150 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=122706.66666666667, ans=0.2 2023-09-28 19:45:37,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:45:37,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:45:37,165 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:45:38,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:45:39,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:45:39,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-28 19:45:45,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:45:45,737 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:45:45,984 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=122773.33333333333, ans=0.0 2023-09-28 19:45:48,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:45:48,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:45:49,027 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=122773.33333333333, ans=0.0 2023-09-28 19:45:54,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:45:54,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-28 19:45:55,048 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=122840.0, ans=0.0 2023-09-28 19:45:56,369 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:45:56,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:45:56,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-28 19:45:57,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:45:58,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:45:59,810 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=122840.0, ans=0.125 2023-09-28 19:46:02,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:46:03,794 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.743e+02 2.418e+02 2.743e+02 3.147e+02 4.422e+02, threshold=5.485e+02, percent-clipped=0.0 2023-09-28 19:46:05,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:46:05,506 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:46:08,569 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.04 vs. limit=15.0 2023-09-28 19:46:09,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-28 19:46:09,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:46:11,567 INFO [train.py:1039] (2/4) Epoch 4, batch 2500, loss[loss=0.2618, simple_loss=0.3282, pruned_loss=0.09764, over 24548.00 frames. ], tot_loss[loss=0.2644, simple_loss=0.3182, pruned_loss=0.1053, over 4703998.01 frames. ], batch size: 71, lr: 2.41e-02, grad_scale: 32.0 2023-09-28 19:46:19,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:46:28,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 19:46:28,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:46:29,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:46:29,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-28 19:46:35,082 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=16.77 vs. limit=15.0 2023-09-28 19:46:37,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 19:46:37,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:46:38,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-28 19:46:38,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 19:46:40,328 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-28 19:46:40,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:46:42,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:46:44,067 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-28 19:46:44,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:46:44,186 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-28 19:46:44,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:46:50,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:46:52,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:46:56,705 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 19:46:56,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-28 19:46:56,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:46:59,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:47:03,296 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=123106.66666666667, ans=0.5 2023-09-28 19:47:04,362 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:47:06,319 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=123106.66666666667, ans=0.125 2023-09-28 19:47:07,480 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:47:12,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:47:15,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-28 19:47:17,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-28 19:47:17,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:47:19,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-28 19:47:20,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:47:20,686 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 19:47:22,711 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-28 19:47:22,712 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-28 19:47:22,731 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-28 19:47:23,022 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=123173.33333333333, ans=0.125 2023-09-28 19:47:23,045 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=123173.33333333333, ans=0.125 2023-09-28 19:47:25,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:47:29,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-28 19:47:29,399 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-28 19:47:29,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:47:29,628 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=123173.33333333333, ans=0.1 2023-09-28 19:47:31,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-28 19:47:31,789 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=123173.33333333333, ans=0.125 2023-09-28 19:47:34,347 INFO [train.py:1039] (2/4) Epoch 4, batch 2550, loss[loss=0.2881, simple_loss=0.3303, pruned_loss=0.123, over 23495.00 frames. ], tot_loss[loss=0.2646, simple_loss=0.3187, pruned_loss=0.1052, over 4701309.80 frames. ], batch size: 285, lr: 2.40e-02, grad_scale: 32.0 2023-09-28 19:47:36,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-28 19:47:37,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:47:38,128 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=123240.0, ans=0.125 2023-09-28 19:47:39,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:47:40,838 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:47:42,537 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:47:44,083 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-28 19:47:45,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-28 19:47:48,570 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-28 19:47:49,448 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.13 vs. limit=22.5 2023-09-28 19:47:50,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:47:52,010 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=123306.66666666667, ans=0.0 2023-09-28 19:47:53,925 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=11.02 vs. limit=15.0 2023-09-28 19:47:54,586 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:47:56,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:47:56,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 19:47:56,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 19:47:58,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:47:58,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:48:02,092 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:48:02,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-28 19:48:02,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-28 19:48:02,205 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:48:02,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-28 19:48:13,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:48:19,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:48:19,533 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:48:19,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:48:21,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 19:48:27,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:48:30,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 19:48:30,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:48:31,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:48:31,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-28 19:48:32,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-28 19:48:36,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:48:38,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:48:44,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:48:44,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-28 19:48:44,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:48:44,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:48:46,141 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:48:47,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 19:48:47,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:48:48,119 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=123506.66666666667, ans=0.125 2023-09-28 19:48:48,140 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=123506.66666666667, ans=0.1 2023-09-28 19:48:49,402 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.960e+02 2.344e+02 2.648e+02 3.019e+02 5.195e+02, threshold=5.296e+02, percent-clipped=0.0 2023-09-28 19:48:51,384 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=123506.66666666667, ans=0.2 2023-09-28 19:48:54,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:48:55,607 INFO [train.py:1039] (2/4) Epoch 4, batch 2600, loss[loss=0.2872, simple_loss=0.3254, pruned_loss=0.1245, over 23388.00 frames. ], tot_loss[loss=0.2664, simple_loss=0.3204, pruned_loss=0.1062, over 4706910.95 frames. ], batch size: 285, lr: 2.40e-02, grad_scale: 32.0 2023-09-28 19:48:55,879 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:48:59,002 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-28 19:49:02,766 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-28 19:49:02,802 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:49:04,146 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-28 19:49:04,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-28 19:49:04,298 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-28 19:49:07,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:49:07,421 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-28 19:49:09,539 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-28 19:49:11,064 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-28 19:49:12,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:49:14,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-28 19:49:16,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-28 19:49:17,953 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-28 19:49:18,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-28 19:49:20,942 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-28 19:49:20,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-28 19:49:27,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:49:27,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:49:27,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:49:27,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-28 19:49:28,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:49:37,278 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-28 19:49:41,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:49:41,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:49:43,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-28 19:49:44,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:49:44,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:49:44,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-28 19:49:50,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-28 19:49:50,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:49:53,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:49:56,306 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-28 19:49:56,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:49:56,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 19:50:01,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:50:01,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-28 19:50:01,236 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-28 19:50:04,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:50:05,758 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:50:07,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:50:14,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-28 19:50:16,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:50:18,411 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 19:50:19,989 INFO [train.py:1039] (2/4) Epoch 4, batch 2650, loss[loss=0.2617, simple_loss=0.3087, pruned_loss=0.1073, over 23793.00 frames. ], tot_loss[loss=0.2678, simple_loss=0.3214, pruned_loss=0.1071, over 4696473.06 frames. ], batch size: 164, lr: 2.40e-02, grad_scale: 16.0 2023-09-28 19:50:21,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-28 19:50:21,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:50:21,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 19:50:24,073 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-28 19:50:24,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:50:27,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:50:30,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 19:50:30,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:50:30,373 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=123906.66666666667, ans=0.07 2023-09-28 19:50:33,137 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:50:33,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-28 19:50:33,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 19:50:33,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:50:37,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-28 19:50:40,108 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-28 19:50:43,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:50:48,157 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-28 19:50:48,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:50:48,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-28 19:50:53,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:50:53,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-28 19:50:53,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:50:53,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:51:00,058 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=124040.0, ans=0.2 2023-09-28 19:51:01,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-28 19:51:01,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-28 19:51:02,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-28 19:51:07,924 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-28 19:51:07,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:51:09,616 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:51:09,673 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:51:11,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:51:11,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:51:14,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:51:15,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:51:17,910 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:51:17,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-28 19:51:19,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:51:20,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:51:21,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:51:23,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:51:24,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:51:24,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-28 19:51:27,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:51:29,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:51:29,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:51:31,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-28 19:51:35,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:51:35,993 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:51:37,252 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.892e+02 2.373e+02 3.005e+02 3.591e+02 5.745e+02, threshold=6.010e+02, percent-clipped=4.0 2023-09-28 19:51:39,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:51:40,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:51:42,342 INFO [train.py:1039] (2/4) Epoch 4, batch 2700, loss[loss=0.2927, simple_loss=0.3298, pruned_loss=0.1278, over 22625.00 frames. ], tot_loss[loss=0.2692, simple_loss=0.3223, pruned_loss=0.1081, over 4695028.52 frames. ], batch size: 322, lr: 2.40e-02, grad_scale: 16.0 2023-09-28 19:51:42,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:51:42,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:51:44,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:51:44,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-28 19:51:44,396 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=124240.0, ans=0.2 2023-09-28 19:51:48,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:51:50,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 19:51:51,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:51:51,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:51:53,962 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:51:55,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:51:55,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:51:55,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:51:56,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-28 19:51:57,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-28 19:51:57,132 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:51:58,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-28 19:51:58,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 19:52:00,328 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:52:05,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-28 19:52:05,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-28 19:52:07,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:52:12,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:52:12,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:52:18,699 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:52:18,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:52:18,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:52:18,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:52:22,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:52:25,895 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.64 vs. limit=22.5 2023-09-28 19:52:26,052 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.55 vs. limit=15.0 2023-09-28 19:52:26,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:52:26,787 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:52:26,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:52:31,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:52:31,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:52:40,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:52:40,361 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:52:44,298 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=124440.0, ans=0.2 2023-09-28 19:52:45,541 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:52:45,544 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:52:48,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:52:48,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:52:50,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:52:52,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:52:53,826 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=124506.66666666667, ans=0.0 2023-09-28 19:52:53,881 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=124506.66666666667, ans=0.125 2023-09-28 19:52:55,048 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:52:55,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:52:58,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:53:00,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:53:00,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:53:02,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-28 19:53:03,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:53:05,056 INFO [train.py:1039] (2/4) Epoch 4, batch 2750, loss[loss=0.2189, simple_loss=0.2845, pruned_loss=0.07662, over 24590.00 frames. ], tot_loss[loss=0.269, simple_loss=0.3218, pruned_loss=0.1081, over 4687355.57 frames. ], batch size: 60, lr: 2.39e-02, grad_scale: 16.0 2023-09-28 19:53:06,631 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:53:06,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-28 19:53:08,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-28 19:53:08,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:53:10,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:53:10,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:53:15,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:53:15,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-28 19:53:15,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:53:18,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:53:20,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 19:53:20,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:53:20,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:53:20,397 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-28 19:53:20,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:53:20,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:53:22,535 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.57 vs. limit=15.0 2023-09-28 19:53:27,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-28 19:53:28,875 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=124640.0, ans=0.125 2023-09-28 19:53:30,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:53:30,162 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:53:30,264 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:53:32,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-28 19:53:32,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:53:33,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:53:35,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:53:35,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:53:38,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 19:53:38,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 19:53:39,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:53:40,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:53:41,881 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=124706.66666666667, ans=0.125 2023-09-28 19:53:43,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 19:53:50,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:53:52,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 19:53:52,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:53:56,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:53:56,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-28 19:53:58,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 19:54:04,408 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:54:04,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:54:04,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-28 19:54:07,658 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=124773.33333333333, ans=0.0 2023-09-28 19:54:10,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:54:11,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-28 19:54:17,441 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-28 19:54:19,093 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:54:19,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-28 19:54:20,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:54:22,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:54:24,072 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.875e+02 2.756e+02 3.214e+02 3.920e+02 6.552e+02, threshold=6.428e+02, percent-clipped=3.0 2023-09-28 19:54:24,281 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-28 19:54:25,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:54:28,568 INFO [train.py:1039] (2/4) Epoch 4, batch 2800, loss[loss=0.2618, simple_loss=0.3059, pruned_loss=0.1088, over 23619.00 frames. ], tot_loss[loss=0.2671, simple_loss=0.3194, pruned_loss=0.1074, over 4674048.71 frames. ], batch size: 232, lr: 2.39e-02, grad_scale: 32.0 2023-09-28 19:54:28,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-28 19:54:29,166 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=124906.66666666667, ans=0.125 2023-09-28 19:54:30,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:54:30,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:54:30,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-28 19:54:30,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:54:31,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:54:33,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:54:33,497 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-28 19:54:33,498 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-28 19:54:38,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:54:40,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 19:54:41,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:54:43,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:54:47,167 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-28 19:54:48,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-28 19:54:50,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-28 19:54:50,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:54:50,653 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:54:50,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:54:55,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:54:55,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:54:55,587 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-28 19:54:57,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:55:01,557 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=7.28 vs. limit=12.0 2023-09-28 19:55:07,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:55:08,807 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:55:11,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:55:13,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:55:13,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:55:19,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:55:19,960 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-28 19:55:21,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:55:21,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:55:23,055 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:55:26,264 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:55:26,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:55:32,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:55:36,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:55:36,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:55:36,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 19:55:36,229 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 19:55:37,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 19:55:37,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:55:37,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-28 19:55:39,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:55:40,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:55:40,625 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:55:42,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-28 19:55:42,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:55:42,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:55:43,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:55:44,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-28 19:55:52,386 INFO [train.py:1039] (2/4) Epoch 4, batch 2850, loss[loss=0.2796, simple_loss=0.322, pruned_loss=0.1186, over 23691.00 frames. ], tot_loss[loss=0.2658, simple_loss=0.3187, pruned_loss=0.1064, over 4677850.91 frames. ], batch size: 135, lr: 2.39e-02, grad_scale: 32.0 2023-09-28 19:55:52,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:55:52,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 19:55:54,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:55:57,534 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:55:59,733 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=125240.0, ans=0.2 2023-09-28 19:56:00,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:56:00,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:56:00,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:56:03,943 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:56:04,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:56:05,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:56:05,891 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=125240.0, ans=0.0 2023-09-28 19:56:07,044 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-28 19:56:15,817 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-28 19:56:15,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:56:17,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-28 19:56:17,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:56:21,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-28 19:56:21,154 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-28 19:56:22,724 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:56:35,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:56:37,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:56:37,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:56:37,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 19:56:37,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 19:56:38,769 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:56:40,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 19:56:40,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-28 19:56:43,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-28 19:56:43,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:56:45,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:56:45,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:56:47,891 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.11 vs. limit=22.5 2023-09-28 19:56:48,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:56:48,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:56:50,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:56:52,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:56:53,469 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=125440.0, ans=0.2 2023-09-28 19:56:55,048 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:56:55,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:56:56,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:56:58,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:57:00,687 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=125506.66666666667, ans=0.1 2023-09-28 19:57:03,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:57:03,747 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=125506.66666666667, ans=0.125 2023-09-28 19:57:04,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-28 19:57:05,006 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-28 19:57:06,620 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 19:57:07,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:57:08,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-28 19:57:09,329 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.742e+02 2.413e+02 2.730e+02 3.344e+02 4.987e+02, threshold=5.460e+02, percent-clipped=0.0 2023-09-28 19:57:09,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:57:10,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:57:10,927 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:57:10,959 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:57:10,959 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-28 19:57:12,314 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-28 19:57:12,319 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:57:13,703 INFO [train.py:1039] (2/4) Epoch 4, batch 2900, loss[loss=0.2908, simple_loss=0.334, pruned_loss=0.1238, over 22774.00 frames. ], tot_loss[loss=0.2648, simple_loss=0.3181, pruned_loss=0.1058, over 4681789.36 frames. ], batch size: 322, lr: 2.38e-02, grad_scale: 32.0 2023-09-28 19:57:13,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:57:18,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-28 19:57:18,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:57:20,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:57:20,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-28 19:57:20,541 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=125573.33333333333, ans=0.125 2023-09-28 19:57:25,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:57:25,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-28 19:57:26,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-28 19:57:28,493 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=125640.0, ans=0.0 2023-09-28 19:57:30,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:57:30,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:57:30,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:57:32,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:57:36,174 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:57:37,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:57:39,959 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.66 vs. limit=22.5 2023-09-28 19:57:40,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-28 19:57:40,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-28 19:57:42,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-28 19:57:43,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:57:45,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-28 19:57:47,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-28 19:57:50,291 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:57:50,295 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-28 19:57:50,323 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:57:53,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:57:53,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-28 19:57:54,228 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=11.03 vs. limit=10.0 2023-09-28 19:57:56,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:57:56,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:58:01,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:58:03,696 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:58:07,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-28 19:58:07,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-28 19:58:07,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:58:08,135 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:58:10,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 19:58:11,317 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=125773.33333333333, ans=0.125 2023-09-28 19:58:15,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-28 19:58:15,295 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:58:19,983 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:58:23,376 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=125840.0, ans=0.0 2023-09-28 19:58:27,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:58:27,743 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:58:29,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-28 19:58:32,098 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.54 vs. limit=22.5 2023-09-28 19:58:32,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:58:34,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-28 19:58:34,419 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:58:34,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:58:35,914 INFO [train.py:1039] (2/4) Epoch 4, batch 2950, loss[loss=0.2317, simple_loss=0.2903, pruned_loss=0.08651, over 24442.00 frames. ], tot_loss[loss=0.2657, simple_loss=0.3196, pruned_loss=0.1059, over 4689772.69 frames. ], batch size: 58, lr: 2.38e-02, grad_scale: 32.0 2023-09-28 19:58:36,337 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=125906.66666666667, ans=0.125 2023-09-28 19:58:37,880 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=125906.66666666667, ans=0.125 2023-09-28 19:58:41,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:58:44,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-28 19:58:44,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:58:44,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:58:46,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:58:48,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:58:48,730 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-28 19:58:48,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-28 19:58:50,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 19:58:50,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:58:57,789 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:58:59,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:59:01,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:59:01,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:59:04,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:59:04,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:59:06,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:59:07,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:59:07,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:59:09,602 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=126040.0, ans=0.125 2023-09-28 19:59:12,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-28 19:59:16,616 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-28 19:59:16,646 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-28 19:59:16,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:59:18,223 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-28 19:59:20,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-28 19:59:21,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:59:21,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:59:21,113 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-28 19:59:21,120 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-28 19:59:22,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-28 19:59:24,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:59:25,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:59:27,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:59:28,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:59:28,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:59:30,270 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-28 19:59:31,786 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:59:31,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-28 19:59:36,031 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:59:37,247 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:59:38,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:59:40,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-28 19:59:40,266 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:59:41,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-28 19:59:45,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:59:46,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:59:46,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:59:46,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=126173.33333333333, ans=0.0 2023-09-28 19:59:50,575 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:59:50,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 19:59:52,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:59:53,509 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.776e+02 2.372e+02 2.758e+02 3.353e+02 4.666e+02, threshold=5.516e+02, percent-clipped=0.0 2023-09-28 19:59:53,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:59:53,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:59:53,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-28 19:59:53,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:59:55,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:59:55,941 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.81 vs. limit=15.0 2023-09-28 19:59:56,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:59:56,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-28 19:59:58,286 INFO [train.py:1039] (2/4) Epoch 4, batch 3000, loss[loss=0.257, simple_loss=0.3238, pruned_loss=0.09516, over 24349.00 frames. ], tot_loss[loss=0.266, simple_loss=0.3203, pruned_loss=0.1058, over 4699549.88 frames. ], batch size: 74, lr: 2.38e-02, grad_scale: 32.0 2023-09-28 19:59:58,286 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-28 20:00:13,215 INFO [train.py:1071] (2/4) Epoch 4, validation: loss=0.3352, simple_loss=0.3262, pruned_loss=0.1721, over 1125622.00 frames. 2023-09-28 20:00:13,216 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21046MB 2023-09-28 20:00:13,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:00:15,036 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:00:16,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:00:19,664 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-28 20:00:19,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-28 20:00:19,996 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=126240.0, ans=0.125 2023-09-28 20:00:23,875 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-28 20:00:23,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:00:25,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-28 20:00:25,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:00:30,838 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.53 vs. limit=22.5 2023-09-28 20:00:31,556 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 20:00:40,792 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:00:46,801 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.60 vs. limit=6.0 2023-09-28 20:00:47,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-28 20:00:49,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:00:52,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:00:52,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:00:52,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:00:56,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:00:56,147 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-28 20:00:59,731 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-28 20:01:01,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:01:02,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 20:01:05,761 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:01:05,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:01:05,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:01:05,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:01:11,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 20:01:11,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:01:11,928 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:01:13,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:01:15,210 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-28 20:01:16,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:01:16,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:01:16,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:01:22,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:01:22,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:01:23,515 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-28 20:01:23,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-28 20:01:25,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:01:25,119 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-28 20:01:26,601 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:01:28,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-28 20:01:28,496 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=126506.66666666667, ans=0.125 2023-09-28 20:01:32,002 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-28 20:01:32,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 20:01:32,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-28 20:01:34,281 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-28 20:01:34,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 20:01:35,616 INFO [train.py:1039] (2/4) Epoch 4, batch 3050, loss[loss=0.2821, simple_loss=0.3228, pruned_loss=0.1207, over 22805.00 frames. ], tot_loss[loss=0.2661, simple_loss=0.3204, pruned_loss=0.106, over 4700932.66 frames. ], batch size: 322, lr: 2.38e-02, grad_scale: 32.0 2023-09-28 20:01:35,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:01:37,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:01:37,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-28 20:01:37,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:01:37,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:01:37,642 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=126573.33333333333, ans=0.2 2023-09-28 20:01:40,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-28 20:01:43,208 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:01:44,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:01:44,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 20:01:46,789 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=126573.33333333333, ans=0.0 2023-09-28 20:01:48,055 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:01:51,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-28 20:01:56,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-28 20:01:56,358 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-28 20:01:56,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:02:00,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:02:05,699 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:02:05,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:02:07,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:02:08,010 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=126706.66666666667, ans=0.125 2023-09-28 20:02:10,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:02:10,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-28 20:02:10,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:02:12,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:02:12,207 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:02:12,332 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:02:15,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:02:19,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:02:19,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-28 20:02:19,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:02:19,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:02:20,868 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.44 vs. limit=15.0 2023-09-28 20:02:21,543 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=126706.66666666667, ans=0.125 2023-09-28 20:02:22,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:02:24,314 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:02:24,401 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:02:25,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:02:32,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:02:32,655 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:02:41,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:02:41,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:02:41,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:02:41,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:02:43,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 20:02:43,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:02:44,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-28 20:02:45,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:02:45,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:02:47,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-28 20:02:48,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:02:52,269 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:02:53,685 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.758e+02 2.370e+02 2.708e+02 3.419e+02 5.330e+02, threshold=5.417e+02, percent-clipped=0.0 2023-09-28 20:02:53,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:02:56,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 20:02:58,153 INFO [train.py:1039] (2/4) Epoch 4, batch 3100, loss[loss=0.261, simple_loss=0.3019, pruned_loss=0.11, over 23710.00 frames. ], tot_loss[loss=0.2666, simple_loss=0.3205, pruned_loss=0.1064, over 4696729.22 frames. ], batch size: 232, lr: 2.37e-02, grad_scale: 32.0 2023-09-28 20:02:58,633 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 20:02:59,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-28 20:02:59,940 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=126906.66666666667, ans=0.125 2023-09-28 20:03:01,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-28 20:03:01,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-28 20:03:04,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:03:07,159 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:03:07,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:03:09,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-28 20:03:11,410 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=126906.66666666667, ans=0.1 2023-09-28 20:03:14,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:03:16,337 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=126973.33333333333, ans=0.0 2023-09-28 20:03:17,870 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=126973.33333333333, ans=0.0 2023-09-28 20:03:20,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-28 20:03:27,448 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=16.82 vs. limit=15.0 2023-09-28 20:03:28,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 20:03:28,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:03:29,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:03:29,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:03:30,167 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=127040.0, ans=0.125 2023-09-28 20:03:31,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-28 20:03:32,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:03:33,015 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-28 20:03:33,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:03:34,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:03:36,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-28 20:03:36,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:03:38,658 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=127040.0, ans=0.125 2023-09-28 20:03:41,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-28 20:03:41,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-28 20:03:43,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-28 20:03:45,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:03:45,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:03:48,425 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:03:48,455 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:03:48,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:03:52,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:03:52,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:03:55,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 20:03:55,111 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:03:55,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:03:55,122 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 20:03:59,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:04:01,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-28 20:04:02,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-28 20:04:02,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-28 20:04:04,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:04:04,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:04:04,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-28 20:04:17,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-28 20:04:21,075 INFO [train.py:1039] (2/4) Epoch 4, batch 3150, loss[loss=0.2538, simple_loss=0.3267, pruned_loss=0.09047, over 24295.00 frames. ], tot_loss[loss=0.2648, simple_loss=0.3191, pruned_loss=0.1053, over 4697701.08 frames. ], batch size: 74, lr: 2.37e-02, grad_scale: 32.0 2023-09-28 20:04:21,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:04:21,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:04:22,810 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:04:22,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:04:24,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-28 20:04:24,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:04:24,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-28 20:04:26,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-28 20:04:29,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:04:32,631 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-28 20:04:35,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-28 20:04:35,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:04:37,539 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-28 20:04:37,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-28 20:04:39,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-28 20:04:39,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-28 20:04:39,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-28 20:04:39,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:04:41,282 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:04:43,026 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:04:44,679 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-28 20:04:47,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:04:47,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:04:48,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:04:50,601 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-28 20:04:53,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-28 20:04:54,918 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:04:56,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-28 20:04:58,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:04:58,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-28 20:05:01,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-28 20:05:03,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 20:05:03,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 20:05:03,197 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 20:05:04,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:05:04,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 20:05:04,968 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 20:05:04,986 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=127373.33333333333, ans=0.125 2023-09-28 20:05:06,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-28 20:05:06,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-28 20:05:07,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-28 20:05:07,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:05:07,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:05:09,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:05:09,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:05:11,099 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-28 20:05:11,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:05:12,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-28 20:05:12,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:05:12,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-28 20:05:14,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-28 20:05:16,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=127440.0, ans=0.0 2023-09-28 20:05:17,860 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:05:17,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:05:19,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-28 20:05:21,162 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 20:05:21,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:05:25,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:05:26,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:05:27,271 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:05:33,458 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:05:34,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:05:38,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-28 20:05:39,688 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.688e+02 2.347e+02 2.789e+02 3.421e+02 6.245e+02, threshold=5.579e+02, percent-clipped=5.0 2023-09-28 20:05:43,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:05:43,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-28 20:05:44,422 INFO [train.py:1039] (2/4) Epoch 4, batch 3200, loss[loss=0.2677, simple_loss=0.3312, pruned_loss=0.1021, over 24063.00 frames. ], tot_loss[loss=0.2629, simple_loss=0.3174, pruned_loss=0.1041, over 4710533.91 frames. ], batch size: 80, lr: 2.37e-02, grad_scale: 32.0 2023-09-28 20:05:48,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:05:49,648 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:05:49,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-28 20:05:52,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:05:57,377 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:06:01,272 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=127640.0, ans=0.125 2023-09-28 20:06:02,487 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:06:12,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:06:19,462 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=127706.66666666667, ans=0.125 2023-09-28 20:06:23,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-28 20:06:23,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:06:27,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-28 20:06:28,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 20:06:32,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:06:32,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 20:06:35,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:06:37,841 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.65 vs. limit=6.0 2023-09-28 20:06:38,801 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-28 20:06:41,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-28 20:06:42,722 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=127773.33333333333, ans=0.125 2023-09-28 20:06:43,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-28 20:06:45,656 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-28 20:06:48,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:06:49,043 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=127840.0, ans=0.1 2023-09-28 20:06:54,031 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:06:54,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 20:06:54,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:06:55,556 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-28 20:06:55,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 20:06:59,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:06:59,620 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-28 20:07:01,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-28 20:07:01,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-28 20:07:02,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-28 20:07:05,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:07:07,645 INFO [train.py:1039] (2/4) Epoch 4, batch 3250, loss[loss=0.3171, simple_loss=0.3338, pruned_loss=0.1502, over 19371.00 frames. ], tot_loss[loss=0.2626, simple_loss=0.3175, pruned_loss=0.1039, over 4721673.99 frames. ], batch size: 389, lr: 2.37e-02, grad_scale: 32.0 2023-09-28 20:07:09,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-28 20:07:09,366 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-28 20:07:09,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:07:09,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:07:12,392 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-28 20:07:16,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:07:17,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:07:22,534 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=127973.33333333333, ans=0.125 2023-09-28 20:07:28,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:07:28,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-28 20:07:30,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:07:30,791 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:07:32,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:07:32,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 20:07:32,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:07:34,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:07:34,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-28 20:07:34,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:07:36,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:07:36,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:07:36,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:07:37,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:07:39,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 20:07:41,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:07:41,573 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:07:43,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:07:44,471 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:07:44,499 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:07:50,533 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=6.02 vs. limit=15.0 2023-09-28 20:07:51,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-28 20:07:51,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:07:52,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:07:52,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:07:55,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:07:59,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:08:06,015 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:08:06,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:08:06,078 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-28 20:08:06,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:08:07,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 20:08:07,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:08:10,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-28 20:08:11,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-28 20:08:11,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:08:11,366 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=128106.66666666667, ans=0.0 2023-09-28 20:08:13,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:08:13,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:08:14,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-28 20:08:16,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:08:19,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:08:19,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:08:21,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-28 20:08:21,339 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:08:23,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 20:08:23,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-28 20:08:25,978 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.753e+02 2.244e+02 2.605e+02 3.006e+02 4.571e+02, threshold=5.210e+02, percent-clipped=0.0 2023-09-28 20:08:26,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:08:26,212 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-28 20:08:27,850 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-28 20:08:29,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-28 20:08:29,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:08:30,616 INFO [train.py:1039] (2/4) Epoch 4, batch 3300, loss[loss=0.2736, simple_loss=0.3123, pruned_loss=0.1175, over 23809.00 frames. ], tot_loss[loss=0.2632, simple_loss=0.3181, pruned_loss=0.1041, over 4715910.27 frames. ], batch size: 164, lr: 2.36e-02, grad_scale: 32.0 2023-09-28 20:08:35,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:08:35,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:08:37,049 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:08:39,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 20:08:39,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 20:08:42,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:08:42,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=128240.0, ans=0.125 2023-09-28 20:08:43,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:08:49,058 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-28 20:08:49,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:08:49,192 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:08:50,984 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 20:08:52,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:08:52,744 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-28 20:08:55,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:08:55,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 20:08:57,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:08:57,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:08:57,269 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-28 20:09:00,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:09:00,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:09:02,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:09:02,048 WARNING [train.py:1197] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-28 20:09:04,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-28 20:09:04,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:09:06,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:09:06,824 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=128373.33333333333, ans=0.125 2023-09-28 20:09:08,030 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-28 20:09:10,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-28 20:09:11,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:09:14,625 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.65 vs. limit=12.0 2023-09-28 20:09:15,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-28 20:09:18,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:09:18,513 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=128440.0, ans=0.125 2023-09-28 20:09:19,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-28 20:09:21,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:09:24,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:09:25,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:09:25,743 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:09:25,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-28 20:09:28,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:09:28,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:09:30,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:09:31,880 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-28 20:09:34,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-28 20:09:37,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-28 20:09:37,822 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=128506.66666666667, ans=0.125 2023-09-28 20:09:39,108 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:09:39,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:09:40,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:09:40,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:09:43,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 20:09:43,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:09:43,749 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-28 20:09:45,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:09:46,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 20:09:47,707 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=11.20 vs. limit=15.0 2023-09-28 20:09:50,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-28 20:09:50,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:09:50,595 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:09:51,226 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.35 vs. limit=15.0 2023-09-28 20:09:51,893 INFO [train.py:1039] (2/4) Epoch 4, batch 3350, loss[loss=0.2753, simple_loss=0.3201, pruned_loss=0.1152, over 23900.00 frames. ], tot_loss[loss=0.2648, simple_loss=0.3193, pruned_loss=0.1052, over 4717681.59 frames. ], batch size: 196, lr: 2.36e-02, grad_scale: 32.0 2023-09-28 20:09:53,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:09:53,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:09:55,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:09:56,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:09:56,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:10:00,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:10:00,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:10:02,409 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=128573.33333333333, ans=0.2 2023-09-28 20:10:04,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:10:06,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:10:09,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:10:09,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:10:10,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:10:10,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-28 20:10:13,820 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-28 20:10:13,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:10:15,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-28 20:10:15,552 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-28 20:10:15,843 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=128640.0, ans=0.0 2023-09-28 20:10:16,976 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:10:18,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:10:19,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:10:20,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-28 20:10:21,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:10:21,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:10:23,668 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:10:25,547 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-28 20:10:26,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:10:26,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:10:28,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:10:32,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:10:33,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:10:33,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:10:37,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:10:37,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:10:40,399 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:10:40,414 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:10:43,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:10:44,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-28 20:10:45,009 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 20:10:45,047 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-28 20:10:46,395 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:10:46,573 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-28 20:10:48,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:10:49,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:10:57,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:10:58,860 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-28 20:10:58,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 20:10:59,241 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 20:11:01,011 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:11:03,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:11:08,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:11:09,997 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.829e+02 2.437e+02 2.848e+02 3.538e+02 5.302e+02, threshold=5.697e+02, percent-clipped=3.0 2023-09-28 20:11:11,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-28 20:11:11,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 20:11:11,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:11:14,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:11:14,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-28 20:11:14,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:11:14,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-28 20:11:15,269 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=128906.66666666667, ans=0.0 2023-09-28 20:11:16,247 INFO [train.py:1039] (2/4) Epoch 4, batch 3400, loss[loss=0.2918, simple_loss=0.333, pruned_loss=0.1253, over 23328.00 frames. ], tot_loss[loss=0.2659, simple_loss=0.3203, pruned_loss=0.1058, over 4709004.99 frames. ], batch size: 105, lr: 2.36e-02, grad_scale: 32.0 2023-09-28 20:11:16,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:11:16,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:11:18,086 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-28 20:11:18,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:11:18,283 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-28 20:11:24,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-28 20:11:24,288 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-28 20:11:24,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:11:30,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:11:30,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 20:11:31,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:11:33,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-28 20:11:34,375 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=128973.33333333333, ans=0.2 2023-09-28 20:11:37,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:11:40,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-28 20:11:45,858 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:11:46,204 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=128973.33333333333, ans=0.1 2023-09-28 20:11:47,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:11:47,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:11:48,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-28 20:11:56,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:12:01,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-28 20:12:06,283 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:12:07,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:12:07,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-28 20:12:07,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:12:09,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:12:11,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:12:11,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:12:13,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:12:16,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 20:12:16,735 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:12:23,973 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:12:25,582 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-28 20:12:34,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 20:12:37,269 INFO [train.py:1039] (2/4) Epoch 4, batch 3450, loss[loss=0.2524, simple_loss=0.3238, pruned_loss=0.0905, over 24549.00 frames. ], tot_loss[loss=0.2661, simple_loss=0.3201, pruned_loss=0.1061, over 4701135.74 frames. ], batch size: 71, lr: 2.36e-02, grad_scale: 32.0 2023-09-28 20:12:39,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-28 20:12:42,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-28 20:12:42,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:12:43,910 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:12:43,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-28 20:12:45,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:12:51,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-28 20:12:55,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:12:57,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:12:59,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:12:59,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:13:01,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:13:08,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-28 20:13:12,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-28 20:13:14,228 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 20:13:14,316 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:13:15,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:13:22,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-28 20:13:22,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 20:13:25,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:13:26,536 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.66 vs. limit=10.0 2023-09-28 20:13:27,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:13:30,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-28 20:13:30,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:13:32,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-28 20:13:32,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:13:33,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:13:34,160 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=129440.0, ans=0.125 2023-09-28 20:13:37,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:13:40,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-28 20:13:43,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:13:45,616 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=129506.66666666667, ans=0.0 2023-09-28 20:13:47,694 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.27 vs. limit=15.0 2023-09-28 20:13:48,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:13:50,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:13:50,434 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=129506.66666666667, ans=0.125 2023-09-28 20:13:51,710 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:13:55,852 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.743e+02 2.310e+02 2.657e+02 3.151e+02 5.022e+02, threshold=5.313e+02, percent-clipped=0.0 2023-09-28 20:13:57,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:13:57,615 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:13:57,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:13:58,453 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten.whitening_limit, batch_count=129506.66666666667, ans=15.0 2023-09-28 20:13:59,121 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:14:01,160 INFO [train.py:1039] (2/4) Epoch 4, batch 3500, loss[loss=0.2609, simple_loss=0.2958, pruned_loss=0.113, over 22701.00 frames. ], tot_loss[loss=0.2645, simple_loss=0.318, pruned_loss=0.1055, over 4691889.42 frames. ], batch size: 322, lr: 2.35e-02, grad_scale: 32.0 2023-09-28 20:14:04,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:14:05,968 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:14:08,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-28 20:14:10,400 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.57 vs. limit=15.0 2023-09-28 20:14:10,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 20:14:12,636 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-28 20:14:15,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:14:15,733 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-28 20:14:22,040 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:14:23,560 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:14:23,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:14:23,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:14:25,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-28 20:14:25,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:14:25,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:14:25,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-28 20:14:29,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:14:31,042 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-28 20:14:32,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:14:35,581 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.92 vs. limit=22.5 2023-09-28 20:14:36,479 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=129706.66666666667, ans=0.2 2023-09-28 20:14:37,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:14:37,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-28 20:14:37,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:14:40,939 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:14:43,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:14:43,441 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:14:45,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:14:45,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:14:48,153 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-28 20:14:48,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-28 20:14:49,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-28 20:14:51,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:14:51,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:14:52,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:14:52,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 20:14:53,617 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.55 vs. limit=6.0 2023-09-28 20:14:55,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 20:14:57,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:15:04,640 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:15:06,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-28 20:15:06,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-28 20:15:06,144 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:15:07,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:15:09,848 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:15:11,320 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:15:14,300 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-28 20:15:14,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:15:14,731 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 20:15:15,965 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:15:18,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-28 20:15:19,682 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-28 20:15:21,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:15:22,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:15:22,962 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:15:23,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:15:23,255 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=129906.66666666667, ans=0.125 2023-09-28 20:15:24,327 INFO [train.py:1039] (2/4) Epoch 4, batch 3550, loss[loss=0.283, simple_loss=0.3134, pruned_loss=0.1263, over 23501.00 frames. ], tot_loss[loss=0.2631, simple_loss=0.3163, pruned_loss=0.1049, over 4688076.69 frames. ], batch size: 285, lr: 2.35e-02, grad_scale: 32.0 2023-09-28 20:15:27,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:15:34,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:15:37,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 20:15:41,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:15:43,663 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:15:45,483 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 20:15:46,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:15:46,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:15:46,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 20:15:51,444 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:15:51,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:15:51,829 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=129973.33333333333, ans=0.0 2023-09-28 20:15:52,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:15:52,939 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-28 20:15:53,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 20:15:59,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:15:59,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:16:00,221 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=130040.0, ans=0.1 2023-09-28 20:16:01,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:16:01,416 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:16:02,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:16:02,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-28 20:16:02,922 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:16:03,259 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.max_abs, batch_count=130040.0, ans=10.0 2023-09-28 20:16:04,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:16:05,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 20:16:09,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:16:11,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:16:13,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:16:16,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-28 20:16:18,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:16:18,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-28 20:16:19,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:16:21,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:16:21,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:16:24,523 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-28 20:16:24,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:16:27,299 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=130106.66666666667, ans=0.125 2023-09-28 20:16:31,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:16:32,886 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-28 20:16:32,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:16:36,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:16:37,945 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=130173.33333333333, ans=0.2 2023-09-28 20:16:39,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-28 20:16:42,187 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.807e+02 2.286e+02 2.757e+02 3.216e+02 5.394e+02, threshold=5.514e+02, percent-clipped=1.0 2023-09-28 20:16:42,813 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=130173.33333333333, ans=0.5 2023-09-28 20:16:44,511 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-28 20:16:44,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:16:46,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:16:46,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:16:48,264 INFO [train.py:1039] (2/4) Epoch 4, batch 3600, loss[loss=0.2501, simple_loss=0.3035, pruned_loss=0.09833, over 22108.00 frames. ], tot_loss[loss=0.262, simple_loss=0.316, pruned_loss=0.104, over 4699187.88 frames. ], batch size: 48, lr: 2.35e-02, grad_scale: 32.0 2023-09-28 20:16:49,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:16:49,364 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 20:16:49,757 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=18.33 vs. limit=15.0 2023-09-28 20:16:50,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:16:50,832 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=130240.0, ans=0.0 2023-09-28 20:16:54,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:16:56,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:16:57,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:16:58,093 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=130240.0, ans=10.0 2023-09-28 20:16:59,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:17:01,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:17:01,566 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-28 20:17:04,859 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 20:17:06,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:17:10,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:17:13,979 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:17:15,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:17:15,708 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:17:15,746 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-28 20:17:17,263 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:17:20,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:17:20,377 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:17:22,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:17:24,751 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:17:26,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:17:28,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-28 20:17:33,442 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=130373.33333333333, ans=0.0 2023-09-28 20:17:35,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:17:35,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 20:17:36,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-28 20:17:40,270 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=130440.0, ans=0.125 2023-09-28 20:17:41,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:17:45,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:17:47,716 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:17:52,762 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=130506.66666666667, ans=0.0 2023-09-28 20:17:54,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-28 20:17:54,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:17:54,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-28 20:17:57,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-28 20:18:00,657 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-28 20:18:02,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:18:02,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:18:03,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-28 20:18:05,426 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:18:05,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:18:05,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:18:07,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-28 20:18:07,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-28 20:18:10,876 INFO [train.py:1039] (2/4) Epoch 4, batch 3650, loss[loss=0.3483, simple_loss=0.3642, pruned_loss=0.1662, over 19385.00 frames. ], tot_loss[loss=0.2627, simple_loss=0.3169, pruned_loss=0.1043, over 4706345.10 frames. ], batch size: 388, lr: 2.34e-02, grad_scale: 32.0 2023-09-28 20:18:11,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:18:11,138 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-28 20:18:17,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-28 20:18:18,894 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:18:23,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-28 20:18:24,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-28 20:18:28,417 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=130640.0, ans=0.0 2023-09-28 20:18:29,702 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:18:29,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-28 20:18:29,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:18:35,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-28 20:18:35,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:18:36,023 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=130640.0, ans=0.125 2023-09-28 20:18:37,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-28 20:18:38,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:18:39,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:18:39,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-28 20:18:39,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 20:18:41,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:18:41,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:18:43,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:18:46,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-28 20:18:47,825 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-28 20:18:49,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:18:52,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-28 20:18:53,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:18:53,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:18:58,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 20:19:00,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:19:00,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-28 20:19:02,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:19:03,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:19:04,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:19:06,931 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:19:09,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:19:09,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:19:11,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 20:19:12,271 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=130773.33333333333, ans=0.125 2023-09-28 20:19:14,054 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:19:14,148 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:19:21,078 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-28 20:19:22,833 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:19:24,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:19:25,706 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-28 20:19:25,792 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:19:27,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-28 20:19:28,544 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.789e+02 2.316e+02 2.706e+02 3.127e+02 4.745e+02, threshold=5.412e+02, percent-clipped=0.0 2023-09-28 20:19:28,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:19:30,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-28 20:19:30,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:19:33,346 INFO [train.py:1039] (2/4) Epoch 4, batch 3700, loss[loss=0.2771, simple_loss=0.3175, pruned_loss=0.1183, over 23797.00 frames. ], tot_loss[loss=0.2623, simple_loss=0.3171, pruned_loss=0.1037, over 4713892.54 frames. ], batch size: 179, lr: 2.34e-02, grad_scale: 32.0 2023-09-28 20:19:34,960 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 20:19:37,990 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:19:39,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:19:42,601 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:19:42,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-28 20:19:42,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:19:42,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 20:19:44,771 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 20:19:47,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 20:19:51,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:19:51,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:19:53,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:19:53,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:19:53,434 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 20:19:55,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:19:56,693 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-28 20:20:05,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:20:07,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 20:20:07,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:20:07,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-28 20:20:08,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:20:10,661 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 20:20:11,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:20:13,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-28 20:20:15,024 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:20:16,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:20:16,856 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=131040.0, ans=0.125 2023-09-28 20:20:20,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:20:20,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 20:20:23,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 20:20:26,619 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:20:26,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-28 20:20:28,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:20:28,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-28 20:20:31,515 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=131106.66666666666, ans=0.125 2023-09-28 20:20:32,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:20:32,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:20:37,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:20:37,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-28 20:20:39,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:20:39,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-28 20:20:40,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:20:40,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:20:40,999 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=131173.33333333334, ans=0.125 2023-09-28 20:20:42,546 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 20:20:44,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:20:45,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-28 20:20:45,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-28 20:20:47,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:20:47,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:20:48,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:20:50,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:20:53,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:20:55,689 INFO [train.py:1039] (2/4) Epoch 4, batch 3750, loss[loss=0.2933, simple_loss=0.3463, pruned_loss=0.1201, over 23484.00 frames. ], tot_loss[loss=0.2651, simple_loss=0.319, pruned_loss=0.1056, over 4709080.47 frames. ], batch size: 94, lr: 2.34e-02, grad_scale: 32.0 2023-09-28 20:20:55,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:20:57,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:20:58,649 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.77 vs. limit=6.0 2023-09-28 20:20:59,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-28 20:21:00,331 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=131240.0, ans=0.1 2023-09-28 20:21:00,759 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.45 vs. limit=22.5 2023-09-28 20:21:01,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 20:21:04,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-28 20:21:04,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-28 20:21:06,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:21:07,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:21:09,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:21:09,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:21:11,304 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=131306.66666666666, ans=0.1 2023-09-28 20:21:12,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:21:17,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:21:18,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 20:21:20,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:21:22,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:21:23,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-28 20:21:23,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:21:24,770 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.17 vs. limit=15.0 2023-09-28 20:21:25,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:21:25,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:21:29,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-28 20:21:34,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-28 20:21:36,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:21:36,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:21:39,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:21:44,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:21:45,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-28 20:21:50,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-28 20:21:54,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:21:57,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:21:59,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:22:03,461 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:22:06,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 20:22:06,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-28 20:22:10,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 20:22:11,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:22:13,148 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.774e+02 2.509e+02 2.927e+02 3.521e+02 5.743e+02, threshold=5.855e+02, percent-clipped=1.0 2023-09-28 20:22:13,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-28 20:22:16,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=131573.33333333334, ans=0.125 2023-09-28 20:22:17,744 INFO [train.py:1039] (2/4) Epoch 4, batch 3800, loss[loss=0.271, simple_loss=0.3326, pruned_loss=0.1047, over 23389.00 frames. ], tot_loss[loss=0.2635, simple_loss=0.3186, pruned_loss=0.1042, over 4717613.13 frames. ], batch size: 93, lr: 2.34e-02, grad_scale: 32.0 2023-09-28 20:22:23,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:22:26,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:22:27,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 20:22:28,565 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-28 20:22:30,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:22:31,751 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:22:33,795 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-28 20:22:35,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 20:22:35,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:22:38,143 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:22:39,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:22:39,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:22:39,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:22:42,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-28 20:22:45,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-28 20:22:45,245 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:22:48,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:22:51,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:22:52,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 20:22:54,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-28 20:22:54,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:22:54,694 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=131706.66666666666, ans=0.125 2023-09-28 20:22:57,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:22:58,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:23:02,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 20:23:02,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-28 20:23:05,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:23:08,202 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.18 vs. limit=12.0 2023-09-28 20:23:12,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:23:14,388 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=131773.33333333334, ans=0.1 2023-09-28 20:23:17,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:23:19,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-28 20:23:22,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-28 20:23:24,099 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:23:24,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:23:25,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:23:27,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-28 20:23:30,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-28 20:23:30,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-28 20:23:31,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:23:33,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:23:37,318 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.01 vs. limit=12.0 2023-09-28 20:23:39,629 INFO [train.py:1039] (2/4) Epoch 4, batch 3850, loss[loss=0.2271, simple_loss=0.2906, pruned_loss=0.08186, over 14993.00 frames. ], tot_loss[loss=0.2618, simple_loss=0.3167, pruned_loss=0.1035, over 4703502.78 frames. ], batch size: 31, lr: 2.33e-02, grad_scale: 32.0 2023-09-28 20:23:39,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:23:39,873 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 20:23:43,484 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=13.87 vs. limit=15.0 2023-09-28 20:23:45,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:23:45,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-28 20:23:47,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:23:47,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:23:51,936 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.67 vs. limit=15.0 2023-09-28 20:23:52,488 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 20:23:55,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:23:58,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-28 20:24:00,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-28 20:24:05,167 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:24:06,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:24:08,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:24:08,646 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=131973.33333333334, ans=0.0 2023-09-28 20:24:09,157 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=5.06 vs. limit=12.0 2023-09-28 20:24:09,187 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.14 vs. limit=6.0 2023-09-28 20:24:09,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:24:12,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:24:13,793 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.63 vs. limit=15.0 2023-09-28 20:24:14,433 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:24:14,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:24:14,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:24:15,678 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.07 vs. limit=10.0 2023-09-28 20:24:16,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:24:18,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:24:19,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:24:19,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:24:22,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-28 20:24:22,193 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-28 20:24:22,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:24:22,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:24:22,591 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=132040.0, ans=0.1 2023-09-28 20:24:25,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:24:25,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:24:25,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-28 20:24:28,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-28 20:24:31,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:24:33,082 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-28 20:24:33,292 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=132106.66666666666, ans=0.125 2023-09-28 20:24:34,771 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=132106.66666666666, ans=0.1 2023-09-28 20:24:36,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-28 20:24:41,355 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.47 vs. limit=22.5 2023-09-28 20:24:42,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:24:43,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:24:48,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:24:48,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-28 20:24:52,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-28 20:24:54,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:24:56,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:24:56,408 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=132173.33333333334, ans=0.125 2023-09-28 20:24:57,997 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.696e+02 2.471e+02 2.833e+02 3.573e+02 5.682e+02, threshold=5.667e+02, percent-clipped=0.0 2023-09-28 20:24:59,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 20:24:59,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 20:25:01,143 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:25:01,254 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:25:01,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:25:01,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-28 20:25:02,726 INFO [train.py:1039] (2/4) Epoch 4, batch 3900, loss[loss=0.2425, simple_loss=0.3125, pruned_loss=0.08625, over 23986.00 frames. ], tot_loss[loss=0.2605, simple_loss=0.3154, pruned_loss=0.1028, over 4708042.01 frames. ], batch size: 80, lr: 2.33e-02, grad_scale: 32.0 2023-09-28 20:25:02,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:25:04,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-28 20:25:04,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:25:04,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:25:06,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:25:07,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:25:09,159 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:25:09,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:25:09,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:25:09,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:25:09,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-28 20:25:09,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:25:13,983 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:25:14,219 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=132240.0, ans=0.1 2023-09-28 20:25:15,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 20:25:15,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:25:16,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:25:19,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 20:25:20,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:25:23,480 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:25:25,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-28 20:25:25,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:25:27,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-28 20:25:28,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:25:30,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-28 20:25:30,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-28 20:25:32,766 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=132306.66666666666, ans=0.125 2023-09-28 20:25:37,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:25:37,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:25:37,802 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:25:39,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:25:39,574 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=132373.33333333334, ans=0.0 2023-09-28 20:25:42,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:25:44,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:25:46,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:25:46,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:25:48,368 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:25:54,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:25:54,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:26:02,234 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.26 vs. limit=15.0 2023-09-28 20:26:02,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 20:26:05,272 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:26:07,533 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.53 vs. limit=12.0 2023-09-28 20:26:15,017 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:26:18,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:26:18,188 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-28 20:26:18,239 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-28 20:26:18,260 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:26:19,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-28 20:26:21,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:26:21,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-28 20:26:24,454 INFO [train.py:1039] (2/4) Epoch 4, batch 3950, loss[loss=0.2879, simple_loss=0.3271, pruned_loss=0.1244, over 23877.00 frames. ], tot_loss[loss=0.2601, simple_loss=0.3155, pruned_loss=0.1024, over 4706794.45 frames. ], batch size: 195, lr: 2.33e-02, grad_scale: 32.0 2023-09-28 20:26:29,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:26:30,615 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-28 20:26:32,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:26:34,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:26:37,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:26:40,920 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.94 vs. limit=22.5 2023-09-28 20:26:42,382 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-28 20:26:43,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:26:43,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-28 20:26:44,017 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-28 20:26:45,358 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:26:47,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:26:48,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-28 20:26:48,503 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:26:50,738 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys.whitening_limit, batch_count=132640.0, ans=6.0 2023-09-28 20:26:51,601 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-28 20:26:54,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:26:56,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:26:56,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:26:56,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:26:56,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:27:09,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:27:10,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:27:10,994 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=132706.66666666666, ans=0.2 2023-09-28 20:27:10,995 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=132706.66666666666, ans=0.0 2023-09-28 20:27:15,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-28 20:27:20,645 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=132773.33333333334, ans=0.0 2023-09-28 20:27:21,948 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-28 20:27:21,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-28 20:27:22,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:27:24,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:27:26,710 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=132773.33333333334, ans=0.125 2023-09-28 20:27:31,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:27:31,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:27:32,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:27:32,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:27:34,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-28 20:27:37,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:27:39,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:27:42,848 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.941e+02 2.456e+02 2.836e+02 3.414e+02 5.372e+02, threshold=5.673e+02, percent-clipped=0.0 2023-09-28 20:27:43,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-28 20:27:48,133 INFO [train.py:1039] (2/4) Epoch 4, batch 4000, loss[loss=0.3591, simple_loss=0.3759, pruned_loss=0.1712, over 19416.00 frames. ], tot_loss[loss=0.2613, simple_loss=0.3164, pruned_loss=0.1031, over 4714421.86 frames. ], batch size: 388, lr: 2.33e-02, grad_scale: 32.0 2023-09-28 20:27:48,625 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=132906.66666666666, ans=0.125 2023-09-28 20:27:49,997 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=132906.66666666666, ans=0.0 2023-09-28 20:27:51,515 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=132906.66666666666, ans=0.07 2023-09-28 20:27:56,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:28:04,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:28:07,452 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=132973.33333333334, ans=0.2 2023-09-28 20:28:08,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:28:10,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:28:10,379 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:28:10,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-28 20:28:11,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-28 20:28:11,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-28 20:28:13,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:28:13,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-28 20:28:14,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:28:18,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:28:18,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:28:18,727 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:28:18,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:28:18,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-28 20:28:21,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:28:21,662 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.17 vs. limit=15.0 2023-09-28 20:28:24,072 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-28 20:28:24,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:28:24,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:28:28,902 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-28 20:28:29,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 20:28:29,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:28:32,810 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=133040.0, ans=0.1 2023-09-28 20:28:37,116 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-28 20:28:38,522 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:28:40,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:28:41,562 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-28 20:28:43,141 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:28:43,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-28 20:28:43,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:28:44,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:28:44,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:28:46,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:28:46,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-28 20:28:47,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:28:50,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-28 20:28:51,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:28:51,194 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=133173.33333333334, ans=0.125 2023-09-28 20:28:53,092 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-28 20:28:56,739 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.73 vs. limit=22.5 2023-09-28 20:28:57,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:29:02,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 20:29:03,808 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=133173.33333333334, ans=0.0 2023-09-28 20:29:05,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:29:05,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:29:05,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:29:07,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:29:07,943 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.20 vs. limit=15.0 2023-09-28 20:29:10,079 INFO [train.py:1039] (2/4) Epoch 4, batch 4050, loss[loss=0.2882, simple_loss=0.3287, pruned_loss=0.1238, over 23798.00 frames. ], tot_loss[loss=0.2596, simple_loss=0.3155, pruned_loss=0.1019, over 4736520.87 frames. ], batch size: 212, lr: 2.32e-02, grad_scale: 32.0 2023-09-28 20:29:13,315 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:29:14,297 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.40 vs. limit=15.0 2023-09-28 20:29:16,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-28 20:29:16,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-28 20:29:17,961 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 20:29:19,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:29:19,504 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:29:21,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-28 20:29:21,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:29:22,982 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=133240.0, ans=0.1 2023-09-28 20:29:25,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:29:26,264 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=133306.66666666666, ans=0.125 2023-09-28 20:29:30,212 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:29:30,294 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 20:29:31,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:29:31,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:29:32,319 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=133306.66666666666, ans=0.1 2023-09-28 20:29:38,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:29:40,669 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=133306.66666666666, ans=0.09899494936611666 2023-09-28 20:29:42,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-28 20:29:44,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 20:29:47,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-28 20:29:47,141 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-28 20:29:48,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:29:54,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-28 20:29:55,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:29:59,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:30:01,756 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=133440.0, ans=0.1 2023-09-28 20:30:04,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:30:04,636 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:30:05,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:30:08,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:30:13,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-28 20:30:13,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 20:30:14,913 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:30:16,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-28 20:30:21,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:30:28,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-28 20:30:29,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:30:29,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 20:30:30,988 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.888e+02 2.307e+02 2.673e+02 3.242e+02 5.499e+02, threshold=5.347e+02, percent-clipped=0.0 2023-09-28 20:30:31,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-28 20:30:32,595 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-28 20:30:32,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:30:35,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:30:36,460 INFO [train.py:1039] (2/4) Epoch 4, batch 4100, loss[loss=0.3057, simple_loss=0.3493, pruned_loss=0.131, over 23534.00 frames. ], tot_loss[loss=0.2611, simple_loss=0.3169, pruned_loss=0.1027, over 4719276.05 frames. ], batch size: 256, lr: 2.32e-02, grad_scale: 32.0 2023-09-28 20:30:36,583 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:30:36,617 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:30:44,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-28 20:30:46,361 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-28 20:30:47,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-28 20:30:48,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-28 20:30:49,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:30:49,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:30:51,638 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:30:51,660 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:30:51,778 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-28 20:30:51,943 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=133640.0, ans=0.125 2023-09-28 20:30:54,932 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:30:55,068 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=133640.0, ans=0.1 2023-09-28 20:30:56,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:30:56,455 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:30:56,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:31:02,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 20:31:03,798 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:31:03,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:31:05,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-28 20:31:05,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:31:05,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:31:05,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:31:05,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:31:06,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-28 20:31:07,165 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=133706.66666666666, ans=0.125 2023-09-28 20:31:11,950 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:31:13,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-28 20:31:14,945 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:31:15,847 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=133706.66666666666, ans=0.0 2023-09-28 20:31:17,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:31:17,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-28 20:31:18,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:31:18,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:31:20,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:31:21,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-28 20:31:23,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-28 20:31:23,591 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=133773.33333333334, ans=0.1 2023-09-28 20:31:24,773 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:31:25,033 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-28 20:31:27,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:31:27,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:31:30,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:31:34,829 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:31:39,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:31:39,255 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:31:50,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:31:50,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:31:50,663 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=133840.0, ans=0.0 2023-09-28 20:31:53,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:31:55,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:31:58,072 INFO [train.py:1039] (2/4) Epoch 4, batch 4150, loss[loss=0.2617, simple_loss=0.3273, pruned_loss=0.0981, over 23789.00 frames. ], tot_loss[loss=0.2609, simple_loss=0.3166, pruned_loss=0.1026, over 4724660.43 frames. ], batch size: 85, lr: 2.32e-02, grad_scale: 32.0 2023-09-28 20:31:59,739 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:31:59,892 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 20:32:01,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:32:01,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:32:06,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-28 20:32:07,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:32:07,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-28 20:32:09,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-28 20:32:09,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-28 20:32:11,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:32:15,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:32:15,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:32:17,681 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=133973.33333333334, ans=0.5 2023-09-28 20:32:21,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:32:22,927 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:32:23,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-28 20:32:26,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 20:32:26,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:32:28,029 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-28 20:32:31,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:32:32,971 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=134040.0, ans=0.125 2023-09-28 20:32:34,325 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:32:35,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-28 20:32:39,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-28 20:32:39,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:32:39,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-28 20:32:39,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:32:39,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:32:42,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:32:43,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:32:45,760 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=134106.66666666666, ans=0.0 2023-09-28 20:32:48,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-28 20:32:51,761 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-28 20:32:54,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:32:56,114 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-28 20:32:56,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:32:58,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-28 20:32:59,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:33:01,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:33:02,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:33:02,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-28 20:33:02,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:33:02,835 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-28 20:33:04,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 20:33:06,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-28 20:33:06,134 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:33:06,140 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:33:06,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 20:33:07,755 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-28 20:33:09,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:33:09,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 20:33:10,432 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:33:12,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:33:13,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-28 20:33:13,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-28 20:33:16,747 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.811e+02 2.342e+02 2.661e+02 3.100e+02 4.687e+02, threshold=5.322e+02, percent-clipped=0.0 2023-09-28 20:33:18,843 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=134240.0, ans=0.2 2023-09-28 20:33:19,930 INFO [train.py:1039] (2/4) Epoch 4, batch 4200, loss[loss=0.2841, simple_loss=0.318, pruned_loss=0.1251, over 23799.00 frames. ], tot_loss[loss=0.2607, simple_loss=0.3156, pruned_loss=0.1029, over 4719908.05 frames. ], batch size: 212, lr: 2.32e-02, grad_scale: 16.0 2023-09-28 20:33:19,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:33:21,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-28 20:33:23,186 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 20:33:24,714 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:33:26,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:33:27,055 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:33:27,057 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:33:30,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-28 20:33:35,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-28 20:33:35,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:33:38,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:33:40,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:33:43,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-28 20:33:45,158 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:33:45,201 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:33:47,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-28 20:33:47,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:33:48,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:33:48,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:33:48,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 20:33:50,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 20:33:53,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-28 20:33:53,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:33:57,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-28 20:33:59,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:34:03,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:34:03,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:34:05,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:34:05,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-28 20:34:05,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:34:07,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:34:13,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-28 20:34:14,911 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:34:18,893 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.15 vs. limit=6.0 2023-09-28 20:34:21,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:34:24,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-28 20:34:24,764 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=134506.66666666666, ans=0.0 2023-09-28 20:34:27,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:34:31,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 20:34:32,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:34:35,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-28 20:34:42,002 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-28 20:34:43,567 INFO [train.py:1039] (2/4) Epoch 4, batch 4250, loss[loss=0.2183, simple_loss=0.2787, pruned_loss=0.07894, over 24507.00 frames. ], tot_loss[loss=0.2594, simple_loss=0.3143, pruned_loss=0.1022, over 4715137.30 frames. ], batch size: 58, lr: 2.31e-02, grad_scale: 16.0 2023-09-28 20:34:45,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:34:45,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-28 20:34:48,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:34:51,650 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=134573.33333333334, ans=0.125 2023-09-28 20:34:53,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-28 20:34:53,418 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-28 20:34:53,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:34:56,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:34:59,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:35:04,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:35:06,229 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:35:08,459 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:35:08,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:35:10,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:35:11,980 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:35:13,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:35:15,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:35:15,315 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=134706.66666666666, ans=0.0 2023-09-28 20:35:15,453 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=134706.66666666666, ans=0.1 2023-09-28 20:35:18,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:35:19,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-28 20:35:22,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-28 20:35:22,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:35:24,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:35:24,336 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:35:26,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:35:26,025 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:35:26,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:35:26,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=134706.66666666666, ans=0.125 2023-09-28 20:35:31,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-28 20:35:32,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:35:33,000 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=134773.33333333334, ans=0.1 2023-09-28 20:35:35,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:35:37,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:35:38,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-28 20:35:38,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:35:40,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-28 20:35:42,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:35:44,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-28 20:35:46,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:35:46,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:35:48,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-28 20:35:49,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 20:35:51,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-28 20:35:54,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:35:55,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:35:58,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:36:00,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:36:02,333 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.761e+02 2.342e+02 2.586e+02 3.220e+02 5.035e+02, threshold=5.173e+02, percent-clipped=0.0 2023-09-28 20:36:02,469 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:36:02,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:36:04,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:36:04,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-28 20:36:05,379 INFO [train.py:1039] (2/4) Epoch 4, batch 4300, loss[loss=0.2339, simple_loss=0.294, pruned_loss=0.0869, over 24412.00 frames. ], tot_loss[loss=0.259, simple_loss=0.3143, pruned_loss=0.1018, over 4709032.18 frames. ], batch size: 58, lr: 2.31e-02, grad_scale: 16.0 2023-09-28 20:36:05,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:36:07,284 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=134906.66666666666, ans=0.125 2023-09-28 20:36:11,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:36:11,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:36:13,519 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=134906.66666666666, ans=0.125 2023-09-28 20:36:15,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:36:20,219 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=134973.33333333334, ans=0.125 2023-09-28 20:36:23,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:36:23,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-28 20:36:26,209 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:36:27,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-28 20:36:27,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:36:27,887 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-28 20:36:28,661 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=25.93 vs. limit=22.5 2023-09-28 20:36:31,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 20:36:32,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 20:36:37,471 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-28 20:36:37,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:36:39,566 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-28 20:36:41,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 20:36:42,691 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:36:43,079 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=135040.0, ans=0.0 2023-09-28 20:36:44,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:36:44,441 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:36:45,297 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.71 vs. limit=15.0 2023-09-28 20:36:45,969 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:36:47,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:36:49,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:36:49,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-28 20:36:49,717 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-28 20:36:53,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:36:56,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:36:56,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 20:36:56,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:36:57,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:36:57,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-28 20:36:57,942 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-28 20:36:58,036 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-28 20:36:59,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:36:59,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-28 20:36:59,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-28 20:37:05,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:37:07,174 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-28 20:37:07,270 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:37:08,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:37:08,892 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:37:09,302 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=135173.33333333334, ans=0.1 2023-09-28 20:37:10,621 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-28 20:37:10,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 20:37:10,745 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:37:12,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:37:12,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:37:14,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:37:16,322 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.94 vs. limit=15.0 2023-09-28 20:37:16,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:37:18,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:37:20,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:37:20,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:37:28,140 INFO [train.py:1039] (2/4) Epoch 4, batch 4350, loss[loss=0.2305, simple_loss=0.2914, pruned_loss=0.08477, over 24302.00 frames. ], tot_loss[loss=0.2595, simple_loss=0.3149, pruned_loss=0.102, over 4712805.69 frames. ], batch size: 56, lr: 2.31e-02, grad_scale: 16.0 2023-09-28 20:37:28,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-28 20:37:28,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-28 20:37:30,069 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=135240.0, ans=0.0 2023-09-28 20:37:34,502 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:37:37,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:37:39,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:37:39,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:37:45,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:37:47,410 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:37:50,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:37:50,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:37:53,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-28 20:37:55,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:37:58,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-28 20:38:04,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-28 20:38:06,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:38:06,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:38:12,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:38:13,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-28 20:38:16,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:38:18,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 20:38:20,778 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=135440.0, ans=0.1 2023-09-28 20:38:23,559 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=135440.0, ans=0.0 2023-09-28 20:38:24,869 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-28 20:38:26,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:38:26,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:38:27,933 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-28 20:38:29,415 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-28 20:38:29,423 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:38:29,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:38:29,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:38:29,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:38:31,746 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:38:31,802 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:38:35,501 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-28 20:38:35,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:38:35,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:38:35,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:38:37,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-28 20:38:39,150 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-28 20:38:39,168 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-28 20:38:39,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-28 20:38:42,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:38:42,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:38:43,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:38:43,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:38:46,861 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.833e+02 2.300e+02 2.585e+02 3.110e+02 4.848e+02, threshold=5.170e+02, percent-clipped=0.0 2023-09-28 20:38:47,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-28 20:38:48,668 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-28 20:38:48,679 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:38:50,005 INFO [train.py:1039] (2/4) Epoch 4, batch 4400, loss[loss=0.2815, simple_loss=0.3253, pruned_loss=0.1189, over 23790.00 frames. ], tot_loss[loss=0.2611, simple_loss=0.3165, pruned_loss=0.1029, over 4720005.63 frames. ], batch size: 212, lr: 2.31e-02, grad_scale: 32.0 2023-09-28 20:38:53,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:38:53,221 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:38:56,804 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:38:59,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-28 20:38:59,808 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-28 20:38:59,871 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-28 20:38:59,902 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-28 20:39:01,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 20:39:01,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:39:03,186 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-28 20:39:04,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:39:06,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:39:06,853 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-28 20:39:11,958 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:39:11,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-28 20:39:12,028 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-28 20:39:12,607 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.15 vs. limit=12.0 2023-09-28 20:39:15,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-28 20:39:15,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-28 20:39:15,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-28 20:39:15,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:39:17,506 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:39:17,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:39:19,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:39:20,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-28 20:39:20,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-28 20:39:22,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:39:25,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:39:25,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:39:26,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:39:28,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:39:28,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-28 20:39:29,625 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-28 20:39:30,121 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=135706.66666666666, ans=0.125 2023-09-28 20:39:33,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:39:35,245 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=135706.66666666666, ans=0.125 2023-09-28 20:39:39,658 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:39:41,944 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-28 20:39:42,170 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=135773.33333333334, ans=0.125 2023-09-28 20:39:45,130 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:39:45,393 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=135773.33333333334, ans=0.0 2023-09-28 20:39:50,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:39:51,785 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:39:51,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-28 20:39:51,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:39:51,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-28 20:39:51,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:39:53,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-28 20:39:58,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-28 20:40:01,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-28 20:40:02,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-28 20:40:02,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:40:02,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-28 20:40:04,021 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:40:05,829 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:40:08,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-28 20:40:11,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:40:13,239 INFO [train.py:1039] (2/4) Epoch 4, batch 4450, loss[loss=0.3713, simple_loss=0.3827, pruned_loss=0.1799, over 19697.00 frames. ], tot_loss[loss=0.2632, simple_loss=0.318, pruned_loss=0.1042, over 4701072.56 frames. ], batch size: 388, lr: 2.30e-02, grad_scale: 32.0 2023-09-28 20:40:16,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:40:16,277 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:40:16,539 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=135906.66666666666, ans=0.1 2023-09-28 20:40:26,544 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:40:26,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:40:31,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:40:31,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:40:33,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:40:34,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:40:36,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-28 20:40:36,515 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:40:38,008 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:40:38,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:40:38,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-28 20:40:39,618 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 20:40:44,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:40:46,081 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:40:46,255 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:40:47,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:40:49,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:40:55,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 20:40:57,291 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-28 20:40:57,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-28 20:40:57,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:41:01,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:41:03,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-28 20:41:03,543 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=136106.66666666666, ans=0.125 2023-09-28 20:41:06,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-28 20:41:09,476 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:41:09,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-28 20:41:09,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:41:09,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:41:09,628 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:41:09,639 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:41:12,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:41:15,668 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-28 20:41:17,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-28 20:41:19,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 20:41:20,846 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:41:22,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:41:24,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:41:25,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 20:41:26,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:41:30,493 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=136173.33333333334, ans=0.0 2023-09-28 20:41:31,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-28 20:41:32,526 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.28 vs. limit=10.0 2023-09-28 20:41:32,908 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.860e+02 2.347e+02 2.673e+02 3.318e+02 4.703e+02, threshold=5.347e+02, percent-clipped=0.0 2023-09-28 20:41:33,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:41:33,337 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=136173.33333333334, ans=0.125 2023-09-28 20:41:35,963 INFO [train.py:1039] (2/4) Epoch 4, batch 4500, loss[loss=0.3059, simple_loss=0.3345, pruned_loss=0.1386, over 20071.00 frames. ], tot_loss[loss=0.2642, simple_loss=0.3189, pruned_loss=0.1047, over 4703635.18 frames. ], batch size: 388, lr: 2.30e-02, grad_scale: 32.0 2023-09-28 20:41:36,491 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=136240.0, ans=0.125 2023-09-28 20:41:39,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:41:40,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-28 20:41:40,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-28 20:41:42,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:41:45,854 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:41:47,232 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:41:47,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 20:41:48,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:41:48,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:41:48,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:41:52,303 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=136306.66666666666, ans=0.0 2023-09-28 20:42:02,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:42:04,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:42:07,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:42:08,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:42:08,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:42:15,037 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 20:42:20,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:42:24,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 20:42:27,427 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:42:27,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-28 20:42:29,574 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:42:29,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:42:31,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:42:31,295 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:42:34,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:42:34,853 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-28 20:42:34,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 20:42:34,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:42:39,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:42:39,993 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:42:43,065 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:42:44,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-28 20:42:46,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:42:47,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-28 20:42:50,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-28 20:42:50,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-28 20:42:55,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-28 20:42:55,747 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=136573.33333333334, ans=0.0 2023-09-28 20:42:56,790 INFO [train.py:1039] (2/4) Epoch 4, batch 4550, loss[loss=0.2851, simple_loss=0.3223, pruned_loss=0.1239, over 23816.00 frames. ], tot_loss[loss=0.2625, simple_loss=0.3176, pruned_loss=0.1037, over 4688083.33 frames. ], batch size: 179, lr: 2.30e-02, grad_scale: 32.0 2023-09-28 20:42:56,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-28 20:42:59,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:43:03,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:43:04,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:43:09,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:43:12,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:43:14,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:43:16,230 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:43:16,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:43:16,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:43:20,380 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:43:20,447 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:43:23,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:43:26,812 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-28 20:43:28,232 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-28 20:43:29,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:43:29,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-28 20:43:32,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-28 20:43:34,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:43:37,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-28 20:43:37,968 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=136706.66666666666, ans=0.125 2023-09-28 20:43:39,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 20:43:44,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:43:44,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:43:44,874 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:43:46,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-28 20:43:49,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:43:49,944 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=136773.33333333334, ans=0.125 2023-09-28 20:43:51,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:43:51,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:43:52,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:43:54,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-28 20:43:55,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-28 20:43:56,005 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:43:56,750 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.26 vs. limit=12.0 2023-09-28 20:43:57,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-28 20:44:00,454 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-28 20:44:00,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:44:00,677 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:44:02,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:44:02,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:44:02,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:44:05,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 20:44:05,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-28 20:44:06,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:44:06,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 20:44:08,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-28 20:44:08,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:44:08,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-28 20:44:11,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:44:11,935 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:44:15,289 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.851e+02 2.285e+02 2.509e+02 2.934e+02 4.311e+02, threshold=5.019e+02, percent-clipped=0.0 2023-09-28 20:44:15,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:44:15,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:44:17,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-28 20:44:19,149 INFO [train.py:1039] (2/4) Epoch 4, batch 4600, loss[loss=0.2499, simple_loss=0.3045, pruned_loss=0.09765, over 23483.00 frames. ], tot_loss[loss=0.2604, simple_loss=0.3154, pruned_loss=0.1027, over 4676768.81 frames. ], batch size: 106, lr: 2.30e-02, grad_scale: 32.0 2023-09-28 20:44:19,243 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:44:20,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-28 20:44:22,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:44:23,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:44:26,943 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:44:26,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 20:44:27,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:44:28,541 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-28 20:44:30,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:44:34,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:44:36,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:44:37,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:44:37,936 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=136973.33333333334, ans=0.1 2023-09-28 20:44:37,997 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=136973.33333333334, ans=0.125 2023-09-28 20:44:38,548 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.32 vs. limit=15.0 2023-09-28 20:44:45,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-28 20:44:47,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:44:50,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:44:54,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:44:54,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:44:58,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-28 20:44:58,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 20:44:59,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:45:03,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:45:03,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-28 20:45:05,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:45:09,894 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-28 20:45:11,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-28 20:45:14,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:45:16,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:45:21,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:45:21,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 20:45:21,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:45:21,791 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1.whitening_limit, batch_count=137106.66666666666, ans=10.0 2023-09-28 20:45:22,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-28 20:45:22,520 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:45:23,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:45:25,469 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:45:25,605 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:45:25,926 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=137173.33333333334, ans=0.125 2023-09-28 20:45:27,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:45:28,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-28 20:45:28,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-28 20:45:28,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-28 20:45:28,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:45:28,873 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=137173.33333333334, ans=0.2 2023-09-28 20:45:31,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:45:31,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:45:33,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:45:39,405 INFO [train.py:1039] (2/4) Epoch 4, batch 4650, loss[loss=0.2696, simple_loss=0.3195, pruned_loss=0.1099, over 23822.00 frames. ], tot_loss[loss=0.2591, simple_loss=0.3141, pruned_loss=0.1021, over 4689060.02 frames. ], batch size: 195, lr: 2.29e-02, grad_scale: 32.0 2023-09-28 20:45:42,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:45:45,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:45:45,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:45:45,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:45:45,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:45:47,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:45:49,434 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:45:53,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-28 20:45:57,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:45:59,202 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-28 20:45:59,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:45:59,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-28 20:46:00,805 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:46:02,136 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-28 20:46:02,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-28 20:46:02,184 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:46:02,289 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:46:08,338 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 20:46:08,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:46:09,917 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-28 20:46:13,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:46:13,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-28 20:46:16,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:46:16,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:46:17,779 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-28 20:46:19,475 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=137373.33333333334, ans=0.1 2023-09-28 20:46:20,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:46:23,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:46:25,066 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=137373.33333333334, ans=10.0 2023-09-28 20:46:28,716 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:46:32,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:46:35,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:46:35,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:46:37,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:46:38,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-28 20:46:40,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-28 20:46:40,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 20:46:40,529 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-28 20:46:42,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:46:45,502 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=137506.66666666666, ans=0.2 2023-09-28 20:46:49,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:46:49,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:46:49,978 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-28 20:46:50,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:46:50,211 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=137506.66666666666, ans=0.125 2023-09-28 20:46:51,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:46:51,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:46:53,147 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:46:56,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:46:56,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:46:57,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:46:58,916 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.838e+02 2.187e+02 2.531e+02 2.951e+02 4.992e+02, threshold=5.061e+02, percent-clipped=0.0 2023-09-28 20:47:00,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:47:02,483 INFO [train.py:1039] (2/4) Epoch 4, batch 4700, loss[loss=0.3042, simple_loss=0.3498, pruned_loss=0.1293, over 23835.00 frames. ], tot_loss[loss=0.2609, simple_loss=0.3154, pruned_loss=0.1032, over 4685170.21 frames. ], batch size: 195, lr: 2.29e-02, grad_scale: 32.0 2023-09-28 20:47:02,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:47:02,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 20:47:02,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-28 20:47:04,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:47:06,812 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-28 20:47:07,432 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=137573.33333333334, ans=0.125 2023-09-28 20:47:14,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:47:14,865 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:47:16,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:47:17,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:47:19,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 20:47:25,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-28 20:47:25,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-28 20:47:27,222 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:47:28,772 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:47:28,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:47:32,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:47:36,353 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=137706.66666666666, ans=0.125 2023-09-28 20:47:39,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 20:47:42,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 20:47:45,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:47:51,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-28 20:47:52,772 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:47:54,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:47:57,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-28 20:47:58,855 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:48:03,537 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:48:05,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-28 20:48:07,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:48:07,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:48:10,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:48:12,088 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:48:12,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-28 20:48:12,255 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-28 20:48:15,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:48:17,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:48:17,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:48:17,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-28 20:48:19,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:48:22,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-28 20:48:22,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=137840.0, ans=0.125 2023-09-28 20:48:25,434 INFO [train.py:1039] (2/4) Epoch 4, batch 4750, loss[loss=0.2753, simple_loss=0.3248, pruned_loss=0.1129, over 23777.00 frames. ], tot_loss[loss=0.2612, simple_loss=0.3162, pruned_loss=0.1031, over 4701028.93 frames. ], batch size: 179, lr: 2.29e-02, grad_scale: 32.0 2023-09-28 20:48:25,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:48:27,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:48:31,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:48:31,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:48:33,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-28 20:48:33,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:48:36,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-28 20:48:38,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:48:39,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:48:39,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:48:46,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-28 20:48:51,934 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:48:54,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-28 20:48:54,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:48:58,857 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.44 vs. limit=10.0 2023-09-28 20:48:59,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:48:59,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:48:59,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:48:59,425 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-28 20:48:59,430 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-28 20:49:05,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-28 20:49:08,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:49:10,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:49:11,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:49:11,799 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-28 20:49:11,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:49:15,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:49:18,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:49:18,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-28 20:49:20,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-28 20:49:20,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:49:21,930 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:49:21,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:49:22,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 20:49:24,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-28 20:49:27,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-28 20:49:29,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:49:30,240 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.95 vs. limit=22.5 2023-09-28 20:49:32,470 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:49:32,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-28 20:49:33,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:49:34,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:49:36,959 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-28 20:49:37,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:49:38,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 20:49:41,737 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=138173.33333333334, ans=0.125 2023-09-28 20:49:43,033 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:49:43,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-28 20:49:44,415 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.660e+02 2.351e+02 2.944e+02 3.482e+02 5.215e+02, threshold=5.888e+02, percent-clipped=1.0 2023-09-28 20:49:44,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-28 20:49:46,109 INFO [train.py:1039] (2/4) Epoch 4, batch 4800, loss[loss=0.284, simple_loss=0.3235, pruned_loss=0.1222, over 23799.00 frames. ], tot_loss[loss=0.2608, simple_loss=0.316, pruned_loss=0.1028, over 4707874.52 frames. ], batch size: 179, lr: 2.29e-02, grad_scale: 32.0 2023-09-28 20:49:46,212 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-28 20:49:47,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:49:49,273 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:49:51,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-28 20:49:56,459 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:49:58,400 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:50:01,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=138306.66666666666, ans=0.125 2023-09-28 20:50:05,082 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:50:05,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:50:05,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:50:06,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-28 20:50:08,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:50:08,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:50:09,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:50:10,648 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.60 vs. limit=12.0 2023-09-28 20:50:13,052 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:50:14,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:50:16,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:50:17,990 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=138373.33333333334, ans=0.1 2023-09-28 20:50:19,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:50:19,069 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 20:50:19,093 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:50:19,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:50:22,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:50:22,462 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=138373.33333333334, ans=0.125 2023-09-28 20:50:25,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:50:25,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:50:25,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-28 20:50:27,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 20:50:29,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:50:29,852 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=138373.33333333334, ans=0.125 2023-09-28 20:50:30,086 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.80 vs. limit=10.0 2023-09-28 20:50:33,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-28 20:50:33,056 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-28 20:50:34,463 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:50:34,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:50:35,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:50:35,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:50:35,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:50:36,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:50:36,526 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=138440.0, ans=0.125 2023-09-28 20:50:37,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:50:41,513 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:50:44,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:50:46,277 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=138440.0, ans=0.09899494936611666 2023-09-28 20:50:47,518 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:50:49,748 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=138440.0, ans=0.125 2023-09-28 20:50:50,123 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.01 vs. limit=6.0 2023-09-28 20:50:50,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-28 20:50:51,011 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:50:52,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:50:52,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:50:53,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:50:58,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:50:58,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 20:50:58,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:50:58,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:51:00,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:51:00,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:51:04,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:51:04,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:51:04,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:51:06,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-28 20:51:08,265 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.64 vs. limit=15.0 2023-09-28 20:51:09,196 INFO [train.py:1039] (2/4) Epoch 4, batch 4850, loss[loss=0.2655, simple_loss=0.3075, pruned_loss=0.1118, over 23653.00 frames. ], tot_loss[loss=0.2627, simple_loss=0.3172, pruned_loss=0.1041, over 4700420.58 frames. ], batch size: 256, lr: 2.28e-02, grad_scale: 32.0 2023-09-28 20:51:09,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-28 20:51:09,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:51:09,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:51:10,700 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:51:10,702 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:51:11,093 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=138573.33333333334, ans=0.125 2023-09-28 20:51:11,542 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.47 vs. limit=22.5 2023-09-28 20:51:14,199 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:51:16,061 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=138573.33333333334, ans=0.125 2023-09-28 20:51:21,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-28 20:51:23,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:51:26,610 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:51:28,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 20:51:28,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:51:31,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:51:32,183 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=138640.0, ans=0.0 2023-09-28 20:51:34,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:51:36,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:51:36,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-28 20:51:41,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:51:42,313 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=138706.66666666666, ans=0.2 2023-09-28 20:51:43,559 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:51:43,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 20:51:44,998 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:51:45,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-28 20:51:48,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:51:48,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:51:52,193 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=138706.66666666666, ans=0.2 2023-09-28 20:51:52,493 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.57 vs. limit=15.0 2023-09-28 20:51:53,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:51:53,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-28 20:51:54,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-28 20:51:54,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 20:51:58,720 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.20 vs. limit=15.0 2023-09-28 20:52:02,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:52:02,872 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-28 20:52:02,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:52:03,143 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=138773.33333333334, ans=0.125 2023-09-28 20:52:04,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:52:06,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-28 20:52:08,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-28 20:52:08,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:52:09,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-28 20:52:09,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:52:09,853 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:52:11,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-28 20:52:14,123 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=138840.0, ans=0.0 2023-09-28 20:52:14,199 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=138840.0, ans=0.125 2023-09-28 20:52:20,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:52:27,128 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:52:27,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:52:29,910 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.911e+02 2.282e+02 2.559e+02 2.982e+02 4.179e+02, threshold=5.119e+02, percent-clipped=0.0 2023-09-28 20:52:31,428 INFO [train.py:1039] (2/4) Epoch 4, batch 4900, loss[loss=0.2584, simple_loss=0.2808, pruned_loss=0.118, over 19486.00 frames. ], tot_loss[loss=0.2612, simple_loss=0.3156, pruned_loss=0.1034, over 4686469.42 frames. ], batch size: 388, lr: 2.28e-02, grad_scale: 32.0 2023-09-28 20:52:33,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-28 20:52:33,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:52:38,562 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=138906.66666666666, ans=0.0 2023-09-28 20:52:41,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:52:42,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:52:42,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:52:45,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-28 20:52:49,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-28 20:52:50,120 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.min_positive, batch_count=138973.33333333334, ans=0.05 2023-09-28 20:52:50,167 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=138973.33333333334, ans=0.2 2023-09-28 20:52:50,181 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=138973.33333333334, ans=0.0 2023-09-28 20:52:54,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-28 20:52:56,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-28 20:52:56,478 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-28 20:52:56,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:52:56,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:52:58,665 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:52:58,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:52:58,797 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-28 20:53:00,557 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=138973.33333333334, ans=0.125 2023-09-28 20:53:02,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-28 20:53:02,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 20:53:04,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-28 20:53:06,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-28 20:53:07,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:53:08,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:53:09,659 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:53:09,684 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-28 20:53:11,288 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.43 vs. limit=15.0 2023-09-28 20:53:11,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:53:12,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:53:13,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-28 20:53:13,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-28 20:53:13,566 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=139040.0, ans=0.125 2023-09-28 20:53:17,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-28 20:53:19,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:53:23,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:53:23,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:53:23,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:53:23,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 20:53:25,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:53:25,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-28 20:53:28,251 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:53:29,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-28 20:53:31,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:53:35,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-28 20:53:35,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:53:35,255 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-28 20:53:36,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-28 20:53:43,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:53:44,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:53:45,660 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.20 vs. limit=15.0 2023-09-28 20:53:46,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-28 20:53:46,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 20:53:47,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:53:48,195 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=139173.33333333334, ans=0.1 2023-09-28 20:53:49,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:53:52,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:53:52,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:53:52,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:53:54,528 INFO [train.py:1039] (2/4) Epoch 4, batch 4950, loss[loss=0.2838, simple_loss=0.3361, pruned_loss=0.1157, over 23552.00 frames. ], tot_loss[loss=0.26, simple_loss=0.3141, pruned_loss=0.1029, over 4688567.06 frames. ], batch size: 94, lr: 2.28e-02, grad_scale: 32.0 2023-09-28 20:53:54,629 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-28 20:53:56,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 20:53:59,581 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:53:59,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 20:54:02,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-28 20:54:02,756 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-28 20:54:02,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:54:03,133 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=139240.0, ans=0.2 2023-09-28 20:54:06,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-28 20:54:06,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:54:06,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:54:07,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:54:07,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:54:09,586 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:54:11,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:54:12,565 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:54:14,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:54:14,484 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=139306.66666666666, ans=0.125 2023-09-28 20:54:14,537 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=139306.66666666666, ans=0.0 2023-09-28 20:54:15,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:54:15,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:54:19,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 20:54:25,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:54:26,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:54:28,478 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:54:28,558 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:54:28,779 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=139373.33333333334, ans=0.1 2023-09-28 20:54:32,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:54:33,787 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-28 20:54:33,925 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=139373.33333333334, ans=0.015 2023-09-28 20:54:35,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-28 20:54:35,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:54:39,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:54:39,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:54:42,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:54:42,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:54:42,867 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys.whitening_limit, batch_count=139440.0, ans=6.0 2023-09-28 20:54:43,605 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:54:45,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:54:46,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-28 20:54:48,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:54:48,770 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=139440.0, ans=0.0 2023-09-28 20:54:50,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:54:51,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:54:51,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-28 20:54:53,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 20:54:53,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 20:54:57,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:54:58,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:54:58,520 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:55:00,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:55:00,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:55:00,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:55:00,404 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 20:55:03,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:55:03,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:55:05,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:55:05,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-28 20:55:10,720 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:55:11,444 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=25.53 vs. limit=22.5 2023-09-28 20:55:15,178 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.874e+02 2.343e+02 2.664e+02 3.186e+02 5.232e+02, threshold=5.328e+02, percent-clipped=1.0 2023-09-28 20:55:16,767 INFO [train.py:1039] (2/4) Epoch 4, batch 5000, loss[loss=0.2741, simple_loss=0.3144, pruned_loss=0.117, over 23826.00 frames. ], tot_loss[loss=0.2577, simple_loss=0.3127, pruned_loss=0.1013, over 4704763.10 frames. ], batch size: 212, lr: 2.28e-02, grad_scale: 32.0 2023-09-28 20:55:16,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-28 20:55:16,857 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-28 20:55:19,025 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=139573.33333333334, ans=0.2 2023-09-28 20:55:21,004 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.78 vs. limit=15.0 2023-09-28 20:55:21,835 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:55:21,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-28 20:55:22,077 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=139573.33333333334, ans=0.0 2023-09-28 20:55:23,356 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-28 20:55:24,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-28 20:55:26,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:55:28,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-28 20:55:28,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:55:28,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 20:55:30,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-28 20:55:30,176 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:55:31,564 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:55:33,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-28 20:55:33,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:55:33,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:55:35,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-28 20:55:36,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-28 20:55:36,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:55:36,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-28 20:55:36,884 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 20:55:37,510 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.04 vs. limit=15.0 2023-09-28 20:55:38,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:55:38,444 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 20:55:38,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-28 20:55:38,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-28 20:55:38,790 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=139640.0, ans=0.125 2023-09-28 20:55:40,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-28 20:55:40,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:55:42,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:55:42,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-28 20:55:43,764 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-28 20:55:45,306 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:55:45,442 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:55:48,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-28 20:55:49,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-28 20:55:51,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:55:54,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:55:57,592 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-28 20:55:59,339 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:56:02,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:56:02,755 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:56:04,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-28 20:56:05,164 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.70 vs. limit=12.0 2023-09-28 20:56:05,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:56:06,043 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:56:06,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:56:07,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-28 20:56:09,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:56:13,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:56:13,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:56:20,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-28 20:56:24,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:56:26,979 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=7.05 vs. limit=12.0 2023-09-28 20:56:28,132 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 20:56:32,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:56:34,123 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:56:34,135 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 20:56:34,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:56:36,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 20:56:36,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-28 20:56:37,516 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:56:38,964 INFO [train.py:1039] (2/4) Epoch 4, batch 5050, loss[loss=0.2558, simple_loss=0.3235, pruned_loss=0.09404, over 24016.00 frames. ], tot_loss[loss=0.2577, simple_loss=0.313, pruned_loss=0.1012, over 4707970.04 frames. ], batch size: 80, lr: 2.27e-02, grad_scale: 32.0 2023-09-28 20:56:42,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:56:42,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-28 20:56:44,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:56:47,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:56:49,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:56:51,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-28 20:56:53,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:56:53,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:56:54,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 20:56:56,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:56:57,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-28 20:56:59,741 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=139973.33333333334, ans=0.125 2023-09-28 20:57:05,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-28 20:57:05,777 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-28 20:57:07,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:57:07,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-28 20:57:07,427 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:57:10,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:57:10,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:57:10,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:57:10,562 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-28 20:57:10,904 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=140040.0, ans=0.0 2023-09-28 20:57:12,080 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-28 20:57:14,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:57:19,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:57:19,645 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=140040.0, ans=0.0 2023-09-28 20:57:22,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:57:22,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-28 20:57:24,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:57:28,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-28 20:57:29,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:57:30,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:57:30,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:57:30,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:57:31,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:57:34,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:57:34,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:57:34,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:57:34,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:57:36,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-28 20:57:36,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:57:39,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:57:43,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:57:43,864 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-28 20:57:43,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-28 20:57:45,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:57:45,527 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:57:45,577 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-28 20:57:49,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:57:49,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-28 20:57:49,172 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:57:54,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:57:54,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:57:54,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-28 20:57:56,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-28 20:57:59,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:58:00,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:58:00,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:58:00,281 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=140173.33333333334, ans=0.0 2023-09-28 20:58:02,020 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.727e+02 2.333e+02 2.668e+02 3.236e+02 5.838e+02, threshold=5.336e+02, percent-clipped=1.0 2023-09-28 20:58:03,587 INFO [train.py:1039] (2/4) Epoch 4, batch 5100, loss[loss=0.2538, simple_loss=0.3042, pruned_loss=0.1017, over 23759.00 frames. ], tot_loss[loss=0.2587, simple_loss=0.314, pruned_loss=0.1017, over 4706310.15 frames. ], batch size: 149, lr: 2.27e-02, grad_scale: 32.0 2023-09-28 20:58:05,147 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-28 20:58:08,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:58:11,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-28 20:58:11,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-28 20:58:12,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:58:14,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:58:18,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:58:18,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-28 20:58:20,192 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-28 20:58:25,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:58:25,552 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:58:28,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:58:34,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-28 20:58:34,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:58:36,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:58:36,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-28 20:58:37,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:58:39,453 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:58:39,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-28 20:58:41,463 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-28 20:58:41,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:58:42,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-28 20:58:43,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-28 20:58:46,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:58:55,102 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:58:58,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-28 20:58:58,136 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-28 20:58:58,159 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-28 20:59:01,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-28 20:59:01,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:59:06,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-28 20:59:10,662 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-28 20:59:12,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 20:59:14,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:59:16,013 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-28 20:59:17,627 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-28 20:59:19,070 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-28 20:59:24,355 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.55 vs. limit=15.0 2023-09-28 20:59:24,967 INFO [train.py:1039] (2/4) Epoch 4, batch 5150, loss[loss=0.2922, simple_loss=0.3515, pruned_loss=0.1164, over 24379.00 frames. ], tot_loss[loss=0.2592, simple_loss=0.3145, pruned_loss=0.102, over 4714531.91 frames. ], batch size: 77, lr: 2.27e-02, grad_scale: 32.0 2023-09-28 20:59:25,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:59:25,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:59:25,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:59:26,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:59:26,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 20:59:28,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:59:28,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-28 20:59:28,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-28 20:59:29,744 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-28 20:59:29,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:59:29,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-28 20:59:31,324 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:59:32,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 20:59:35,062 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:59:36,441 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:59:41,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 20:59:41,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-28 20:59:43,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:59:44,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:59:44,280 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=140640.0, ans=0.125 2023-09-28 20:59:45,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:59:45,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:59:46,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:59:47,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:59:47,065 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:59:47,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-28 20:59:49,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:59:50,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:59:50,905 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=140640.0, ans=0.5 2023-09-28 20:59:52,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 20:59:53,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-28 20:59:55,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:59:57,528 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=140706.66666666666, ans=0.0 2023-09-28 21:00:00,552 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=140706.66666666666, ans=0.125 2023-09-28 21:00:01,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:00:04,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-28 21:00:07,822 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:00:13,387 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=140773.33333333334, ans=0.0 2023-09-28 21:00:14,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:00:18,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:00:23,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:00:23,570 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:00:27,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-28 21:00:31,576 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:00:31,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:00:33,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 21:00:36,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:00:36,449 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=140840.0, ans=0.2 2023-09-28 21:00:37,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:00:37,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-28 21:00:42,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:00:42,562 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 21:00:45,323 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.700e+02 2.271e+02 2.546e+02 2.924e+02 4.595e+02, threshold=5.092e+02, percent-clipped=0.0 2023-09-28 21:00:45,506 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:00:45,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:00:47,426 INFO [train.py:1039] (2/4) Epoch 4, batch 5200, loss[loss=0.2352, simple_loss=0.2982, pruned_loss=0.08613, over 24475.00 frames. ], tot_loss[loss=0.2599, simple_loss=0.3155, pruned_loss=0.1021, over 4704169.27 frames. ], batch size: 58, lr: 2.27e-02, grad_scale: 32.0 2023-09-28 21:00:47,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-28 21:00:47,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-28 21:00:47,847 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=140906.66666666666, ans=0.1 2023-09-28 21:00:49,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:00:49,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:00:50,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:00:54,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:00:57,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:01:02,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-28 21:01:02,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:01:02,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:01:05,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:01:07,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:01:07,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:01:08,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-28 21:01:11,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 21:01:12,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:01:15,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-28 21:01:16,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-28 21:01:18,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-28 21:01:20,463 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-28 21:01:21,799 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-28 21:01:23,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-28 21:01:23,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:01:23,484 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-28 21:01:24,906 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:01:25,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:01:25,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:01:27,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-28 21:01:28,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:01:30,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:01:33,716 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-28 21:01:33,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-28 21:01:35,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-28 21:01:38,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-28 21:01:38,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 21:01:43,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:01:43,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:01:45,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-28 21:01:45,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:01:46,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-28 21:01:46,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:01:46,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 21:01:50,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:01:53,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:01:55,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:01:56,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:01:56,563 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:02:03,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:02:04,037 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-28 21:02:05,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:02:05,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:02:08,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:02:09,653 INFO [train.py:1039] (2/4) Epoch 4, batch 5250, loss[loss=0.2647, simple_loss=0.3011, pruned_loss=0.1141, over 23745.00 frames. ], tot_loss[loss=0.2593, simple_loss=0.3148, pruned_loss=0.1019, over 4711919.70 frames. ], batch size: 212, lr: 2.27e-02, grad_scale: 32.0 2023-09-28 21:02:09,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-28 21:02:09,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:02:11,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:02:16,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:02:17,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:02:18,933 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:02:23,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:02:25,119 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.99 vs. limit=15.0 2023-09-28 21:02:25,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 21:02:28,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:02:30,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 21:02:32,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-28 21:02:32,373 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:02:32,654 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=141306.66666666666, ans=0.125 2023-09-28 21:02:34,520 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:02:35,165 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.48 vs. limit=6.0 2023-09-28 21:02:41,808 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=141373.33333333334, ans=0.1 2023-09-28 21:03:19,208 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=141506.66666666666, ans=0.04949747468305833 2023-09-28 21:03:21,503 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.854e+02 2.354e+02 2.746e+02 3.335e+02 6.410e+02, threshold=5.493e+02, percent-clipped=2.0 2023-09-28 21:03:22,902 INFO [train.py:1039] (2/4) Epoch 4, batch 5300, loss[loss=0.2333, simple_loss=0.2891, pruned_loss=0.08878, over 24437.00 frames. ], tot_loss[loss=0.2581, simple_loss=0.3138, pruned_loss=0.1012, over 4715642.66 frames. ], batch size: 58, lr: 2.26e-02, grad_scale: 32.0 2023-09-28 21:03:36,019 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=141640.0, ans=0.125 2023-09-28 21:03:38,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:03:38,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-28 21:03:38,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 21:03:38,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:03:38,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:03:38,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:03:39,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:03:39,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:03:39,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:03:39,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:03:39,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 21:03:39,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:03:39,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 21:03:40,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-28 21:03:40,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 21:03:40,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-28 21:03:40,268 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 21:03:40,395 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 21:03:40,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:03:41,412 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:03:41,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:03:41,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:03:41,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:03:42,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:03:42,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:03:42,335 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:03:42,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:03:42,511 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:03:42,518 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:03:42,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:03:42,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:03:43,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 21:03:43,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:03:44,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:03:44,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 21:03:44,065 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 21:03:44,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:03:44,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:03:44,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 21:03:44,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-28 21:03:44,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 21:03:45,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 21:03:45,901 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:03:46,049 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 21:03:46,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 21:03:46,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 21:03:46,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:03:46,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 21:03:46,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 21:03:46,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-28 21:03:46,900 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-28 21:03:55,445 INFO [train.py:1039] (2/4) Epoch 5, batch 0, loss[loss=0.2674, simple_loss=0.3196, pruned_loss=0.1076, over 23556.00 frames. ], tot_loss[loss=0.2674, simple_loss=0.3196, pruned_loss=0.1076, over 23556.00 frames. ], batch size: 256, lr: 2.11e-02, grad_scale: 32.0 2023-09-28 21:03:55,446 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-28 21:04:10,256 INFO [train.py:1071] (2/4) Epoch 5, validation: loss=0.3547, simple_loss=0.3281, pruned_loss=0.1907, over 1125622.00 frames. 2023-09-28 21:04:10,258 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21046MB 2023-09-28 21:04:10,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-28 21:04:12,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:04:14,202 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:04:19,057 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:04:19,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:04:20,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:04:21,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-28 21:04:23,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-28 21:04:23,763 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=141653.33333333334, ans=0.0 2023-09-28 21:04:25,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:04:26,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:04:31,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:04:31,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:04:32,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:04:32,729 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:04:32,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-28 21:04:34,656 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=141720.0, ans=0.125 2023-09-28 21:04:35,916 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:04:43,288 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=141786.66666666666, ans=0.125 2023-09-28 21:04:46,694 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 21:04:46,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:04:48,247 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-28 21:04:52,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:04:52,042 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:04:53,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:04:58,063 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:05:02,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:05:04,395 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=141853.33333333334, ans=0.125 2023-09-28 21:05:07,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-28 21:05:10,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-28 21:05:10,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:05:10,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:05:11,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:05:11,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:05:13,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-28 21:05:17,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:05:19,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:05:23,088 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-28 21:05:27,440 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-28 21:05:28,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:05:31,967 INFO [train.py:1039] (2/4) Epoch 5, batch 50, loss[loss=0.2341, simple_loss=0.2932, pruned_loss=0.08753, over 23457.00 frames. ], tot_loss[loss=0.2561, simple_loss=0.3137, pruned_loss=0.09926, over 1079707.87 frames. ], batch size: 134, lr: 2.10e-02, grad_scale: 32.0 2023-09-28 21:05:32,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:05:35,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:05:36,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-28 21:05:36,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 21:05:36,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:05:39,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:05:42,553 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:05:45,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:05:48,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-28 21:05:48,691 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:05:55,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:05:57,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-28 21:05:59,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-28 21:06:02,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:06:02,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:06:02,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:06:02,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:06:04,580 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=142120.0, ans=0.125 2023-09-28 21:06:05,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-28 21:06:05,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 21:06:05,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:06:12,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:06:13,528 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-28 21:06:14,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 21:06:15,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-28 21:06:18,036 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:06:19,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 21:06:19,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-28 21:06:20,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:06:22,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-28 21:06:28,172 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=142186.66666666666, ans=0.125 2023-09-28 21:06:29,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:06:29,416 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:06:31,347 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.771e+02 2.197e+02 2.413e+02 2.834e+02 4.473e+02, threshold=4.826e+02, percent-clipped=0.0 2023-09-28 21:06:31,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:06:31,745 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=142186.66666666666, ans=0.125 2023-09-28 21:06:33,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:06:33,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-28 21:06:36,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-28 21:06:36,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-28 21:06:38,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:06:39,906 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-28 21:06:41,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:06:42,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:06:42,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-28 21:06:43,597 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys.whitening_limit, batch_count=142253.33333333334, ans=6.0 2023-09-28 21:06:44,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-28 21:06:44,486 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-28 21:06:47,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:06:47,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:06:47,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-28 21:06:48,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-28 21:06:50,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:06:50,580 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-28 21:06:52,033 INFO [train.py:1039] (2/4) Epoch 5, batch 100, loss[loss=0.3032, simple_loss=0.3439, pruned_loss=0.1313, over 23419.00 frames. ], tot_loss[loss=0.2558, simple_loss=0.3134, pruned_loss=0.09914, over 1885451.39 frames. ], batch size: 285, lr: 2.10e-02, grad_scale: 32.0 2023-09-28 21:06:52,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-28 21:06:53,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:06:55,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:06:57,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:07:01,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:07:05,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-28 21:07:05,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:07:12,641 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:07:12,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:07:12,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-28 21:07:12,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:07:12,740 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:07:12,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=142386.66666666666, ans=0.125 2023-09-28 21:07:15,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-28 21:07:17,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-28 21:07:17,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:07:17,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:07:17,479 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:07:22,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-28 21:07:22,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:07:23,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:07:23,825 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-28 21:07:24,089 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=142453.33333333334, ans=0.125 2023-09-28 21:07:26,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 21:07:30,000 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-28 21:07:30,036 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-28 21:07:30,718 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=13.77 vs. limit=15.0 2023-09-28 21:07:31,702 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:07:31,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 21:07:36,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:07:39,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:07:39,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:07:47,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:07:47,446 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-28 21:07:49,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-28 21:07:52,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:07:52,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:07:52,607 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=142520.0, ans=0.125 2023-09-28 21:07:53,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:07:54,280 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=142520.0, ans=0.125 2023-09-28 21:07:58,403 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:08:01,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:08:03,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:08:06,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:08:06,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:08:09,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:08:09,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:08:09,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:08:09,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-28 21:08:11,191 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-28 21:08:11,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:08:11,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 21:08:12,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:08:12,797 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:08:12,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 21:08:12,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 21:08:14,818 INFO [train.py:1039] (2/4) Epoch 5, batch 150, loss[loss=0.2371, simple_loss=0.317, pruned_loss=0.07857, over 24665.00 frames. ], tot_loss[loss=0.2587, simple_loss=0.3159, pruned_loss=0.1007, over 2513137.35 frames. ], batch size: 73, lr: 2.10e-02, grad_scale: 32.0 2023-09-28 21:08:14,913 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-28 21:08:14,925 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:08:15,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:08:16,418 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:08:16,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:08:18,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:08:18,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:08:23,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:08:23,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:08:23,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:08:26,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:08:28,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:08:29,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:08:30,235 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=142720.0, ans=0.0 2023-09-28 21:08:31,371 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:08:35,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-28 21:08:35,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-28 21:08:35,779 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-28 21:08:37,901 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.55 vs. limit=22.5 2023-09-28 21:08:38,778 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:08:38,786 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 21:08:39,140 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=142720.0, ans=0.125 2023-09-28 21:08:40,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:08:42,364 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:08:42,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:08:42,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:08:42,650 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=142720.0, ans=0.0 2023-09-28 21:08:43,791 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:08:43,969 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-28 21:08:44,185 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=142720.0, ans=0.125 2023-09-28 21:08:47,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:08:52,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:08:56,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 21:08:57,861 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-28 21:09:00,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:09:01,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:09:02,405 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:09:04,151 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=142853.33333333334, ans=0.0 2023-09-28 21:09:05,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:09:07,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:09:08,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:09:08,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:09:08,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-28 21:09:10,429 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=142853.33333333334, ans=0.125 2023-09-28 21:09:12,049 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=142853.33333333334, ans=0.125 2023-09-28 21:09:14,563 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.662e+02 2.253e+02 2.610e+02 3.187e+02 7.657e+02, threshold=5.219e+02, percent-clipped=8.0 2023-09-28 21:09:14,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:09:14,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:09:14,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:09:14,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:09:18,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:09:19,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 21:09:22,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-28 21:09:23,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 21:09:25,327 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:09:27,646 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:09:27,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-28 21:09:29,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:09:29,068 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-28 21:09:32,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:09:32,689 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.49 vs. limit=12.0 2023-09-28 21:09:36,804 INFO [train.py:1039] (2/4) Epoch 5, batch 200, loss[loss=0.2511, simple_loss=0.3151, pruned_loss=0.09358, over 24074.00 frames. ], tot_loss[loss=0.2604, simple_loss=0.317, pruned_loss=0.1019, over 3000573.31 frames. ], batch size: 80, lr: 2.10e-02, grad_scale: 32.0 2023-09-28 21:09:36,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:09:38,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:09:40,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-28 21:09:41,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:09:41,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:09:43,591 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-28 21:09:46,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-28 21:09:48,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:09:48,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:09:50,644 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=142986.66666666666, ans=0.125 2023-09-28 21:09:53,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:09:53,373 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:09:53,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:10:02,680 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=143053.33333333334, ans=0.125 2023-09-28 21:10:09,038 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=143120.0, ans=0.2 2023-09-28 21:10:12,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:10:13,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:10:15,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:10:16,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:10:17,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 21:10:17,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 21:10:20,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:10:22,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:10:22,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:10:22,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:10:24,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-28 21:10:25,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 21:10:25,655 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:10:29,834 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten.whitening_limit, batch_count=143186.66666666666, ans=22.5 2023-09-28 21:10:31,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:10:34,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:10:43,604 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:10:43,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:10:52,690 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:10:54,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-28 21:10:55,760 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:10:55,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:10:55,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:10:57,224 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 21:10:59,058 INFO [train.py:1039] (2/4) Epoch 5, batch 250, loss[loss=0.2607, simple_loss=0.3009, pruned_loss=0.1103, over 23563.00 frames. ], tot_loss[loss=0.2587, simple_loss=0.3152, pruned_loss=0.1011, over 3377273.61 frames. ], batch size: 256, lr: 2.10e-02, grad_scale: 32.0 2023-09-28 21:10:59,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-28 21:11:00,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:11:02,066 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-28 21:11:03,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:11:07,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:11:07,374 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:11:08,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:11:10,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:11:10,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:11:11,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:11:16,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:11:29,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:11:31,130 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:11:32,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:11:37,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-28 21:11:39,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-28 21:11:39,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:11:39,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:11:41,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 21:11:41,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:11:41,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:11:44,817 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:11:45,688 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=6.48 vs. limit=15.0 2023-09-28 21:11:47,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-28 21:11:47,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:11:49,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:11:49,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:11:49,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:11:51,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 21:11:53,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:11:53,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:11:56,854 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:11:57,017 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:11:57,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:11:59,958 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.784e+02 2.236e+02 2.772e+02 3.274e+02 8.100e+02, threshold=5.544e+02, percent-clipped=4.0 2023-09-28 21:12:01,648 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:12:04,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:12:05,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=143586.66666666666, ans=0.0 2023-09-28 21:12:08,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:12:16,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:12:16,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:12:19,950 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-28 21:12:21,359 INFO [train.py:1039] (2/4) Epoch 5, batch 300, loss[loss=0.2676, simple_loss=0.3062, pruned_loss=0.1145, over 23753.00 frames. ], tot_loss[loss=0.2566, simple_loss=0.3125, pruned_loss=0.1004, over 3654214.89 frames. ], batch size: 179, lr: 2.09e-02, grad_scale: 32.0 2023-09-28 21:12:21,469 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:12:21,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 21:12:23,683 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.42 vs. limit=12.0 2023-09-28 21:12:24,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-28 21:12:24,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-28 21:12:26,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:12:26,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-28 21:12:30,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:12:33,313 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:12:36,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:12:38,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-28 21:12:39,722 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:12:41,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 21:12:41,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-28 21:12:41,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:12:45,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-28 21:12:51,999 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 21:12:52,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-28 21:12:55,279 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-28 21:12:55,343 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:12:58,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:12:59,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:12:59,861 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-28 21:12:59,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:13:02,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:13:03,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:13:05,627 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:13:10,184 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-28 21:13:10,203 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-28 21:13:11,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:13:14,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:13:14,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-28 21:13:17,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:13:20,459 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:13:23,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:13:23,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-28 21:13:28,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:13:28,153 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 21:13:31,306 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:13:31,466 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:13:33,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-28 21:13:33,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 21:13:33,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:13:34,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-28 21:13:36,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:13:38,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:13:38,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:13:38,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:13:40,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:13:44,716 INFO [train.py:1039] (2/4) Epoch 5, batch 350, loss[loss=0.2495, simple_loss=0.2974, pruned_loss=0.1007, over 23884.00 frames. ], tot_loss[loss=0.2533, simple_loss=0.3102, pruned_loss=0.09824, over 3895762.92 frames. ], batch size: 195, lr: 2.09e-02, grad_scale: 32.0 2023-09-28 21:13:44,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:13:44,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 21:13:46,709 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=143986.66666666666, ans=0.025 2023-09-28 21:13:50,178 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:13:56,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:13:56,672 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=143986.66666666666, ans=0.0 2023-09-28 21:13:59,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:13:59,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:14:04,030 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-28 21:14:05,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:14:05,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-28 21:14:08,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:14:08,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-28 21:14:10,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:14:14,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-28 21:14:15,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:14:17,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:14:17,414 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=144120.0, ans=0.1 2023-09-28 21:14:18,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:14:20,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:14:20,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:14:21,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:14:21,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:14:21,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:14:23,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:14:23,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:14:27,933 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=144120.0, ans=0.125 2023-09-28 21:14:29,523 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=144120.0, ans=0.2 2023-09-28 21:14:30,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:14:30,904 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-28 21:14:32,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:14:32,406 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:14:38,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-28 21:14:38,572 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:14:40,676 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.22 vs. limit=10.0 2023-09-28 21:14:41,909 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=144186.66666666666, ans=0.0 2023-09-28 21:14:44,462 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.897e+02 2.143e+02 2.367e+02 2.704e+02 4.411e+02, threshold=4.734e+02, percent-clipped=0.0 2023-09-28 21:14:44,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:14:44,775 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:14:44,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:14:48,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-28 21:14:48,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:14:50,727 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-28 21:14:52,229 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-28 21:14:52,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:14:55,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:14:55,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-28 21:14:57,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:15:02,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:15:04,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:15:04,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:15:04,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:15:05,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:15:07,267 INFO [train.py:1039] (2/4) Epoch 5, batch 400, loss[loss=0.2318, simple_loss=0.2885, pruned_loss=0.0875, over 24278.00 frames. ], tot_loss[loss=0.2514, simple_loss=0.3089, pruned_loss=0.09694, over 4085300.74 frames. ], batch size: 56, lr: 2.09e-02, grad_scale: 32.0 2023-09-28 21:15:08,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:15:11,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:15:13,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-28 21:15:13,264 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:15:14,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:15:14,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:15:16,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:15:18,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:15:20,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:15:23,914 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-28 21:15:26,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-28 21:15:26,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:15:27,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-28 21:15:28,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:15:33,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:15:33,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:15:33,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-28 21:15:33,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:15:33,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:15:33,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:15:35,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:15:37,175 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-28 21:15:38,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-28 21:15:43,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:15:43,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:15:45,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-28 21:15:46,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-28 21:15:47,030 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=144453.33333333334, ans=0.125 2023-09-28 21:15:49,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:15:52,654 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:16:00,922 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-28 21:16:04,605 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-28 21:16:06,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-28 21:16:08,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:16:09,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:16:10,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-28 21:16:13,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:16:13,755 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=144586.66666666666, ans=0.0 2023-09-28 21:16:16,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 21:16:18,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:16:20,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:16:21,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-28 21:16:21,369 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 21:16:24,132 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-28 21:16:24,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-28 21:16:25,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:16:25,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:16:28,812 INFO [train.py:1039] (2/4) Epoch 5, batch 450, loss[loss=0.2305, simple_loss=0.2966, pruned_loss=0.08215, over 24639.00 frames. ], tot_loss[loss=0.2522, simple_loss=0.3099, pruned_loss=0.09721, over 4221286.46 frames. ], batch size: 65, lr: 2.09e-02, grad_scale: 32.0 2023-09-28 21:16:28,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-28 21:16:32,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 21:16:32,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:16:33,991 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-28 21:16:34,313 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=144653.33333333334, ans=0.125 2023-09-28 21:16:36,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-28 21:16:36,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:16:37,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:16:38,025 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=144653.33333333334, ans=0.125 2023-09-28 21:16:39,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:16:39,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-28 21:16:39,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:16:39,991 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.25 vs. limit=15.0 2023-09-28 21:16:40,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 21:16:44,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 21:16:46,147 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 21:16:50,109 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=144720.0, ans=0.04949747468305833 2023-09-28 21:16:53,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:16:54,356 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:16:55,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-28 21:16:56,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-28 21:16:56,274 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=144720.0, ans=0.125 2023-09-28 21:16:59,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-28 21:17:02,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:17:05,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:17:08,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:17:11,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:17:14,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-28 21:17:14,292 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=144786.66666666666, ans=0.125 2023-09-28 21:17:15,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-28 21:17:17,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-28 21:17:17,156 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:17:19,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:17:19,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:17:21,567 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-28 21:17:22,906 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-28 21:17:22,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:17:24,816 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=144853.33333333334, ans=0.125 2023-09-28 21:17:25,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:17:25,954 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-28 21:17:30,457 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.848e+02 2.241e+02 2.627e+02 3.194e+02 6.560e+02, threshold=5.254e+02, percent-clipped=4.0 2023-09-28 21:17:30,560 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-28 21:17:30,638 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:17:30,842 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=144853.33333333334, ans=0.2 2023-09-28 21:17:32,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-28 21:17:32,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-28 21:17:35,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:17:36,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-28 21:17:36,877 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 21:17:38,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-28 21:17:41,785 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=144920.0, ans=0.125 2023-09-28 21:17:43,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:17:43,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-28 21:17:45,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-28 21:17:47,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:17:51,509 INFO [train.py:1039] (2/4) Epoch 5, batch 500, loss[loss=0.271, simple_loss=0.3085, pruned_loss=0.1168, over 23786.00 frames. ], tot_loss[loss=0.2546, simple_loss=0.3118, pruned_loss=0.09863, over 4329953.05 frames. ], batch size: 212, lr: 2.08e-02, grad_scale: 32.0 2023-09-28 21:17:53,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:17:55,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:17:56,988 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:17:57,022 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-28 21:17:59,229 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.35 vs. limit=10.0 2023-09-28 21:18:00,533 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=144986.66666666666, ans=0.0 2023-09-28 21:18:01,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:18:01,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:18:01,880 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:18:01,895 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-28 21:18:03,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-28 21:18:03,459 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:18:06,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 21:18:11,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 21:18:11,210 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-28 21:18:12,906 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:18:14,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:18:14,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:18:16,198 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=145053.33333333334, ans=0.125 2023-09-28 21:18:25,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:18:25,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-28 21:18:26,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-28 21:18:27,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:18:27,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-28 21:18:27,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 21:18:27,866 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.whiten.whitening_limit, batch_count=145120.0, ans=12.0 2023-09-28 21:18:32,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:18:33,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:18:33,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:18:33,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:18:33,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-28 21:18:38,232 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-28 21:18:39,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:18:40,722 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.73 vs. limit=15.0 2023-09-28 21:18:41,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:18:43,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:18:43,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:18:43,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-28 21:18:46,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-28 21:18:47,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:18:51,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:18:54,151 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=145186.66666666666, ans=0.07 2023-09-28 21:18:55,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:18:57,414 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=145253.33333333334, ans=0.0 2023-09-28 21:18:58,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:19:05,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:19:07,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-28 21:19:07,874 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=145253.33333333334, ans=0.1 2023-09-28 21:19:09,022 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:19:09,042 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:19:12,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-28 21:19:12,185 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-28 21:19:15,000 INFO [train.py:1039] (2/4) Epoch 5, batch 550, loss[loss=0.2604, simple_loss=0.3331, pruned_loss=0.09381, over 24331.00 frames. ], tot_loss[loss=0.2572, simple_loss=0.3142, pruned_loss=0.1001, over 4413532.22 frames. ], batch size: 74, lr: 2.08e-02, grad_scale: 32.0 2023-09-28 21:19:15,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:19:17,182 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=145320.0, ans=0.1 2023-09-28 21:19:20,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-28 21:19:20,314 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=145320.0, ans=0.0 2023-09-28 21:19:21,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-28 21:19:23,061 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:19:23,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-28 21:19:23,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:19:23,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:19:24,683 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:19:24,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:19:24,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:19:26,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:19:28,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:19:30,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-28 21:19:30,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:19:35,208 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:19:35,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:19:38,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:19:40,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:19:44,932 WARNING [train.py:1197] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-28 21:19:46,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-28 21:19:47,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:19:52,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:19:52,762 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:19:54,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-28 21:20:01,016 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:20:01,025 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-28 21:20:01,155 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:20:03,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 21:20:06,426 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:20:06,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 21:20:06,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-28 21:20:08,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:20:10,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-28 21:20:11,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-28 21:20:11,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:20:11,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:20:13,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:20:13,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:20:16,497 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.773e+02 2.228e+02 2.515e+02 3.038e+02 5.618e+02, threshold=5.030e+02, percent-clipped=1.0 2023-09-28 21:20:16,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:20:16,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:20:19,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:20:21,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:20:21,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 21:20:21,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=145586.66666666666, ans=0.125 2023-09-28 21:20:22,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 21:20:24,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:20:24,606 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:20:26,130 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:20:26,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-28 21:20:27,771 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-28 21:20:28,231 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 21:20:33,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-28 21:20:38,078 INFO [train.py:1039] (2/4) Epoch 5, batch 600, loss[loss=0.3479, simple_loss=0.3692, pruned_loss=0.1633, over 20044.00 frames. ], tot_loss[loss=0.2568, simple_loss=0.3138, pruned_loss=0.09989, over 4483105.82 frames. ], batch size: 388, lr: 2.08e-02, grad_scale: 16.0 2023-09-28 21:20:38,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-28 21:20:39,817 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:20:39,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 21:20:41,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:20:46,172 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=145653.33333333334, ans=0.125 2023-09-28 21:20:46,895 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.57 vs. limit=22.5 2023-09-28 21:20:48,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:20:50,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 21:20:51,326 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.12 vs. limit=10.0 2023-09-28 21:20:52,136 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-28 21:20:55,132 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:20:55,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:20:58,343 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:21:01,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-28 21:21:02,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:21:07,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-28 21:21:09,859 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=145786.66666666666, ans=0.125 2023-09-28 21:21:11,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:21:12,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:21:13,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:21:19,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:21:19,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:21:19,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:21:20,757 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=145786.66666666666, ans=0.0 2023-09-28 21:21:26,581 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:21:31,187 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:21:31,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:21:31,206 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:21:31,967 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.42 vs. limit=15.0 2023-09-28 21:21:35,124 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.13 vs. limit=15.0 2023-09-28 21:21:39,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-28 21:21:44,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-28 21:21:44,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:21:49,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-28 21:21:49,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:21:51,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-28 21:21:51,427 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:21:51,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:21:58,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 21:21:59,502 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-28 21:21:59,817 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=145986.66666666666, ans=0.0 2023-09-28 21:22:00,853 INFO [train.py:1039] (2/4) Epoch 5, batch 650, loss[loss=0.25, simple_loss=0.32, pruned_loss=0.09, over 24323.00 frames. ], tot_loss[loss=0.2561, simple_loss=0.3126, pruned_loss=0.09978, over 4518409.15 frames. ], batch size: 74, lr: 2.08e-02, grad_scale: 16.0 2023-09-28 21:22:01,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:22:02,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:22:05,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:22:07,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-28 21:22:08,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:22:13,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:22:13,485 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:22:15,938 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=145986.66666666666, ans=0.125 2023-09-28 21:22:17,259 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:22:20,523 WARNING [train.py:1197] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-28 21:22:24,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:22:25,715 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:22:30,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:22:30,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 21:22:33,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:22:33,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:22:34,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 21:22:36,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:22:38,064 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:22:41,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:22:41,050 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-28 21:22:41,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:22:41,107 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:22:44,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:22:44,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:22:46,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:22:46,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:22:47,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-28 21:22:50,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:22:50,200 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=146186.66666666666, ans=0.125 2023-09-28 21:22:50,857 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.42 vs. limit=15.0 2023-09-28 21:22:51,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:22:53,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-28 21:22:53,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:22:53,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 21:22:53,831 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=146186.66666666666, ans=0.125 2023-09-28 21:22:55,202 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-28 21:22:56,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-28 21:22:58,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:22:58,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:22:58,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:22:58,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:23:01,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:23:04,874 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.759e+02 2.282e+02 2.474e+02 2.887e+02 4.172e+02, threshold=4.947e+02, percent-clipped=0.0 2023-09-28 21:23:06,487 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:23:06,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:23:08,275 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:23:11,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:23:11,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 21:23:12,934 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:23:18,752 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=146253.33333333334, ans=0.1 2023-09-28 21:23:19,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 21:23:19,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:23:19,885 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:23:19,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:23:24,885 INFO [train.py:1039] (2/4) Epoch 5, batch 700, loss[loss=0.2245, simple_loss=0.2882, pruned_loss=0.08039, over 24487.00 frames. ], tot_loss[loss=0.2532, simple_loss=0.3102, pruned_loss=0.09817, over 4562222.11 frames. ], batch size: 63, lr: 2.08e-02, grad_scale: 16.0 2023-09-28 21:23:26,035 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=146320.0, ans=0.1 2023-09-28 21:23:27,051 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-28 21:23:27,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-28 21:23:30,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-28 21:23:31,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:23:33,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:23:33,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-28 21:23:39,962 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:23:42,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:23:44,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:23:46,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-28 21:23:46,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:23:49,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:23:52,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 21:23:52,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:23:55,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-28 21:24:00,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-28 21:24:05,473 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-28 21:24:05,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:24:07,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-28 21:24:10,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:24:12,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-28 21:24:15,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:24:15,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 21:24:15,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-28 21:24:21,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:24:23,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:24:25,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:24:30,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:24:31,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-28 21:24:37,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-28 21:24:38,514 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-28 21:24:38,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:24:41,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:24:42,026 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:24:44,323 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:24:44,333 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-28 21:24:47,184 INFO [train.py:1039] (2/4) Epoch 5, batch 750, loss[loss=0.2604, simple_loss=0.3122, pruned_loss=0.1043, over 23691.00 frames. ], tot_loss[loss=0.2515, simple_loss=0.3088, pruned_loss=0.0971, over 4603439.02 frames. ], batch size: 232, lr: 2.07e-02, grad_scale: 16.0 2023-09-28 21:24:48,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-28 21:24:48,930 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-28 21:24:48,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-28 21:24:50,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-28 21:24:52,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-28 21:24:52,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:24:53,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-28 21:24:55,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:24:55,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-28 21:24:55,508 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=146653.33333333334, ans=0.0 2023-09-28 21:24:57,399 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.54 vs. limit=10.0 2023-09-28 21:24:58,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:25:00,562 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:25:02,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-28 21:25:02,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:25:04,044 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:25:05,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 21:25:08,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:25:12,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:25:14,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:25:14,490 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-28 21:25:14,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-28 21:25:17,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:25:19,183 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:25:19,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-28 21:25:21,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-28 21:25:21,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:25:23,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-28 21:25:24,860 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-28 21:25:24,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-28 21:25:24,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:25:25,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 21:25:28,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:25:34,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-28 21:25:34,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:25:34,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 21:25:38,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:25:39,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:25:39,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-28 21:25:41,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:25:42,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-28 21:25:43,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:25:47,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:25:47,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-28 21:25:47,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:25:51,067 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.840e+02 2.300e+02 2.781e+02 3.196e+02 5.681e+02, threshold=5.563e+02, percent-clipped=1.0 2023-09-28 21:25:51,613 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=146853.33333333334, ans=0.125 2023-09-28 21:25:52,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:25:55,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:25:55,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:25:58,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 21:26:01,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-28 21:26:01,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:26:02,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:26:02,879 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=146920.0, ans=0.1 2023-09-28 21:26:07,209 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:26:07,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:26:10,171 INFO [train.py:1039] (2/4) Epoch 5, batch 800, loss[loss=0.2383, simple_loss=0.3072, pruned_loss=0.0847, over 24511.00 frames. ], tot_loss[loss=0.2522, simple_loss=0.3095, pruned_loss=0.09749, over 4633947.86 frames. ], batch size: 66, lr: 2.07e-02, grad_scale: 32.0 2023-09-28 21:26:10,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:26:12,304 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-28 21:26:15,832 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=146986.66666666666, ans=0.125 2023-09-28 21:26:18,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:26:18,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:26:22,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:26:22,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:26:23,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:26:23,841 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:26:24,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:26:27,970 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=147053.33333333334, ans=0.1 2023-09-28 21:26:29,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:26:31,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 21:26:34,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-28 21:26:35,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:26:37,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:26:37,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:26:37,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:26:38,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-28 21:26:38,880 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:26:38,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-28 21:26:42,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:26:43,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:26:45,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:26:47,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:26:50,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:26:50,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:26:50,657 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=147120.0, ans=0.1 2023-09-28 21:26:56,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:26:56,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:26:56,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-28 21:26:58,607 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-28 21:26:58,662 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-28 21:26:58,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:27:00,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:27:01,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:27:01,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:27:07,098 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-28 21:27:07,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-28 21:27:08,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-28 21:27:09,710 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.53 vs. limit=15.0 2023-09-28 21:27:10,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 21:27:13,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:27:18,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:27:18,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-28 21:27:19,771 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:27:22,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-28 21:27:24,377 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.73 vs. limit=15.0 2023-09-28 21:27:31,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 21:27:31,750 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=147320.0, ans=0.0 2023-09-28 21:27:32,898 INFO [train.py:1039] (2/4) Epoch 5, batch 850, loss[loss=0.2529, simple_loss=0.329, pruned_loss=0.08843, over 24506.00 frames. ], tot_loss[loss=0.2531, simple_loss=0.3103, pruned_loss=0.09798, over 4655428.39 frames. ], batch size: 69, lr: 2.07e-02, grad_scale: 32.0 2023-09-28 21:27:33,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:27:34,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-28 21:27:34,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:27:36,298 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:27:38,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-28 21:27:38,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:27:40,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:27:42,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:27:43,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 21:27:44,071 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=147320.0, ans=0.125 2023-09-28 21:27:45,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:27:46,678 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-28 21:27:46,742 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-28 21:27:46,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-28 21:27:47,767 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=14.19 vs. limit=15.0 2023-09-28 21:27:48,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 21:27:48,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:27:51,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:27:51,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:27:52,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 21:27:58,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:27:58,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:27:58,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-28 21:28:03,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-28 21:28:03,674 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.19 vs. limit=15.0 2023-09-28 21:28:06,275 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:28:07,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-28 21:28:11,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-28 21:28:12,523 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-28 21:28:14,757 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-28 21:28:14,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:28:14,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:28:14,827 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 21:28:18,300 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:28:19,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:28:19,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-28 21:28:23,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:28:24,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:28:24,586 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:28:24,618 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-28 21:28:26,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:28:28,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-28 21:28:28,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-28 21:28:34,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:28:34,927 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:28:35,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:28:36,310 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.721e+02 2.245e+02 2.598e+02 3.142e+02 5.686e+02, threshold=5.195e+02, percent-clipped=1.0 2023-09-28 21:28:36,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:28:36,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:28:38,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:28:40,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:28:41,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-28 21:28:41,838 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=147586.66666666666, ans=0.0 2023-09-28 21:28:42,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:28:43,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-28 21:28:51,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-28 21:28:53,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:28:53,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-28 21:28:54,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:28:54,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:28:56,152 INFO [train.py:1039] (2/4) Epoch 5, batch 900, loss[loss=0.2124, simple_loss=0.2759, pruned_loss=0.07445, over 24312.00 frames. ], tot_loss[loss=0.2535, simple_loss=0.3107, pruned_loss=0.09816, over 4667548.51 frames. ], batch size: 56, lr: 2.07e-02, grad_scale: 32.0 2023-09-28 21:28:57,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-28 21:29:05,652 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:29:07,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:29:07,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-28 21:29:09,177 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=147653.33333333334, ans=0.0 2023-09-28 21:29:11,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:29:12,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-28 21:29:12,674 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-28 21:29:14,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:29:14,081 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:29:14,153 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 21:29:15,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:29:26,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:29:26,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:29:28,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 21:29:31,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:29:33,836 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=147786.66666666666, ans=0.0 2023-09-28 21:29:37,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-28 21:29:38,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:29:42,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-28 21:29:42,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-28 21:29:42,238 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-28 21:29:43,639 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-28 21:29:52,284 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-28 21:29:52,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:29:52,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 21:29:54,360 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=147853.33333333334, ans=0.125 2023-09-28 21:29:59,990 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:30:00,015 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:30:01,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-28 21:30:01,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:30:05,103 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-28 21:30:08,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:30:08,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:30:10,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:30:10,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:30:15,007 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-28 21:30:15,069 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-28 21:30:16,658 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-28 21:30:16,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-28 21:30:18,139 INFO [train.py:1039] (2/4) Epoch 5, batch 950, loss[loss=0.2609, simple_loss=0.3236, pruned_loss=0.09905, over 23711.00 frames. ], tot_loss[loss=0.2529, simple_loss=0.311, pruned_loss=0.09745, over 4689274.60 frames. ], batch size: 85, lr: 2.07e-02, grad_scale: 32.0 2023-09-28 21:30:19,849 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:30:20,181 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=147986.66666666666, ans=0.2 2023-09-28 21:30:25,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-28 21:30:26,255 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.25 vs. limit=15.0 2023-09-28 21:30:27,690 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=147986.66666666666, ans=0.125 2023-09-28 21:30:31,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:30:33,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:30:33,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:30:33,902 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=148053.33333333334, ans=0.125 2023-09-28 21:30:35,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 21:30:35,346 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-28 21:30:38,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:30:40,725 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:30:40,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:30:40,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:30:42,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-28 21:30:44,288 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-28 21:30:45,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:30:47,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-28 21:30:47,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:30:50,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:30:50,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:30:51,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:30:53,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-28 21:30:56,917 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 21:30:57,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:31:00,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:31:04,259 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:31:04,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:31:07,303 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-28 21:31:08,930 WARNING [train.py:1197] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 21:31:08,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 21:31:10,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:31:11,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:31:11,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:31:17,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-28 21:31:19,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:31:21,012 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=148186.66666666666, ans=0.1 2023-09-28 21:31:21,889 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.669e+02 2.107e+02 2.418e+02 2.816e+02 4.980e+02, threshold=4.836e+02, percent-clipped=0.0 2023-09-28 21:31:22,054 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:31:22,149 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:31:22,176 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-28 21:31:22,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:31:22,201 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:31:23,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-28 21:31:24,008 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=148253.33333333334, ans=0.025 2023-09-28 21:31:25,634 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=148253.33333333334, ans=0.125 2023-09-28 21:31:27,156 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=148253.33333333334, ans=0.0 2023-09-28 21:31:28,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:31:30,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:31:30,835 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=148253.33333333334, ans=0.05 2023-09-28 21:31:35,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:31:36,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-28 21:31:36,944 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-28 21:31:41,331 INFO [train.py:1039] (2/4) Epoch 5, batch 1000, loss[loss=0.2414, simple_loss=0.3094, pruned_loss=0.08675, over 24498.00 frames. ], tot_loss[loss=0.2526, simple_loss=0.3103, pruned_loss=0.09745, over 4693855.68 frames. ], batch size: 69, lr: 2.06e-02, grad_scale: 32.0 2023-09-28 21:31:41,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:31:45,251 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-28 21:31:45,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:31:50,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:31:52,116 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-28 21:31:52,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-28 21:31:53,774 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=148320.0, ans=0.07 2023-09-28 21:32:00,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:32:00,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:32:02,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:32:04,649 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-28 21:32:06,633 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=148386.66666666666, ans=0.0 2023-09-28 21:32:08,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-28 21:32:10,134 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=148386.66666666666, ans=0.125 2023-09-28 21:32:11,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-28 21:32:11,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:32:11,548 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=148386.66666666666, ans=0.125 2023-09-28 21:32:12,853 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-28 21:32:14,477 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-28 21:32:14,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-28 21:32:14,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:32:16,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:32:23,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:32:24,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:32:26,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:32:26,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:32:26,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-28 21:32:28,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:32:28,288 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:32:28,547 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=148453.33333333334, ans=0.125 2023-09-28 21:32:29,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:32:29,756 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-28 21:32:30,070 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=148520.0, ans=0.125 2023-09-28 21:32:34,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-28 21:32:34,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-28 21:32:37,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-28 21:32:39,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:32:39,885 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=148520.0, ans=0.2 2023-09-28 21:32:43,473 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.91 vs. limit=12.0 2023-09-28 21:32:47,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:32:47,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:32:47,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:32:49,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:32:50,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-28 21:32:53,081 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:32:53,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-28 21:32:54,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-28 21:32:56,120 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:32:56,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:32:59,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:33:02,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 21:33:03,571 INFO [train.py:1039] (2/4) Epoch 5, batch 1050, loss[loss=0.2677, simple_loss=0.3272, pruned_loss=0.1042, over 23456.00 frames. ], tot_loss[loss=0.2516, simple_loss=0.3094, pruned_loss=0.09687, over 4703737.29 frames. ], batch size: 93, lr: 2.06e-02, grad_scale: 32.0 2023-09-28 21:33:03,707 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:33:06,737 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.77 vs. limit=22.5 2023-09-28 21:33:07,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:33:09,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:33:10,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 21:33:12,813 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:33:14,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:33:16,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 21:33:18,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-28 21:33:21,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:33:21,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:33:21,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:33:24,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:33:25,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-28 21:33:26,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:33:28,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-28 21:33:29,660 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:33:29,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-28 21:33:29,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-28 21:33:34,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:33:36,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-28 21:33:36,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:33:37,057 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=11.29 vs. limit=15.0 2023-09-28 21:33:39,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-28 21:33:39,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-28 21:33:39,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:33:45,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-28 21:33:48,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-28 21:33:49,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:33:53,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 21:33:55,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-28 21:33:55,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:33:56,696 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:34:01,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:34:03,606 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=148853.33333333334, ans=0.125 2023-09-28 21:34:04,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-28 21:34:06,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-28 21:34:06,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-28 21:34:07,749 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.779e+02 2.205e+02 2.391e+02 2.864e+02 4.460e+02, threshold=4.781e+02, percent-clipped=0.0 2023-09-28 21:34:07,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:34:07,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 21:34:08,249 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=148853.33333333334, ans=0.2 2023-09-28 21:34:10,985 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-28 21:34:12,778 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=148920.0, ans=0.2 2023-09-28 21:34:14,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:34:17,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:34:17,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:34:17,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:34:17,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:34:21,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:34:21,200 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-28 21:34:23,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:34:23,336 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-28 21:34:23,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-28 21:34:24,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:34:26,561 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=148986.66666666666, ans=0.125 2023-09-28 21:34:27,641 INFO [train.py:1039] (2/4) Epoch 5, batch 1100, loss[loss=0.259, simple_loss=0.3112, pruned_loss=0.1034, over 23927.00 frames. ], tot_loss[loss=0.2522, simple_loss=0.3097, pruned_loss=0.09737, over 4713615.72 frames. ], batch size: 196, lr: 2.06e-02, grad_scale: 32.0 2023-09-28 21:34:29,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:34:34,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:34:40,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 21:34:41,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:34:41,807 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:34:43,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-28 21:34:44,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:34:47,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:34:49,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:34:51,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:34:52,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-28 21:34:55,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 21:34:56,605 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:34:56,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:34:58,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:34:59,812 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-28 21:35:01,601 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=149120.0, ans=10.0 2023-09-28 21:35:05,755 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:35:06,109 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=149120.0, ans=0.125 2023-09-28 21:35:08,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-28 21:35:11,130 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-28 21:35:11,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:35:15,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:35:15,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-28 21:35:15,854 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=149186.66666666666, ans=0.125 2023-09-28 21:35:17,074 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:35:17,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-28 21:35:18,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:35:18,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:35:18,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:35:20,217 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:35:20,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-28 21:35:20,529 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=149186.66666666666, ans=0.0 2023-09-28 21:35:26,866 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:35:26,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-28 21:35:29,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:35:33,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 21:35:35,801 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-28 21:35:35,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-28 21:35:37,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:35:40,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:35:41,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:35:41,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-28 21:35:43,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:35:45,398 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:35:45,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-28 21:35:45,604 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:35:47,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-28 21:35:48,544 INFO [train.py:1039] (2/4) Epoch 5, batch 1150, loss[loss=0.2816, simple_loss=0.3336, pruned_loss=0.1148, over 23278.00 frames. ], tot_loss[loss=0.253, simple_loss=0.3106, pruned_loss=0.09767, over 4713033.96 frames. ], batch size: 105, lr: 2.06e-02, grad_scale: 32.0 2023-09-28 21:35:48,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:35:48,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:35:50,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-28 21:35:52,151 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=149320.0, ans=0.1 2023-09-28 21:35:56,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:35:57,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:35:59,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:36:01,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:36:02,036 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-28 21:36:02,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:36:05,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-28 21:36:05,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:36:05,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 21:36:10,306 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=149386.66666666666, ans=0.125 2023-09-28 21:36:12,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-28 21:36:14,507 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:36:17,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:36:17,959 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=149386.66666666666, ans=0.125 2023-09-28 21:36:19,125 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:36:19,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-28 21:36:19,219 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-28 21:36:21,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:36:24,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-28 21:36:26,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:36:28,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:36:41,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:36:47,874 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:36:47,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-28 21:36:47,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:36:48,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:36:50,953 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.787e+02 2.166e+02 2.435e+02 2.809e+02 4.003e+02, threshold=4.871e+02, percent-clipped=0.0 2023-09-28 21:36:51,537 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=149520.0, ans=0.05 2023-09-28 21:36:52,862 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-28 21:36:54,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:36:59,761 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=149586.66666666666, ans=0.1 2023-09-28 21:37:04,043 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-28 21:37:07,736 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:37:07,896 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:37:09,314 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-28 21:37:09,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:37:10,903 INFO [train.py:1039] (2/4) Epoch 5, batch 1200, loss[loss=0.2619, simple_loss=0.3134, pruned_loss=0.1052, over 23624.00 frames. ], tot_loss[loss=0.255, simple_loss=0.3118, pruned_loss=0.09908, over 4696358.97 frames. ], batch size: 232, lr: 2.06e-02, grad_scale: 32.0 2023-09-28 21:37:13,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:37:18,282 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=149653.33333333334, ans=0.1 2023-09-28 21:37:20,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:37:20,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-28 21:37:22,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:37:22,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:37:22,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:37:24,545 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=149653.33333333334, ans=0.1 2023-09-28 21:37:25,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:37:28,680 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 21:37:30,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:37:30,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:37:31,957 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-28 21:37:34,995 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.63 vs. limit=15.0 2023-09-28 21:37:35,577 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-28 21:37:38,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:37:41,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:37:41,899 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=149786.66666666666, ans=0.125 2023-09-28 21:37:43,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:37:45,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=149786.66666666666, ans=0.125 2023-09-28 21:37:47,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:37:47,354 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-28 21:37:48,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:37:55,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-28 21:37:55,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:37:55,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-28 21:37:57,186 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:38:00,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-28 21:38:04,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-28 21:38:04,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:38:06,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:38:06,478 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=149853.33333333334, ans=0.95 2023-09-28 21:38:09,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:38:09,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-28 21:38:11,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:38:11,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:38:12,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:38:13,071 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-28 21:38:14,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 21:38:14,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-28 21:38:14,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 21:38:18,248 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:38:18,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:38:23,396 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-28 21:38:24,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:38:28,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-28 21:38:31,827 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-28 21:38:33,134 INFO [train.py:1039] (2/4) Epoch 5, batch 1250, loss[loss=0.1943, simple_loss=0.2629, pruned_loss=0.06285, over 24299.00 frames. ], tot_loss[loss=0.2554, simple_loss=0.3127, pruned_loss=0.09903, over 4706700.27 frames. ], batch size: 56, lr: 2.05e-02, grad_scale: 32.0 2023-09-28 21:38:34,744 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:38:36,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-28 21:38:37,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:38:38,123 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=149986.66666666666, ans=0.125 2023-09-28 21:38:39,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:38:41,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-28 21:38:44,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:38:46,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:38:47,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-28 21:38:47,845 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=150053.33333333334, ans=0.0 2023-09-28 21:38:49,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:38:51,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 21:38:54,282 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=150053.33333333334, ans=0.0 2023-09-28 21:38:56,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 21:38:56,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:38:57,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 21:38:57,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:39:00,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-28 21:39:03,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 21:39:03,470 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-28 21:39:03,478 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:39:04,920 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:39:05,273 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=150120.0, ans=0.0 2023-09-28 21:39:06,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:39:09,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:39:12,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-28 21:39:16,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-28 21:39:17,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:39:20,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:39:20,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-28 21:39:20,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:39:20,939 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-28 21:39:22,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:39:22,314 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:39:26,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:39:29,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:39:29,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:39:31,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-28 21:39:31,355 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-28 21:39:32,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-28 21:39:36,088 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.883e+02 2.272e+02 2.528e+02 2.863e+02 4.623e+02, threshold=5.057e+02, percent-clipped=0.0 2023-09-28 21:39:36,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:39:37,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-28 21:39:37,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:39:39,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-28 21:39:40,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:39:41,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-28 21:39:41,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-28 21:39:42,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:39:42,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-28 21:39:42,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:39:46,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-28 21:39:47,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:39:48,221 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=150253.33333333334, ans=0.125 2023-09-28 21:39:50,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:39:52,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:39:53,874 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-28 21:39:55,811 INFO [train.py:1039] (2/4) Epoch 5, batch 1300, loss[loss=0.2482, simple_loss=0.2948, pruned_loss=0.1008, over 23608.00 frames. ], tot_loss[loss=0.2552, simple_loss=0.3125, pruned_loss=0.09897, over 4720846.91 frames. ], batch size: 256, lr: 2.05e-02, grad_scale: 32.0 2023-09-28 21:39:57,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:39:57,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-28 21:40:03,875 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:40:06,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-28 21:40:06,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:40:08,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:40:10,549 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:40:10,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-28 21:40:14,009 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=150386.66666666666, ans=0.125 2023-09-28 21:40:16,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:40:17,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-28 21:40:17,703 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.60 vs. limit=22.5 2023-09-28 21:40:18,747 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=150386.66666666666, ans=0.0 2023-09-28 21:40:20,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-28 21:40:22,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 21:40:26,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:40:27,058 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:40:30,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:40:30,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:40:32,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:40:32,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-28 21:40:34,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-28 21:40:39,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:40:39,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 21:40:42,484 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-28 21:40:42,582 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 21:40:45,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:40:48,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:40:48,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-28 21:40:48,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:40:48,553 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-28 21:40:51,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:40:51,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=150520.0, ans=0.125 2023-09-28 21:40:53,191 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:40:53,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:40:58,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-28 21:40:59,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-28 21:41:01,063 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-28 21:41:06,905 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:41:09,953 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-28 21:41:11,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:41:18,065 INFO [train.py:1039] (2/4) Epoch 5, batch 1350, loss[loss=0.2525, simple_loss=0.3164, pruned_loss=0.09428, over 24475.00 frames. ], tot_loss[loss=0.2535, simple_loss=0.3109, pruned_loss=0.09809, over 4725682.67 frames. ], batch size: 63, lr: 2.05e-02, grad_scale: 32.0 2023-09-28 21:41:18,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-28 21:41:21,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:41:21,902 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=150653.33333333334, ans=0.1 2023-09-28 21:41:22,041 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=150653.33333333334, ans=0.0 2023-09-28 21:41:24,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:41:27,889 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:41:27,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:41:31,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:41:31,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:41:35,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:41:38,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-28 21:41:41,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-28 21:41:41,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:41:41,980 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=150720.0, ans=0.2 2023-09-28 21:41:43,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-28 21:41:43,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:41:46,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:41:46,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-28 21:41:47,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-28 21:41:49,952 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=150786.66666666666, ans=0.0 2023-09-28 21:41:51,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-28 21:41:52,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:41:52,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-28 21:42:03,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:42:15,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:42:15,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:42:16,058 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-28 21:42:19,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:42:20,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-28 21:42:20,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-28 21:42:22,131 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.780e+02 2.340e+02 2.561e+02 2.889e+02 4.488e+02, threshold=5.123e+02, percent-clipped=0.0 2023-09-28 21:42:22,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:42:25,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:42:26,733 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.66 vs. limit=22.5 2023-09-28 21:42:27,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-28 21:42:30,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:42:35,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-28 21:42:36,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-28 21:42:37,129 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=150920.0, ans=0.0 2023-09-28 21:42:40,569 INFO [train.py:1039] (2/4) Epoch 5, batch 1400, loss[loss=0.2553, simple_loss=0.3026, pruned_loss=0.104, over 23810.00 frames. ], tot_loss[loss=0.2507, simple_loss=0.3085, pruned_loss=0.0965, over 4715007.92 frames. ], batch size: 179, lr: 2.05e-02, grad_scale: 16.0 2023-09-28 21:42:43,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-28 21:42:45,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:42:47,481 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:42:47,673 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=150986.66666666666, ans=0.125 2023-09-28 21:42:49,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:42:55,406 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-28 21:42:56,962 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-28 21:43:08,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:43:09,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:43:11,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:43:11,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-28 21:43:16,604 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:43:16,733 WARNING [train.py:1197] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 21:43:20,605 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=151120.0, ans=0.125 2023-09-28 21:43:24,587 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.80 vs. limit=15.0 2023-09-28 21:43:27,043 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:43:27,118 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:43:32,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-28 21:43:32,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-28 21:43:32,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:43:33,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:43:34,260 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=151186.66666666666, ans=0.125 2023-09-28 21:43:35,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:43:35,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:43:37,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:43:37,245 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:43:37,467 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=151186.66666666666, ans=0.125 2023-09-28 21:43:38,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-28 21:43:38,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:43:43,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:43:48,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-28 21:43:56,609 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-28 21:43:58,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 21:43:58,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:44:01,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 21:44:02,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:44:03,482 INFO [train.py:1039] (2/4) Epoch 5, batch 1450, loss[loss=0.215, simple_loss=0.2926, pruned_loss=0.06868, over 24647.00 frames. ], tot_loss[loss=0.2497, simple_loss=0.3077, pruned_loss=0.09581, over 4717202.19 frames. ], batch size: 68, lr: 2.05e-02, grad_scale: 16.0 2023-09-28 21:44:05,133 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:44:05,857 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.18 vs. limit=15.0 2023-09-28 21:44:08,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-28 21:44:10,142 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:44:10,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:44:10,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-28 21:44:14,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:44:16,194 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 21:44:16,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:44:17,766 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-28 21:44:19,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 21:44:19,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-28 21:44:19,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:44:20,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:44:20,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-28 21:44:24,921 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:44:25,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:44:25,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 21:44:26,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:44:26,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:44:26,839 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=151386.66666666666, ans=0.0 2023-09-28 21:44:29,414 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:44:31,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:44:31,776 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 21:44:34,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:44:34,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:44:37,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:44:37,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:44:40,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:44:41,016 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:44:41,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:44:41,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:44:44,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-28 21:44:47,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:44:52,389 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-28 21:44:53,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:44:56,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-28 21:44:57,564 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:44:59,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-28 21:45:04,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:45:06,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-28 21:45:07,700 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.862e+02 2.279e+02 2.648e+02 3.024e+02 3.849e+02, threshold=5.296e+02, percent-clipped=0.0 2023-09-28 21:45:07,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-28 21:45:09,425 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:45:11,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:45:12,676 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:45:14,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-28 21:45:14,584 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=151586.66666666666, ans=0.125 2023-09-28 21:45:17,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-28 21:45:17,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-28 21:45:19,425 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:45:20,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 21:45:25,525 INFO [train.py:1039] (2/4) Epoch 5, batch 1500, loss[loss=0.2593, simple_loss=0.3226, pruned_loss=0.09799, over 24021.00 frames. ], tot_loss[loss=0.2506, simple_loss=0.3083, pruned_loss=0.0964, over 4714518.36 frames. ], batch size: 80, lr: 2.04e-02, grad_scale: 16.0 2023-09-28 21:45:32,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-28 21:45:32,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-28 21:45:32,263 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-28 21:45:32,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:45:33,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:45:35,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:45:37,399 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-28 21:45:39,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 21:45:39,715 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 21:45:40,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-28 21:45:40,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:45:42,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:45:43,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:45:44,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:45:50,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:45:50,187 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-28 21:45:50,544 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=151720.0, ans=0.05 2023-09-28 21:45:51,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-28 21:45:51,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:45:51,904 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=151720.0, ans=0.125 2023-09-28 21:45:51,950 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=151720.0, ans=0.125 2023-09-28 21:45:53,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:45:55,561 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=151720.0, ans=0.07 2023-09-28 21:45:56,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-28 21:46:00,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-28 21:46:01,592 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:46:01,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-28 21:46:04,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-28 21:46:06,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:46:08,315 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:46:08,341 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:46:09,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-28 21:46:09,913 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:46:09,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:46:11,495 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-28 21:46:11,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:46:18,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:46:18,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-28 21:46:22,949 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:46:24,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 21:46:29,569 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-28 21:46:30,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:46:30,884 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-28 21:46:32,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:46:33,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:46:34,057 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-28 21:46:35,520 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-28 21:46:39,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-28 21:46:39,369 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=151920.0, ans=0.125 2023-09-28 21:46:40,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:46:44,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:46:44,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:46:44,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:46:45,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:46:45,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:46:47,872 INFO [train.py:1039] (2/4) Epoch 5, batch 1550, loss[loss=0.2799, simple_loss=0.3508, pruned_loss=0.1046, over 24334.00 frames. ], tot_loss[loss=0.2515, simple_loss=0.3097, pruned_loss=0.09666, over 4722827.28 frames. ], batch size: 77, lr: 2.04e-02, grad_scale: 16.0 2023-09-28 21:46:48,106 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-28 21:46:49,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-28 21:46:49,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:46:51,416 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-28 21:46:52,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-28 21:46:54,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:46:55,962 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:46:56,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:46:56,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:46:57,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:46:57,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:47:02,064 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-28 21:47:02,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:47:02,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:47:02,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 21:47:05,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-28 21:47:05,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-28 21:47:07,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:47:08,682 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-28 21:47:08,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-28 21:47:08,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-28 21:47:08,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:47:12,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:47:12,666 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=152053.33333333334, ans=0.04949747468305833 2023-09-28 21:47:17,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:47:17,617 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=152053.33333333334, ans=0.125 2023-09-28 21:47:20,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-28 21:47:20,359 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-28 21:47:23,103 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.11 vs. limit=15.0 2023-09-28 21:47:27,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:47:30,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:47:32,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:47:32,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:47:32,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-28 21:47:38,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 21:47:39,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:47:42,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:47:45,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:47:47,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:47:47,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-28 21:47:47,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 21:47:50,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:47:50,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:47:51,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-28 21:47:52,239 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.772e+02 2.346e+02 2.949e+02 3.489e+02 5.626e+02, threshold=5.898e+02, percent-clipped=1.0 2023-09-28 21:47:52,338 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-28 21:47:54,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:47:59,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-28 21:48:05,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:48:06,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:48:08,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-28 21:48:09,966 INFO [train.py:1039] (2/4) Epoch 5, batch 1600, loss[loss=0.2558, simple_loss=0.3019, pruned_loss=0.1048, over 23968.00 frames. ], tot_loss[loss=0.2531, simple_loss=0.3108, pruned_loss=0.09769, over 4728841.55 frames. ], batch size: 196, lr: 2.04e-02, grad_scale: 32.0 2023-09-28 21:48:10,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 21:48:12,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:48:12,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 21:48:12,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:48:13,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:48:18,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:48:19,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-28 21:48:19,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-28 21:48:21,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-28 21:48:21,477 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=152320.0, ans=0.1 2023-09-28 21:48:25,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:48:26,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-28 21:48:28,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:48:30,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:48:35,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:48:38,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-28 21:48:43,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:48:43,511 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=152453.33333333334, ans=0.1 2023-09-28 21:48:44,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-28 21:48:44,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:48:44,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-28 21:48:44,970 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=152453.33333333334, ans=0.125 2023-09-28 21:48:49,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-28 21:48:58,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:48:59,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-28 21:49:00,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:49:00,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:49:00,120 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:49:01,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-28 21:49:07,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 21:49:08,912 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:49:08,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:49:09,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:49:10,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-28 21:49:10,735 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=152520.0, ans=0.125 2023-09-28 21:49:13,342 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:49:13,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:49:13,639 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=152520.0, ans=0.125 2023-09-28 21:49:15,469 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.48 vs. limit=6.0 2023-09-28 21:49:16,325 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 21:49:22,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:49:23,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:49:26,144 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=152586.66666666666, ans=0.125 2023-09-28 21:49:27,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-28 21:49:27,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:49:29,028 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-28 21:49:32,702 INFO [train.py:1039] (2/4) Epoch 5, batch 1650, loss[loss=0.2692, simple_loss=0.3243, pruned_loss=0.1071, over 18549.00 frames. ], tot_loss[loss=0.252, simple_loss=0.3107, pruned_loss=0.0967, over 4710101.62 frames. ], batch size: 40, lr: 2.04e-02, grad_scale: 32.0 2023-09-28 21:49:34,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:49:35,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:49:37,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:49:37,287 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-28 21:49:37,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-28 21:49:37,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-28 21:49:37,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-28 21:49:43,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:49:44,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:49:44,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:49:45,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-28 21:49:46,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:49:48,237 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.53 vs. limit=12.0 2023-09-28 21:49:50,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-28 21:49:52,315 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:49:52,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:49:52,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:49:52,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:49:52,536 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=152720.0, ans=0.125 2023-09-28 21:49:53,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-28 21:49:53,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-28 21:49:57,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten.whitening_limit, batch_count=152720.0, ans=15.0 2023-09-28 21:49:58,636 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:49:58,933 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=152720.0, ans=0.0 2023-09-28 21:50:02,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-28 21:50:09,751 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.96 vs. limit=15.0 2023-09-28 21:50:10,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-28 21:50:12,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:50:16,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-28 21:50:18,437 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.11 vs. limit=6.0 2023-09-28 21:50:19,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:50:22,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:50:22,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:50:22,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:50:23,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:50:23,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:50:24,635 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=13.54 vs. limit=15.0 2023-09-28 21:50:27,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:50:27,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:50:28,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:50:28,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:50:30,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:50:30,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 21:50:33,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:50:34,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-28 21:50:35,210 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=152853.33333333334, ans=0.05 2023-09-28 21:50:36,171 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 2.210e+02 2.496e+02 2.822e+02 4.651e+02, threshold=4.993e+02, percent-clipped=0.0 2023-09-28 21:50:38,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:50:38,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-28 21:50:38,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-28 21:50:40,138 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-28 21:50:40,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:50:41,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:50:41,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:50:41,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:50:41,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-28 21:50:45,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:50:46,024 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=152920.0, ans=0.125 2023-09-28 21:50:46,927 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:50:46,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:50:50,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-28 21:50:54,697 INFO [train.py:1039] (2/4) Epoch 5, batch 1700, loss[loss=0.28, simple_loss=0.3183, pruned_loss=0.1208, over 23774.00 frames. ], tot_loss[loss=0.2513, simple_loss=0.3091, pruned_loss=0.09681, over 4711971.57 frames. ], batch size: 164, lr: 2.04e-02, grad_scale: 16.0 2023-09-28 21:50:56,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:50:56,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:50:56,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-28 21:50:57,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:50:57,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:50:57,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:50:58,384 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=152986.66666666666, ans=0.125 2023-09-28 21:50:59,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:50:59,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:51:01,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-28 21:51:04,203 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 21:51:05,965 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=152986.66666666666, ans=0.125 2023-09-28 21:51:13,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:51:15,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:51:22,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-28 21:51:22,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:51:24,795 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:51:24,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:51:25,625 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.87 vs. limit=15.0 2023-09-28 21:51:26,490 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-28 21:51:28,066 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:51:28,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:51:29,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-28 21:51:31,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-28 21:51:34,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-28 21:51:34,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-28 21:51:35,809 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:51:39,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-28 21:51:39,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:51:41,521 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.41 vs. limit=15.0 2023-09-28 21:51:48,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:51:49,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:51:50,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:51:53,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-28 21:51:54,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-28 21:51:54,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:51:57,500 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:51:57,501 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-28 21:51:58,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:51:58,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:51:58,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:51:59,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:52:00,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:52:00,675 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:52:02,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:52:02,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:52:02,368 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:52:05,542 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:52:05,686 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-28 21:52:09,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:52:10,748 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:52:12,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-28 21:52:16,924 INFO [train.py:1039] (2/4) Epoch 5, batch 1750, loss[loss=0.2674, simple_loss=0.3147, pruned_loss=0.11, over 23649.00 frames. ], tot_loss[loss=0.2506, simple_loss=0.308, pruned_loss=0.09656, over 4713766.76 frames. ], batch size: 232, lr: 2.03e-02, grad_scale: 16.0 2023-09-28 21:52:20,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:52:22,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:52:22,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-28 21:52:25,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-28 21:52:25,395 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:52:27,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:52:27,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:52:30,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-28 21:52:34,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:52:36,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-28 21:52:36,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:52:38,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:52:42,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 21:52:43,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-28 21:52:44,245 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:52:45,665 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-28 21:52:54,865 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:52:57,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:52:57,200 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:53:02,279 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:53:02,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:53:05,175 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:53:05,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:53:05,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=153520.0, ans=0.05 2023-09-28 21:53:09,030 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:53:09,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:53:10,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-28 21:53:13,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:53:17,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-28 21:53:18,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:53:20,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:53:21,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:53:23,258 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.759e+02 2.199e+02 2.496e+02 2.934e+02 4.192e+02, threshold=4.992e+02, percent-clipped=0.0 2023-09-28 21:53:24,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 21:53:25,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-28 21:53:26,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:53:28,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:53:31,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:53:34,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:53:36,873 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:53:36,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-28 21:53:36,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:53:38,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-28 21:53:38,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:53:38,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-28 21:53:38,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:53:39,939 INFO [train.py:1039] (2/4) Epoch 5, batch 1800, loss[loss=0.2477, simple_loss=0.3207, pruned_loss=0.08733, over 24668.00 frames. ], tot_loss[loss=0.2504, simple_loss=0.3084, pruned_loss=0.09623, over 4724791.30 frames. ], batch size: 73, lr: 2.03e-02, grad_scale: 16.0 2023-09-28 21:53:40,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:53:42,348 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:53:42,688 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=153653.33333333334, ans=0.0 2023-09-28 21:53:43,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:53:45,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 21:53:48,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:53:52,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 21:53:53,863 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:53:55,638 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=153720.0, ans=0.125 2023-09-28 21:53:56,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:54:01,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:54:01,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:54:02,113 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.11 vs. limit=15.0 2023-09-28 21:54:02,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:54:06,336 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:54:06,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-28 21:54:06,455 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:54:08,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:54:13,201 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-28 21:54:16,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-28 21:54:16,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-28 21:54:18,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:54:18,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:54:18,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:54:19,751 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:54:26,505 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-28 21:54:27,339 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=9.18 vs. limit=15.0 2023-09-28 21:54:28,036 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:54:28,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:54:31,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-28 21:54:31,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-28 21:54:32,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-28 21:54:32,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:54:34,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 21:54:39,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-28 21:54:44,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:54:46,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-28 21:54:46,320 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:54:46,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:54:47,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:54:47,912 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-28 21:54:51,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-28 21:54:51,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:54:55,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-28 21:54:55,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:54:57,645 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:54:59,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:54:59,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:55:00,435 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.41 vs. limit=10.0 2023-09-28 21:55:01,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:55:01,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:55:02,693 INFO [train.py:1039] (2/4) Epoch 5, batch 1850, loss[loss=0.2294, simple_loss=0.3046, pruned_loss=0.0771, over 24441.00 frames. ], tot_loss[loss=0.2514, simple_loss=0.3091, pruned_loss=0.09687, over 4720136.32 frames. ], batch size: 66, lr: 2.03e-02, grad_scale: 16.0 2023-09-28 21:55:04,371 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:55:04,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:55:07,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:55:07,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:55:15,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:55:15,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-28 21:55:19,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-28 21:55:21,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-28 21:55:26,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:55:26,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-28 21:55:26,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 21:55:36,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:55:37,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-28 21:55:40,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:55:40,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:55:44,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-28 21:55:44,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:55:46,030 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 21:55:47,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:55:49,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:55:49,949 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=154120.0, ans=0.2 2023-09-28 21:55:52,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:55:57,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-28 21:55:57,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:55:58,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 21:55:58,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:56:00,716 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:56:02,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:56:05,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-28 21:56:08,702 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.700e+02 2.275e+02 2.646e+02 3.136e+02 5.874e+02, threshold=5.291e+02, percent-clipped=3.0 2023-09-28 21:56:08,804 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:56:13,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:56:13,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:56:13,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-28 21:56:13,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-28 21:56:16,369 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-28 21:56:16,504 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-28 21:56:18,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 21:56:20,050 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:56:20,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:56:20,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:56:21,533 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-28 21:56:22,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 21:56:22,983 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:56:23,210 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=154320.0, ans=0.125 2023-09-28 21:56:24,320 INFO [train.py:1039] (2/4) Epoch 5, batch 1900, loss[loss=0.246, simple_loss=0.3021, pruned_loss=0.09498, over 23836.00 frames. ], tot_loss[loss=0.2519, simple_loss=0.3097, pruned_loss=0.097, over 4723968.56 frames. ], batch size: 179, lr: 2.03e-02, grad_scale: 16.0 2023-09-28 21:56:24,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-28 21:56:26,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 21:56:27,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:56:28,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-28 21:56:31,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:56:31,174 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-28 21:56:31,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:56:32,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:56:35,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:56:39,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:56:41,089 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-28 21:56:42,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-28 21:56:44,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:56:46,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:56:46,140 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-28 21:56:46,194 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-28 21:56:49,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-28 21:56:50,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:56:55,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-28 21:56:58,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-28 21:57:05,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-28 21:57:08,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-28 21:57:08,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:57:08,825 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=154453.33333333334, ans=0.125 2023-09-28 21:57:10,062 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-28 21:57:10,069 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-28 21:57:12,011 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-28 21:57:12,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-28 21:57:12,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:57:15,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-28 21:57:20,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:57:21,516 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten.whitening_limit, batch_count=154520.0, ans=22.5 2023-09-28 21:57:22,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:57:22,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-28 21:57:25,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 21:57:25,529 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=154520.0, ans=0.125 2023-09-28 21:57:26,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-28 21:57:28,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:57:35,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:57:35,410 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:57:35,441 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:57:36,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:57:38,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 21:57:38,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-28 21:57:39,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:57:42,813 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:57:42,815 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-28 21:57:46,292 INFO [train.py:1039] (2/4) Epoch 5, batch 1950, loss[loss=0.2495, simple_loss=0.3155, pruned_loss=0.09178, over 23398.00 frames. ], tot_loss[loss=0.2525, simple_loss=0.3104, pruned_loss=0.09726, over 4706901.00 frames. ], batch size: 93, lr: 2.03e-02, grad_scale: 16.0 2023-09-28 21:57:46,412 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:57:46,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:57:46,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:57:48,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:57:50,222 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=154653.33333333334, ans=0.125 2023-09-28 21:57:51,555 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 21:57:53,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:57:55,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:57:55,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:57:57,032 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=154653.33333333334, ans=0.0 2023-09-28 21:57:58,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-28 21:57:58,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 21:57:59,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:57:59,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:58:01,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:58:03,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:58:03,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:58:06,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:58:09,361 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 21:58:09,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 21:58:09,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:58:09,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:58:10,366 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=154720.0, ans=0.2 2023-09-28 21:58:13,477 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=154720.0, ans=0.0 2023-09-28 21:58:14,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:58:17,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-28 21:58:17,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:58:17,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-28 21:58:17,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-28 21:58:17,926 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=154786.66666666666, ans=0.125 2023-09-28 21:58:19,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 21:58:19,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:58:21,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:58:24,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:58:29,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:58:32,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 21:58:35,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:58:35,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-28 21:58:36,203 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=154853.33333333334, ans=0.1 2023-09-28 21:58:37,490 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-28 21:58:37,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:58:39,924 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=154853.33333333334, ans=0.2 2023-09-28 21:58:41,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:58:41,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:58:42,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:58:50,940 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:58:52,295 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.827e+02 2.269e+02 2.574e+02 2.905e+02 4.607e+02, threshold=5.149e+02, percent-clipped=0.0 2023-09-28 21:58:52,424 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:58:55,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:58:57,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:59:00,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:59:00,744 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:59:02,936 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-28 21:59:02,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 21:59:03,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:59:04,480 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-28 21:59:06,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:59:08,889 INFO [train.py:1039] (2/4) Epoch 5, batch 2000, loss[loss=0.2272, simple_loss=0.2935, pruned_loss=0.08047, over 24464.00 frames. ], tot_loss[loss=0.2531, simple_loss=0.3114, pruned_loss=0.09733, over 4712902.68 frames. ], batch size: 63, lr: 2.02e-02, grad_scale: 32.0 2023-09-28 21:59:10,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:59:12,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:59:12,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:59:15,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:59:17,183 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:59:18,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-28 21:59:20,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-28 21:59:24,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:59:25,382 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-28 21:59:25,594 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=155053.33333333334, ans=0.125 2023-09-28 21:59:27,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:59:27,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:59:30,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:59:30,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-28 21:59:33,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:59:35,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:59:35,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:59:37,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-28 21:59:37,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 21:59:38,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-28 21:59:38,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:59:39,511 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.07 vs. limit=12.0 2023-09-28 21:59:41,920 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:59:43,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-28 21:59:43,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:59:45,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:59:45,561 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=155120.0, ans=0.0 2023-09-28 21:59:45,739 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=155120.0, ans=0.125 2023-09-28 21:59:46,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:59:46,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-28 21:59:48,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-28 21:59:48,663 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:59:48,676 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:59:53,930 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=155120.0, ans=0.1 2023-09-28 21:59:55,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:59:58,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:59:58,020 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 21:59:58,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:00:01,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:00:01,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:00:01,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 22:00:01,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:00:03,410 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:00:05,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:00:06,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-28 22:00:13,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 22:00:14,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:00:18,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:00:18,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:00:20,315 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=155253.33333333334, ans=0.0 2023-09-28 22:00:21,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:00:23,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:00:23,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:00:24,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 22:00:24,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 22:00:26,316 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=155253.33333333334, ans=0.125 2023-09-28 22:00:27,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:00:29,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:00:31,034 INFO [train.py:1039] (2/4) Epoch 5, batch 2050, loss[loss=0.2659, simple_loss=0.2881, pruned_loss=0.1219, over 19793.00 frames. ], tot_loss[loss=0.2526, simple_loss=0.3101, pruned_loss=0.09758, over 4703876.01 frames. ], batch size: 389, lr: 2.02e-02, grad_scale: 32.0 2023-09-28 22:00:31,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:00:32,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:00:37,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:00:40,843 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:00:40,931 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:00:41,032 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:00:42,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-28 22:00:42,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:00:44,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:00:44,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-28 22:00:54,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-28 22:00:54,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:00:59,200 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-28 22:01:00,953 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=155386.66666666666, ans=0.125 2023-09-28 22:01:02,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:01:04,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-28 22:01:04,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-28 22:01:08,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:01:08,644 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=155453.33333333334, ans=0.0 2023-09-28 22:01:10,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:01:10,270 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=155453.33333333334, ans=0.125 2023-09-28 22:01:11,574 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-28 22:01:13,019 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:01:14,496 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:01:16,059 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:01:16,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:01:19,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:01:21,248 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 22:01:24,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:01:24,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:01:28,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 22:01:31,190 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=1.383e-02 2023-09-28 22:01:31,250 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=155520.0, ans=0.2 2023-09-28 22:01:33,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:01:34,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-28 22:01:36,989 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.729e+02 2.377e+02 2.692e+02 3.102e+02 5.014e+02, threshold=5.385e+02, percent-clipped=0.0 2023-09-28 22:01:41,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:01:42,878 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:01:43,245 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=155586.66666666666, ans=0.125 2023-09-28 22:01:45,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:01:47,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-28 22:01:49,218 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-28 22:01:49,218 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:01:49,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:01:51,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 22:01:52,843 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:01:52,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-28 22:01:52,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-28 22:01:54,306 INFO [train.py:1039] (2/4) Epoch 5, batch 2100, loss[loss=0.2593, simple_loss=0.3069, pruned_loss=0.1058, over 23658.00 frames. ], tot_loss[loss=0.2508, simple_loss=0.3086, pruned_loss=0.09647, over 4714791.55 frames. ], batch size: 232, lr: 2.02e-02, grad_scale: 32.0 2023-09-28 22:01:54,533 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 22:01:57,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:01:58,199 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=155653.33333333334, ans=0.07 2023-09-28 22:01:59,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:02:01,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:02:01,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:02:01,201 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-28 22:02:04,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:02:04,092 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-28 22:02:04,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-28 22:02:05,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:02:05,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:02:07,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-28 22:02:07,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 22:02:15,410 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-28 22:02:15,422 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 22:02:15,628 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 22:02:18,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:02:18,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:02:23,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:02:23,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-28 22:02:25,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:02:25,178 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 22:02:28,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-28 22:02:28,345 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:02:28,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-28 22:02:28,408 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-28 22:02:28,468 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-28 22:02:29,323 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=6.22 vs. limit=10.0 2023-09-28 22:02:31,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-28 22:02:31,647 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=155786.66666666666, ans=0.125 2023-09-28 22:02:33,434 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:02:33,675 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=155786.66666666666, ans=0.125 2023-09-28 22:02:35,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 22:02:35,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 22:02:37,051 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=155786.66666666666, ans=0.125 2023-09-28 22:02:38,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:02:39,757 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:02:39,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-28 22:02:39,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:02:39,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:02:41,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:02:43,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-28 22:02:44,724 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-28 22:02:46,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-28 22:02:50,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:02:54,484 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:02:55,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-28 22:03:00,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:03:00,701 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=155920.0, ans=0.2 2023-09-28 22:03:03,437 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=12.39 vs. limit=15.0 2023-09-28 22:03:04,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:03:04,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:03:04,237 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:03:05,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-28 22:03:05,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 22:03:07,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:03:07,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-28 22:03:09,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:03:09,490 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=155920.0, ans=0.1 2023-09-28 22:03:10,581 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:03:12,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-28 22:03:13,720 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-28 22:03:15,137 INFO [train.py:1039] (2/4) Epoch 5, batch 2150, loss[loss=0.2596, simple_loss=0.3102, pruned_loss=0.1045, over 23782.00 frames. ], tot_loss[loss=0.2501, simple_loss=0.3081, pruned_loss=0.09608, over 4709038.90 frames. ], batch size: 179, lr: 2.02e-02, grad_scale: 32.0 2023-09-28 22:03:15,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:03:18,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:03:18,400 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:03:18,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:03:18,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:03:25,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 22:03:27,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:03:28,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:03:29,387 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.60 vs. limit=22.5 2023-09-28 22:03:30,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:03:30,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:03:31,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:03:36,178 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:03:36,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:03:36,278 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:03:40,295 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=156053.33333333334, ans=0.035 2023-09-28 22:03:41,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:03:41,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-28 22:03:48,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:03:48,334 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:03:49,289 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.57 vs. limit=10.0 2023-09-28 22:03:49,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:03:49,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:03:49,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:03:49,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-28 22:03:51,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:03:51,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:03:51,422 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:03:53,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-28 22:03:53,285 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=156120.0, ans=0.2 2023-09-28 22:03:53,374 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=156120.0, ans=0.125 2023-09-28 22:03:54,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-28 22:03:55,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:03:56,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:03:57,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=156120.0, ans=0.125 2023-09-28 22:03:57,069 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=156120.0, ans=0.0 2023-09-28 22:03:57,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 22:03:58,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:04:01,602 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:04:01,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:04:03,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:04:03,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-28 22:04:03,190 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-28 22:04:06,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:04:06,625 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=156186.66666666666, ans=0.025 2023-09-28 22:04:07,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:04:08,367 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.63 vs. limit=15.0 2023-09-28 22:04:09,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:04:11,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 22:04:12,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:04:12,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:04:12,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-28 22:04:13,089 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=156186.66666666666, ans=0.125 2023-09-28 22:04:14,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-28 22:04:15,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:04:15,840 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-28 22:04:15,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:04:17,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:04:19,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-28 22:04:19,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:04:19,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-28 22:04:19,549 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-28 22:04:19,550 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-28 22:04:19,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-28 22:04:21,002 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.667e+02 2.209e+02 2.562e+02 3.022e+02 4.431e+02, threshold=5.124e+02, percent-clipped=0.0 2023-09-28 22:04:22,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:04:22,681 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:04:22,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:04:24,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:04:24,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 22:04:27,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:04:27,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:04:28,235 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=15.94 vs. limit=22.5 2023-09-28 22:04:34,343 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=156253.33333333334, ans=0.1 2023-09-28 22:04:37,535 INFO [train.py:1039] (2/4) Epoch 5, batch 2200, loss[loss=0.2301, simple_loss=0.2894, pruned_loss=0.08544, over 22053.00 frames. ], tot_loss[loss=0.2494, simple_loss=0.3078, pruned_loss=0.09548, over 4720552.41 frames. ], batch size: 48, lr: 2.02e-02, grad_scale: 32.0 2023-09-28 22:04:37,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:04:37,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-28 22:04:42,357 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:04:45,825 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=156320.0, ans=0.0 2023-09-28 22:04:47,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:04:49,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:04:49,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:04:50,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-28 22:04:54,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:04:54,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:04:54,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-28 22:04:59,119 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=156386.66666666666, ans=0.125 2023-09-28 22:05:00,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-28 22:05:00,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=156386.66666666666, ans=0.125 2023-09-28 22:05:01,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 22:05:04,307 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 22:05:08,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-28 22:05:10,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:05:11,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:05:11,878 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:05:16,984 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:05:18,345 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-28 22:05:18,756 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=156453.33333333334, ans=0.0 2023-09-28 22:05:22,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-28 22:05:22,399 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:05:22,481 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-28 22:05:25,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:05:27,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:05:27,469 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=156520.0, ans=0.0 2023-09-28 22:05:28,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:05:30,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:05:33,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-28 22:05:34,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:05:36,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-28 22:05:37,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:05:37,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:05:37,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:05:39,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:05:40,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:05:40,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:05:40,968 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:05:42,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-28 22:05:42,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:05:46,097 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 22:05:49,326 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 22:05:50,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:05:54,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-28 22:05:54,806 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-28 22:05:57,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 22:05:57,827 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-28 22:05:59,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-28 22:05:59,430 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-28 22:06:00,804 INFO [train.py:1039] (2/4) Epoch 5, batch 2250, loss[loss=0.2425, simple_loss=0.3054, pruned_loss=0.08977, over 24342.00 frames. ], tot_loss[loss=0.2507, simple_loss=0.3089, pruned_loss=0.09629, over 4709351.07 frames. ], batch size: 61, lr: 2.01e-02, grad_scale: 32.0 2023-09-28 22:06:02,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:06:02,384 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-28 22:06:03,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:06:05,519 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-28 22:06:07,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:06:09,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:06:09,934 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=156653.33333333334, ans=0.125 2023-09-28 22:06:14,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:06:16,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:06:17,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:06:19,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 22:06:19,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:06:22,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-28 22:06:22,776 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:06:22,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:06:25,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-28 22:06:25,900 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:06:25,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:06:28,051 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 22:06:32,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:06:34,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 22:06:34,209 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-28 22:06:35,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-28 22:06:37,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:06:41,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:06:43,161 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=156786.66666666666, ans=0.1 2023-09-28 22:06:44,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:06:47,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:06:48,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:06:50,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:06:52,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:06:52,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=156853.33333333334, ans=0.2 2023-09-28 22:06:53,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:06:58,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:06:58,800 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=156853.33333333334, ans=0.5 2023-09-28 22:07:00,033 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-28 22:07:05,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 22:07:05,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-28 22:07:06,664 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.676e+02 2.161e+02 2.544e+02 3.130e+02 4.790e+02, threshold=5.087e+02, percent-clipped=0.0 2023-09-28 22:07:06,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:07:13,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 22:07:16,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-28 22:07:16,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-28 22:07:16,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:07:18,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:07:21,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-28 22:07:22,986 INFO [train.py:1039] (2/4) Epoch 5, batch 2300, loss[loss=0.26, simple_loss=0.3256, pruned_loss=0.09725, over 24285.00 frames. ], tot_loss[loss=0.2515, simple_loss=0.3098, pruned_loss=0.09666, over 4717060.01 frames. ], batch size: 74, lr: 2.01e-02, grad_scale: 32.0 2023-09-28 22:07:24,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:07:24,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:07:31,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:07:31,718 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:07:35,339 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-28 22:07:38,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:07:44,725 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:07:44,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-28 22:07:44,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:07:44,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:07:44,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-28 22:07:47,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:07:50,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:07:50,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:07:55,112 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:07:55,415 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=157120.0, ans=0.125 2023-09-28 22:07:58,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:08:01,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:08:03,366 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=157120.0, ans=0.0 2023-09-28 22:08:06,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:08:06,916 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:08:10,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:08:10,903 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=157120.0, ans=0.125 2023-09-28 22:08:13,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:08:17,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:08:18,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 22:08:18,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:08:18,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-28 22:08:23,159 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 22:08:23,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:08:24,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:08:24,755 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:08:24,921 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=157186.66666666666, ans=0.2 2023-09-28 22:08:26,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:08:26,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 22:08:26,874 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-28 22:08:28,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-28 22:08:28,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:08:28,357 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:08:29,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-28 22:08:35,915 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:08:39,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:08:45,635 INFO [train.py:1039] (2/4) Epoch 5, batch 2350, loss[loss=0.2693, simple_loss=0.3237, pruned_loss=0.1074, over 23568.00 frames. ], tot_loss[loss=0.2528, simple_loss=0.3107, pruned_loss=0.09748, over 4709548.39 frames. ], batch size: 93, lr: 2.01e-02, grad_scale: 32.0 2023-09-28 22:08:45,791 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:08:45,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:08:45,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-28 22:08:49,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 22:08:49,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:08:49,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 22:08:50,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-28 22:08:57,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:08:57,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-28 22:09:02,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-28 22:09:04,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=157386.66666666666, ans=0.0 2023-09-28 22:09:05,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:09:08,951 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:09:08,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:09:08,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:09:09,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:09:10,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-28 22:09:15,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:09:20,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-28 22:09:20,562 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=157453.33333333334, ans=0.125 2023-09-28 22:09:21,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:09:23,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 22:09:24,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:09:27,974 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:09:29,532 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-28 22:09:31,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:09:33,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:09:33,085 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:09:33,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:09:37,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:09:39,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-28 22:09:39,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:09:43,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:09:43,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:09:44,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-28 22:09:44,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:09:49,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-28 22:09:49,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:09:51,235 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.798e+02 2.164e+02 2.489e+02 2.830e+02 4.285e+02, threshold=4.978e+02, percent-clipped=0.0 2023-09-28 22:09:55,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-28 22:09:58,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-28 22:09:59,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:09:59,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-28 22:09:59,866 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-28 22:09:59,906 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-28 22:10:02,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-28 22:10:04,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:10:08,325 INFO [train.py:1039] (2/4) Epoch 5, batch 2400, loss[loss=0.2298, simple_loss=0.2895, pruned_loss=0.08505, over 24456.00 frames. ], tot_loss[loss=0.2512, simple_loss=0.31, pruned_loss=0.09623, over 4713216.32 frames. ], batch size: 58, lr: 2.01e-02, grad_scale: 32.0 2023-09-28 22:10:10,036 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:10:13,653 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:10:15,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:10:15,262 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-28 22:10:15,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-28 22:10:17,000 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=157653.33333333334, ans=0.125 2023-09-28 22:10:24,807 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 22:10:24,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:10:26,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-28 22:10:29,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:10:30,919 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:10:31,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-28 22:10:33,276 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=157720.0, ans=0.125 2023-09-28 22:10:36,361 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:10:36,650 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=157720.0, ans=0.2 2023-09-28 22:10:38,018 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-28 22:10:43,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-28 22:10:43,664 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=157786.66666666666, ans=0.0 2023-09-28 22:10:48,313 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-28 22:10:49,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:10:52,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:10:56,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:10:56,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-28 22:10:58,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 22:11:06,807 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:11:08,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:11:11,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:11:13,002 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:11:13,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-28 22:11:13,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:11:13,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:11:14,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:11:14,509 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 22:11:19,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:11:19,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 22:11:19,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-28 22:11:22,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-28 22:11:23,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:11:24,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:11:25,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-28 22:11:25,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-28 22:11:25,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-28 22:11:25,173 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-28 22:11:26,565 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-28 22:11:26,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:11:26,917 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=157920.0, ans=0.1 2023-09-28 22:11:28,226 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:11:28,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:11:28,570 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=157920.0, ans=0.2 2023-09-28 22:11:29,869 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=157920.0, ans=0.0 2023-09-28 22:11:31,667 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-28 22:11:31,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:11:33,741 INFO [train.py:1039] (2/4) Epoch 5, batch 2450, loss[loss=0.222, simple_loss=0.2864, pruned_loss=0.07883, over 24463.00 frames. ], tot_loss[loss=0.25, simple_loss=0.3085, pruned_loss=0.09576, over 4718015.92 frames. ], batch size: 58, lr: 2.01e-02, grad_scale: 32.0 2023-09-28 22:11:33,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-28 22:11:37,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:11:37,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:11:39,094 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=157986.66666666666, ans=0.125 2023-09-28 22:11:40,328 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:11:40,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:11:41,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-28 22:11:48,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:11:48,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:11:51,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:11:51,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:11:51,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:11:51,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-28 22:11:55,597 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=158053.33333333334, ans=0.125 2023-09-28 22:11:58,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:11:59,782 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 22:11:59,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:12:03,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-28 22:12:05,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:12:06,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:12:06,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:12:07,038 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=158120.0, ans=0.125 2023-09-28 22:12:10,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-28 22:12:10,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:12:18,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:12:19,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:12:19,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:12:19,857 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:12:21,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:12:21,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:12:23,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-28 22:12:26,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:12:28,186 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:12:31,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:12:31,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:12:37,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:12:37,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-28 22:12:38,643 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:12:39,969 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.639e+02 2.190e+02 2.639e+02 3.152e+02 5.360e+02, threshold=5.279e+02, percent-clipped=2.0 2023-09-28 22:12:40,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:12:40,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-28 22:12:40,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:12:40,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:12:45,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:12:47,123 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=158253.33333333334, ans=0.125 2023-09-28 22:12:48,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:12:48,407 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:12:51,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-28 22:12:53,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:12:55,802 INFO [train.py:1039] (2/4) Epoch 5, batch 2500, loss[loss=0.272, simple_loss=0.3373, pruned_loss=0.1033, over 23659.00 frames. ], tot_loss[loss=0.2486, simple_loss=0.3073, pruned_loss=0.0949, over 4712519.50 frames. ], batch size: 85, lr: 2.00e-02, grad_scale: 32.0 2023-09-28 22:12:59,881 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=158320.0, ans=0.125 2023-09-28 22:13:01,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:13:11,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 22:13:11,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:13:13,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:13:13,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-28 22:13:20,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 22:13:21,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:13:21,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-28 22:13:21,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 22:13:23,288 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-28 22:13:23,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:13:24,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:13:25,067 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-28 22:13:25,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:13:26,587 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-28 22:13:26,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:13:31,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:13:31,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:13:34,995 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 22:13:36,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-28 22:13:36,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:13:36,649 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=158453.33333333334, ans=0.125 2023-09-28 22:13:39,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:13:43,670 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:13:48,871 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:13:51,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:13:54,015 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=158520.0, ans=0.09899494936611666 2023-09-28 22:13:56,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-28 22:13:58,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-28 22:13:58,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:13:58,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-28 22:14:01,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:14:01,531 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 22:14:01,705 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-28 22:14:01,706 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-28 22:14:01,714 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-28 22:14:06,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:14:09,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-28 22:14:09,746 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-28 22:14:11,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:14:11,350 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-28 22:14:16,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-28 22:14:20,165 INFO [train.py:1039] (2/4) Epoch 5, batch 2550, loss[loss=0.2523, simple_loss=0.3124, pruned_loss=0.09607, over 23441.00 frames. ], tot_loss[loss=0.2483, simple_loss=0.3072, pruned_loss=0.09465, over 4718947.18 frames. ], batch size: 93, lr: 2.00e-02, grad_scale: 32.0 2023-09-28 22:14:20,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:14:21,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:14:23,585 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:14:25,266 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:14:26,710 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-28 22:14:28,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-28 22:14:31,306 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-28 22:14:32,842 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:14:34,506 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:14:37,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:14:37,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 22:14:37,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 22:14:38,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:14:38,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:14:43,283 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:14:43,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-28 22:14:43,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-28 22:14:43,382 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:14:43,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-28 22:14:59,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:15:04,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:15:04,407 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:15:04,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:15:07,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 22:15:12,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:15:16,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 22:15:16,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 22:15:16,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:15:17,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-28 22:15:17,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-28 22:15:21,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:15:21,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:15:24,943 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.838e+02 2.422e+02 2.826e+02 3.525e+02 6.917e+02, threshold=5.653e+02, percent-clipped=3.0 2023-09-28 22:15:29,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:15:30,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-28 22:15:30,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:15:30,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:15:30,954 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-28 22:15:33,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 22:15:34,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:15:39,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:15:42,640 INFO [train.py:1039] (2/4) Epoch 5, batch 2600, loss[loss=0.3395, simple_loss=0.368, pruned_loss=0.1555, over 19466.00 frames. ], tot_loss[loss=0.2492, simple_loss=0.3077, pruned_loss=0.09533, over 4710369.64 frames. ], batch size: 388, lr: 2.00e-02, grad_scale: 32.0 2023-09-28 22:15:42,725 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:15:45,754 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-28 22:15:47,454 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-28 22:15:47,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:15:47,619 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=158986.66666666666, ans=0.2 2023-09-28 22:15:48,875 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-28 22:15:48,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-28 22:15:49,014 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-28 22:15:52,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:15:52,070 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-28 22:15:53,678 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-28 22:15:55,207 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-28 22:15:56,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:15:58,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-28 22:16:01,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-28 22:16:02,774 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-28 22:16:02,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-28 22:16:04,587 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-28 22:16:04,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-28 22:16:15,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:16:15,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:16:16,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:16:16,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-28 22:16:17,811 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=159120.0, ans=0.5 2023-09-28 22:16:19,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:16:23,904 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-28 22:16:28,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:16:28,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:16:30,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-28 22:16:31,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:16:31,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:16:33,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-28 22:16:37,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-28 22:16:37,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:16:38,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:16:42,552 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-28 22:16:42,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:16:42,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:16:47,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:16:48,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:16:50,226 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-28 22:16:50,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:16:53,381 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:16:53,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:16:59,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-28 22:17:00,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:17:02,617 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 22:17:04,015 INFO [train.py:1039] (2/4) Epoch 5, batch 2650, loss[loss=0.2628, simple_loss=0.3191, pruned_loss=0.1032, over 23310.00 frames. ], tot_loss[loss=0.2498, simple_loss=0.3085, pruned_loss=0.09552, over 4721879.94 frames. ], batch size: 105, lr: 2.00e-02, grad_scale: 32.0 2023-09-28 22:17:04,709 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.98 vs. limit=15.0 2023-09-28 22:17:09,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-28 22:17:09,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:17:10,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 22:17:12,811 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-28 22:17:12,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:17:15,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:17:17,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 22:17:18,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:17:21,980 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:17:22,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-28 22:17:22,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 22:17:22,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:17:25,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-28 22:17:28,200 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-28 22:17:31,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:17:32,795 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-28 22:17:32,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:17:32,918 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-28 22:17:38,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:17:38,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-28 22:17:38,974 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:17:39,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:17:42,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-28 22:17:42,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-28 22:17:47,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-28 22:17:50,681 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-28 22:17:52,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:17:52,119 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:17:52,190 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-28 22:17:53,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:17:53,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:17:55,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:17:58,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:17:59,844 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:17:59,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-28 22:18:02,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:18:03,227 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=159520.0, ans=0.2 2023-09-28 22:18:04,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:18:05,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 22:18:05,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:18:08,813 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.817e+02 2.225e+02 2.541e+02 3.251e+02 5.495e+02, threshold=5.083e+02, percent-clipped=0.0 2023-09-28 22:18:08,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:18:08,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-28 22:18:12,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:18:14,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:18:14,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:18:14,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-28 22:18:17,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:18:20,761 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:18:22,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:18:24,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:18:24,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-28 22:18:25,833 INFO [train.py:1039] (2/4) Epoch 5, batch 2700, loss[loss=0.2375, simple_loss=0.285, pruned_loss=0.09494, over 23579.00 frames. ], tot_loss[loss=0.2523, simple_loss=0.3103, pruned_loss=0.09717, over 4711670.86 frames. ], batch size: 256, lr: 2.00e-02, grad_scale: 32.0 2023-09-28 22:18:25,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:18:26,855 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=14.45 vs. limit=22.5 2023-09-28 22:18:28,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:18:28,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-28 22:18:31,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:18:32,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 22:18:35,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:18:35,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:18:35,870 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:18:37,482 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=159653.33333333334, ans=0.0 2023-09-28 22:18:37,547 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=159653.33333333334, ans=0.2 2023-09-28 22:18:38,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:18:38,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:18:38,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:18:38,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-28 22:18:38,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-28 22:18:40,373 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:18:40,742 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=159720.0, ans=0.125 2023-09-28 22:18:41,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-28 22:18:43,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 22:18:43,613 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:18:46,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:18:49,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-28 22:18:49,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:18:52,550 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=159720.0, ans=10.0 2023-09-28 22:18:54,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:18:54,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:19:01,131 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:19:01,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:19:02,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:19:02,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:19:04,957 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=159786.66666666666, ans=0.0 2023-09-28 22:19:06,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:19:06,582 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=159786.66666666666, ans=0.1 2023-09-28 22:19:07,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:19:07,819 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-28 22:19:07,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:19:12,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:19:12,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:19:20,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:19:21,663 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:19:23,965 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:19:23,968 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:19:30,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:19:30,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:19:31,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:19:33,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:19:35,335 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:19:35,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:19:37,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:19:38,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:19:38,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:19:41,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-28 22:19:41,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:19:44,923 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:19:44,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-28 22:19:47,726 INFO [train.py:1039] (2/4) Epoch 5, batch 2750, loss[loss=0.2433, simple_loss=0.3139, pruned_loss=0.08638, over 23971.00 frames. ], tot_loss[loss=0.2498, simple_loss=0.3086, pruned_loss=0.09554, over 4727008.56 frames. ], batch size: 80, lr: 1.99e-02, grad_scale: 16.0 2023-09-28 22:19:47,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-28 22:19:47,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:19:49,929 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=11.19 vs. limit=15.0 2023-09-28 22:19:54,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:19:54,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:19:57,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:19:57,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-28 22:19:59,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:20:01,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:20:01,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 22:20:02,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:20:02,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:20:02,889 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-28 22:20:02,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:20:02,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:20:03,243 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=159986.66666666666, ans=0.1 2023-09-28 22:20:11,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-28 22:20:12,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:20:12,827 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:20:14,324 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:20:14,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-28 22:20:15,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:20:17,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:20:17,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:20:19,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:20:20,842 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=160053.33333333334, ans=0.125 2023-09-28 22:20:23,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 22:20:23,940 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=160120.0, ans=0.0 2023-09-28 22:20:25,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 22:20:25,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 22:20:26,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:20:26,997 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=160120.0, ans=0.1 2023-09-28 22:20:28,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 22:20:30,142 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=160120.0, ans=0.125 2023-09-28 22:20:36,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:20:38,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 22:20:38,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:20:42,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:20:42,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-28 22:20:42,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 22:20:50,011 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:20:50,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:20:50,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-28 22:20:51,846 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=160186.66666666666, ans=0.0 2023-09-28 22:20:54,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:20:56,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-28 22:20:57,667 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.836e+02 2.252e+02 2.534e+02 3.037e+02 4.293e+02, threshold=5.069e+02, percent-clipped=0.0 2023-09-28 22:20:58,145 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=160253.33333333334, ans=0.1 2023-09-28 22:21:02,548 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-28 22:21:04,193 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:21:04,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-28 22:21:05,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:21:08,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:21:09,460 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-28 22:21:09,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:21:12,455 INFO [train.py:1039] (2/4) Epoch 5, batch 2800, loss[loss=0.251, simple_loss=0.3049, pruned_loss=0.09851, over 23599.00 frames. ], tot_loss[loss=0.2486, simple_loss=0.3072, pruned_loss=0.09497, over 4733838.95 frames. ], batch size: 149, lr: 1.99e-02, grad_scale: 32.0 2023-09-28 22:21:12,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-28 22:21:12,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:21:14,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:21:14,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-28 22:21:14,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:21:16,987 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:21:18,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:21:18,643 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-28 22:21:18,644 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-28 22:21:21,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:21:25,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 22:21:25,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:21:28,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:21:31,448 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-28 22:21:33,792 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.12 vs. limit=15.0 2023-09-28 22:21:34,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-28 22:21:35,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-28 22:21:37,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:21:37,619 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:21:37,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:21:40,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:21:42,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:21:42,349 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-28 22:21:44,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:21:54,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:21:56,528 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:21:58,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:22:01,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:22:01,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:22:03,240 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=160520.0, ans=0.125 2023-09-28 22:22:04,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:22:04,481 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-28 22:22:05,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:22:06,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:22:06,114 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:22:09,593 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=160520.0, ans=0.125 2023-09-28 22:22:10,720 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:22:10,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:22:11,088 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 22:22:13,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:22:16,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:22:16,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:22:16,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 22:22:16,370 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=160520.0, ans=0.2 2023-09-28 22:22:17,356 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 22:22:17,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 22:22:17,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:22:19,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-28 22:22:19,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:22:21,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:22:21,658 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:22:25,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-28 22:22:25,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:22:25,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:22:26,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:22:28,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-28 22:22:33,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:22:34,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 22:22:34,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:22:35,941 INFO [train.py:1039] (2/4) Epoch 5, batch 2850, loss[loss=0.2236, simple_loss=0.2969, pruned_loss=0.07511, over 24471.00 frames. ], tot_loss[loss=0.2468, simple_loss=0.3052, pruned_loss=0.09421, over 4710580.45 frames. ], batch size: 66, lr: 1.99e-02, grad_scale: 32.0 2023-09-28 22:22:37,615 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:22:42,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:22:42,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:22:43,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:22:45,515 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:22:45,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:22:47,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-28 22:22:47,313 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-28 22:22:53,187 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-28 22:22:53,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:22:53,421 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=160720.0, ans=0.0 2023-09-28 22:22:54,122 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.68 vs. limit=6.0 2023-09-28 22:22:55,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-28 22:22:56,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:22:59,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-28 22:23:01,356 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-28 22:23:02,988 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:23:03,632 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.31 vs. limit=15.0 2023-09-28 22:23:15,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:23:16,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:23:18,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:23:18,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 22:23:18,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 22:23:19,660 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:23:21,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 22:23:22,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-28 22:23:25,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-28 22:23:25,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:23:27,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:23:27,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:23:30,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:23:30,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:23:33,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:23:34,778 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:23:37,562 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:23:38,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:23:40,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:23:41,938 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.779e+02 2.140e+02 2.376e+02 2.803e+02 4.746e+02, threshold=4.753e+02, percent-clipped=0.0 2023-09-28 22:23:42,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:23:46,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:23:48,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-28 22:23:48,430 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-28 22:23:49,943 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 22:23:50,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:23:50,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-28 22:23:51,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:23:51,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:23:51,608 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:23:53,169 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:23:53,170 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-28 22:23:53,228 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-28 22:23:53,233 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:23:54,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:23:55,195 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=160986.66666666666, ans=0.0 2023-09-28 22:23:56,234 INFO [train.py:1039] (2/4) Epoch 5, batch 2900, loss[loss=0.2783, simple_loss=0.3179, pruned_loss=0.1194, over 22842.00 frames. ], tot_loss[loss=0.2463, simple_loss=0.3054, pruned_loss=0.09362, over 4717645.37 frames. ], batch size: 322, lr: 1.99e-02, grad_scale: 32.0 2023-09-28 22:23:58,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-28 22:23:58,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:23:58,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:23:59,152 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=160986.66666666666, ans=0.07 2023-09-28 22:24:00,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-28 22:24:06,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:24:06,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-28 22:24:07,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-28 22:24:09,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:24:09,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-28 22:24:12,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:24:12,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:24:13,526 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.29 vs. limit=15.0 2023-09-28 22:24:15,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:24:15,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:24:17,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-28 22:24:18,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-28 22:24:19,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-28 22:24:20,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:24:23,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-28 22:24:24,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-28 22:24:27,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:24:27,959 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-28 22:24:27,985 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:24:31,090 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:24:31,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-28 22:24:33,644 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=161120.0, ans=0.125 2023-09-28 22:24:34,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:24:34,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:24:39,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:24:42,087 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:24:43,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-28 22:24:45,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-28 22:24:45,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:24:48,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 22:24:48,487 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=161186.66666666666, ans=0.125 2023-09-28 22:24:51,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-28 22:24:51,389 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:24:57,375 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:25:00,747 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=161253.33333333334, ans=0.1 2023-09-28 22:25:01,196 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.06 vs. limit=15.0 2023-09-28 22:25:07,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:25:07,331 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-28 22:25:09,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-28 22:25:13,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:25:13,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-28 22:25:15,423 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:25:15,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-28 22:25:20,058 INFO [train.py:1039] (2/4) Epoch 5, batch 2950, loss[loss=0.2346, simple_loss=0.2983, pruned_loss=0.0854, over 24533.00 frames. ], tot_loss[loss=0.2469, simple_loss=0.3064, pruned_loss=0.09372, over 4719644.21 frames. ], batch size: 60, lr: 1.99e-02, grad_scale: 32.0 2023-09-28 22:25:21,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:25:22,217 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=161320.0, ans=0.0 2023-09-28 22:25:23,467 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-28 22:25:25,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:25:25,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:25:26,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:25:28,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:25:29,665 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-28 22:25:29,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-28 22:25:31,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 22:25:31,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:25:38,732 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:25:40,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:25:45,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:25:45,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:25:47,752 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=161386.66666666666, ans=0.0 2023-09-28 22:25:49,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:25:49,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:25:49,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:25:51,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:25:51,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:25:56,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-28 22:25:57,706 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-28 22:25:59,160 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-28 22:25:59,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:26:02,244 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-28 22:26:03,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-28 22:26:03,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:26:03,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:26:03,907 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-28 22:26:05,306 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-28 22:26:06,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-28 22:26:08,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:26:09,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:26:11,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:26:14,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:26:14,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:26:14,530 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-28 22:26:14,593 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:26:14,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-28 22:26:16,413 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=161520.0, ans=0.1 2023-09-28 22:26:21,876 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:26:23,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:26:24,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-28 22:26:24,143 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:26:25,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-28 22:26:27,625 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.929e+02 2.317e+02 2.702e+02 3.273e+02 4.611e+02, threshold=5.405e+02, percent-clipped=0.0 2023-09-28 22:26:29,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:26:32,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:26:32,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:26:34,162 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:26:34,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 22:26:34,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:26:35,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:26:35,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-28 22:26:37,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:26:38,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:26:39,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:26:39,112 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=161586.66666666666, ans=0.125 2023-09-28 22:26:40,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:26:40,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-28 22:26:41,948 INFO [train.py:1039] (2/4) Epoch 5, batch 3000, loss[loss=0.2606, simple_loss=0.3101, pruned_loss=0.1055, over 23627.00 frames. ], tot_loss[loss=0.2483, simple_loss=0.3077, pruned_loss=0.09445, over 4716775.12 frames. ], batch size: 232, lr: 1.98e-02, grad_scale: 32.0 2023-09-28 22:26:41,948 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-28 22:26:57,270 INFO [train.py:1071] (2/4) Epoch 5, validation: loss=0.3788, simple_loss=0.3301, pruned_loss=0.2137, over 1125622.00 frames. 2023-09-28 22:26:57,272 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21046MB 2023-09-28 22:26:57,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:26:59,163 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:27:00,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-28 22:27:04,953 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-28 22:27:05,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-28 22:27:07,928 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:27:07,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:27:08,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-28 22:27:08,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:27:12,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 22:27:22,457 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:27:30,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-28 22:27:32,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:27:35,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 22:27:35,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:27:36,178 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=161786.66666666666, ans=0.125 2023-09-28 22:27:37,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:27:38,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:27:39,000 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-28 22:27:42,087 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-28 22:27:43,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:27:45,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 22:27:45,314 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=161853.33333333334, ans=0.125 2023-09-28 22:27:46,673 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:27:46,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:27:47,085 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=161853.33333333334, ans=0.125 2023-09-28 22:27:48,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:27:48,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:27:52,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 22:27:52,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:27:52,834 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:27:55,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:27:57,405 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-28 22:27:59,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:28:00,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:28:01,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:28:04,303 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.75 vs. limit=22.5 2023-09-28 22:28:04,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:28:04,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:28:06,770 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-28 22:28:08,254 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-28 22:28:08,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:28:08,347 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-28 22:28:09,750 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 22:28:09,978 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=161920.0, ans=0.0 2023-09-28 22:28:11,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-28 22:28:13,525 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=161920.0, ans=0.0 2023-09-28 22:28:14,884 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-28 22:28:17,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 22:28:17,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-28 22:28:19,146 INFO [train.py:1039] (2/4) Epoch 5, batch 3050, loss[loss=0.2444, simple_loss=0.3058, pruned_loss=0.09148, over 23381.00 frames. ], tot_loss[loss=0.249, simple_loss=0.3084, pruned_loss=0.09479, over 4712701.00 frames. ], batch size: 105, lr: 1.98e-02, grad_scale: 16.0 2023-09-28 22:28:19,329 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-28 22:28:19,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 22:28:20,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:28:20,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:28:20,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-28 22:28:22,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:28:22,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:28:25,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-28 22:28:25,903 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=161986.66666666666, ans=0.125 2023-09-28 22:28:27,283 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:28:28,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:28:30,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:28:33,400 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:28:38,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-28 22:28:42,211 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=162053.33333333334, ans=0.125 2023-09-28 22:28:45,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-28 22:28:45,898 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-28 22:28:45,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:28:47,884 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=162053.33333333334, ans=0.1 2023-09-28 22:28:51,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:28:52,890 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:28:52,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:28:54,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:28:57,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:28:57,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-28 22:28:57,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:28:58,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:28:58,890 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:29:00,470 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:29:02,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:29:03,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:29:05,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-28 22:29:07,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:29:07,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 22:29:10,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:29:11,724 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 22:29:11,811 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:29:11,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:29:18,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:29:18,474 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:29:25,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:29:25,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:29:25,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:29:26,440 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.60 vs. limit=6.0 2023-09-28 22:29:27,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:29:27,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 22:29:28,683 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.694e+02 2.152e+02 2.476e+02 2.829e+02 3.891e+02, threshold=4.952e+02, percent-clipped=0.0 2023-09-28 22:29:28,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:29:30,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-28 22:29:31,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:29:31,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:29:33,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-28 22:29:34,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:29:41,003 INFO [train.py:1039] (2/4) Epoch 5, batch 3100, loss[loss=0.2099, simple_loss=0.2761, pruned_loss=0.07184, over 24415.00 frames. ], tot_loss[loss=0.2479, simple_loss=0.3073, pruned_loss=0.09426, over 4709159.56 frames. ], batch size: 58, lr: 1.98e-02, grad_scale: 16.0 2023-09-28 22:29:42,541 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:29:44,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:29:48,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 22:29:49,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-28 22:29:50,551 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.51 vs. limit=15.0 2023-09-28 22:29:51,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-28 22:29:53,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-28 22:29:53,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:29:58,326 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:29:58,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:30:00,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-28 22:30:03,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:30:07,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-28 22:30:12,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 22:30:13,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:30:14,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:30:14,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:30:14,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-28 22:30:18,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:30:18,738 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-28 22:30:18,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:30:20,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:30:20,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-28 22:30:22,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:30:25,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:30:27,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-28 22:30:27,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-28 22:30:30,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:30:31,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:30:34,106 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:30:34,138 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:30:35,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:30:37,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-28 22:30:37,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:30:38,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:30:38,756 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:30:38,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:30:38,768 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 22:30:44,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:30:46,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-28 22:30:49,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:30:49,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-28 22:30:51,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:30:51,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:30:53,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-28 22:30:58,511 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=162586.66666666666, ans=0.0 2023-09-28 22:31:04,535 INFO [train.py:1039] (2/4) Epoch 5, batch 3150, loss[loss=0.2465, simple_loss=0.3187, pruned_loss=0.08716, over 24531.00 frames. ], tot_loss[loss=0.2467, simple_loss=0.3056, pruned_loss=0.09387, over 4704011.78 frames. ], batch size: 66, lr: 1.98e-02, grad_scale: 16.0 2023-09-28 22:31:04,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-28 22:31:05,097 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=162653.33333333334, ans=0.1 2023-09-28 22:31:08,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:31:08,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:31:10,233 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:31:10,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:31:10,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-28 22:31:11,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:31:11,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-28 22:31:12,154 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=162653.33333333334, ans=0.1 2023-09-28 22:31:13,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-28 22:31:15,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:31:16,817 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-28 22:31:18,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-28 22:31:18,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:31:20,053 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-28 22:31:21,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-28 22:31:24,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-28 22:31:25,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-28 22:31:25,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-28 22:31:25,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:31:25,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:31:26,690 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:31:30,180 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-28 22:31:31,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:31:31,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:31:33,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:31:36,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-28 22:31:40,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-28 22:31:42,165 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:31:43,834 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-28 22:31:45,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:31:45,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-28 22:31:48,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-28 22:31:49,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:31:50,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 22:31:51,489 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 22:31:51,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:31:51,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 22:31:53,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-28 22:31:53,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-28 22:31:54,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-28 22:31:54,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 22:31:54,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:31:57,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:31:57,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:31:57,854 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-28 22:31:59,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:32:01,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-28 22:32:01,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:32:03,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-28 22:32:05,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-28 22:32:06,807 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:32:07,392 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.89 vs. limit=22.5 2023-09-28 22:32:08,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:32:09,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-28 22:32:09,852 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 22:32:11,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:32:13,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:32:15,474 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.706e+02 2.257e+02 2.534e+02 2.930e+02 4.234e+02, threshold=5.067e+02, percent-clipped=0.0 2023-09-28 22:32:15,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:32:17,154 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:32:21,767 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:32:21,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:32:24,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-28 22:32:27,685 INFO [train.py:1039] (2/4) Epoch 5, batch 3200, loss[loss=0.2447, simple_loss=0.3206, pruned_loss=0.0844, over 24308.00 frames. ], tot_loss[loss=0.2454, simple_loss=0.3045, pruned_loss=0.09317, over 4718521.42 frames. ], batch size: 74, lr: 1.98e-02, grad_scale: 32.0 2023-09-28 22:32:30,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:32:30,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-28 22:32:31,223 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=162986.66666666666, ans=0.125 2023-09-28 22:32:34,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:32:36,254 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:32:36,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-28 22:32:39,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:32:44,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-28 22:32:48,398 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:32:58,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:33:01,665 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=163120.0, ans=0.0 2023-09-28 22:33:07,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-28 22:33:07,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:33:10,593 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=163120.0, ans=0.125 2023-09-28 22:33:11,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-28 22:33:13,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 22:33:16,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:33:16,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 22:33:16,443 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=163186.66666666666, ans=0.1 2023-09-28 22:33:17,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:33:22,327 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-28 22:33:24,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-28 22:33:26,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-28 22:33:30,744 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-28 22:33:33,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:33:39,881 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:33:39,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:33:39,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:33:40,020 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-28 22:33:40,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 22:33:44,114 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 22:33:45,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:33:47,281 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-28 22:33:47,574 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 22:33:48,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-28 22:33:48,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-28 22:33:49,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-28 22:33:50,393 INFO [train.py:1039] (2/4) Epoch 5, batch 3250, loss[loss=0.2678, simple_loss=0.3114, pruned_loss=0.1121, over 23760.00 frames. ], tot_loss[loss=0.2463, simple_loss=0.3053, pruned_loss=0.09368, over 4726171.62 frames. ], batch size: 212, lr: 1.98e-02, grad_scale: 32.0 2023-09-28 22:33:52,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:33:53,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-28 22:33:53,774 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-28 22:33:55,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:33:55,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:33:56,774 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-28 22:34:02,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 22:34:05,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:34:13,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:34:13,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-28 22:34:13,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:34:14,859 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:34:14,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:34:16,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:34:17,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 22:34:20,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:34:20,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:34:20,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:34:22,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:34:22,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:34:22,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:34:25,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:34:26,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:34:28,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:34:29,872 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:34:30,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:34:32,162 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:34:32,178 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:34:40,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-28 22:34:40,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:34:40,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:34:41,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:34:43,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-28 22:34:48,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 22:34:54,951 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:34:55,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:34:55,003 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-28 22:34:55,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:34:55,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 22:34:56,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:34:59,772 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.742e+02 2.181e+02 2.539e+02 2.910e+02 4.275e+02, threshold=5.078e+02, percent-clipped=0.0 2023-09-28 22:34:59,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-28 22:35:00,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-28 22:35:00,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:35:00,267 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=163586.66666666666, ans=0.125 2023-09-28 22:35:01,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:35:01,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:35:03,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-28 22:35:03,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:35:08,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:35:08,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:35:09,880 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=163586.66666666666, ans=0.2 2023-09-28 22:35:11,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-28 22:35:11,247 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:35:13,132 INFO [train.py:1039] (2/4) Epoch 5, batch 3300, loss[loss=0.2183, simple_loss=0.2971, pruned_loss=0.0698, over 24546.00 frames. ], tot_loss[loss=0.2471, simple_loss=0.306, pruned_loss=0.09411, over 4724252.99 frames. ], batch size: 71, lr: 1.97e-02, grad_scale: 32.0 2023-09-28 22:35:13,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:35:13,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-28 22:35:16,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:35:16,555 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-28 22:35:19,434 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-28 22:35:19,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-28 22:35:19,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:35:21,366 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=163653.33333333334, ans=0.0 2023-09-28 22:35:22,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:35:24,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:35:24,897 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:35:26,879 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.63 vs. limit=22.5 2023-09-28 22:35:27,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 22:35:27,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 22:35:28,629 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.17 vs. limit=15.0 2023-09-28 22:35:31,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:35:33,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:35:36,204 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-28 22:35:37,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:35:37,902 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:35:40,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:35:40,261 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-28 22:35:41,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:35:41,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 22:35:43,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 22:35:43,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:35:43,426 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-28 22:35:47,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:35:47,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-28 22:35:49,007 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=163786.66666666666, ans=0.125 2023-09-28 22:35:50,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:35:50,132 WARNING [train.py:1197] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-28 22:35:50,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-28 22:35:51,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:35:51,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:35:55,065 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-28 22:35:56,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-28 22:35:58,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:36:00,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-28 22:36:02,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:36:03,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-28 22:36:05,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:36:08,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:36:08,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:36:08,425 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:36:09,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-28 22:36:11,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:36:11,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:36:13,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:36:15,285 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-28 22:36:15,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-28 22:36:17,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-28 22:36:18,497 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:36:18,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:36:20,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:36:20,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:36:22,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 22:36:23,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:36:23,760 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-28 22:36:23,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:36:26,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 22:36:27,052 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=163920.0, ans=0.1 2023-09-28 22:36:28,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-28 22:36:28,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:36:28,918 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=163920.0, ans=0.0 2023-09-28 22:36:30,037 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:36:32,907 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=6.11 vs. limit=12.0 2023-09-28 22:36:33,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 22:36:33,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:36:35,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:36:36,543 INFO [train.py:1039] (2/4) Epoch 5, batch 3350, loss[loss=0.2458, simple_loss=0.304, pruned_loss=0.09376, over 23768.00 frames. ], tot_loss[loss=0.2496, simple_loss=0.3076, pruned_loss=0.09576, over 4721821.77 frames. ], batch size: 212, lr: 1.97e-02, grad_scale: 32.0 2023-09-28 22:36:36,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:36:36,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:36:41,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:36:43,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:36:43,388 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer_na.min_abs, batch_count=163986.66666666666, ans=0.02 2023-09-28 22:36:44,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:36:47,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:36:50,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:36:51,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:36:51,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:36:54,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-28 22:36:58,138 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-28 22:36:58,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:37:01,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-28 22:37:01,254 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-28 22:37:01,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:37:01,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:37:02,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:37:04,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-28 22:37:04,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:37:04,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:37:06,198 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:37:08,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:37:08,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:37:08,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:37:14,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:37:17,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:37:17,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:37:22,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:37:22,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:37:24,429 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:37:24,454 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:37:28,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:37:30,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-28 22:37:31,908 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 22:37:31,960 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-28 22:37:32,019 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:37:34,052 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-28 22:37:34,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:37:35,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:37:42,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:37:42,184 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-28 22:37:42,612 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.87 vs. limit=15.0 2023-09-28 22:37:44,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 22:37:46,089 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.765e+02 2.271e+02 2.616e+02 3.188e+02 4.875e+02, threshold=5.232e+02, percent-clipped=0.0 2023-09-28 22:37:46,218 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:37:47,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:37:49,585 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=164253.33333333334, ans=0.125 2023-09-28 22:37:50,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:37:52,873 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=164253.33333333334, ans=0.07 2023-09-28 22:37:53,399 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.07 vs. limit=15.0 2023-09-28 22:37:53,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-28 22:37:54,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 22:37:55,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:37:56,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:37:57,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-28 22:37:58,994 INFO [train.py:1039] (2/4) Epoch 5, batch 3400, loss[loss=0.2566, simple_loss=0.3055, pruned_loss=0.1039, over 23871.00 frames. ], tot_loss[loss=0.2513, simple_loss=0.3097, pruned_loss=0.09641, over 4715658.95 frames. ], batch size: 164, lr: 1.97e-02, grad_scale: 32.0 2023-09-28 22:37:59,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:37:59,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-28 22:38:02,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:38:02,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:38:02,262 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-28 22:38:02,478 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=164320.0, ans=0.125 2023-09-28 22:38:03,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:38:05,251 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-28 22:38:10,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-28 22:38:10,512 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-28 22:38:10,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:38:10,906 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=164320.0, ans=0.0 2023-09-28 22:38:13,874 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=164386.66666666666, ans=0.125 2023-09-28 22:38:15,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:38:15,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 22:38:15,118 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:38:16,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-28 22:38:20,046 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.54 vs. limit=12.0 2023-09-28 22:38:22,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:38:22,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-28 22:38:27,483 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:38:30,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:38:32,358 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:38:32,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-28 22:38:38,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:38:42,974 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.14 vs. limit=15.0 2023-09-28 22:38:43,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-28 22:38:49,911 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:38:51,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:38:51,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-28 22:38:51,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:38:53,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:38:53,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:38:53,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:38:58,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:39:01,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 22:39:01,936 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:39:07,315 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:39:10,230 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-28 22:39:16,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 22:39:17,321 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=164586.66666666666, ans=0.125 2023-09-28 22:39:21,480 INFO [train.py:1039] (2/4) Epoch 5, batch 3450, loss[loss=0.243, simple_loss=0.2836, pruned_loss=0.1012, over 23585.00 frames. ], tot_loss[loss=0.2499, simple_loss=0.3086, pruned_loss=0.09566, over 4715072.28 frames. ], batch size: 256, lr: 1.97e-02, grad_scale: 16.0 2023-09-28 22:39:21,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-28 22:39:25,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-28 22:39:27,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:39:28,942 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:39:28,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-28 22:39:31,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:39:36,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-28 22:39:39,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:39:39,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:39:41,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:39:41,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:39:43,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:39:43,878 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=164720.0, ans=0.125 2023-09-28 22:39:48,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-28 22:39:54,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-28 22:39:54,897 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 22:39:54,977 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:39:57,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:40:02,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-28 22:40:05,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 22:40:08,799 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=164786.66666666666, ans=0.2 2023-09-28 22:40:09,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:40:09,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:40:11,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-28 22:40:11,701 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=164853.33333333334, ans=0.035 2023-09-28 22:40:13,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:40:15,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-28 22:40:15,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:40:16,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:40:18,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:40:21,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-28 22:40:25,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:40:29,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:40:31,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:40:33,105 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 2.242e+02 2.572e+02 2.948e+02 4.937e+02, threshold=5.144e+02, percent-clipped=0.0 2023-09-28 22:40:33,384 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:40:39,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:40:40,768 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:40:40,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:40:40,934 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:40:45,225 INFO [train.py:1039] (2/4) Epoch 5, batch 3500, loss[loss=0.2352, simple_loss=0.2779, pruned_loss=0.09628, over 23687.00 frames. ], tot_loss[loss=0.2482, simple_loss=0.3064, pruned_loss=0.09503, over 4713491.31 frames. ], batch size: 232, lr: 1.97e-02, grad_scale: 16.0 2023-09-28 22:40:45,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:40:50,476 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:40:50,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-28 22:40:52,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 22:40:56,685 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-28 22:40:58,381 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=164986.66666666666, ans=0.125 2023-09-28 22:41:00,333 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=165053.33333333334, ans=15.0 2023-09-28 22:41:00,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:41:00,968 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-28 22:41:06,431 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:41:07,921 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:41:09,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:41:09,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:41:09,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-28 22:41:11,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:41:11,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:41:13,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-28 22:41:16,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:41:16,501 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-28 22:41:19,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:41:22,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:41:22,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-28 22:41:22,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:41:26,684 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:41:29,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:41:29,543 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:41:29,867 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=165120.0, ans=0.2 2023-09-28 22:41:31,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:41:31,225 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:41:32,862 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-28 22:41:34,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-28 22:41:34,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-28 22:41:35,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:41:37,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:41:39,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:41:39,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 22:41:44,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 22:41:44,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:41:51,608 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:41:53,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-28 22:41:53,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-28 22:41:53,267 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:41:55,912 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.68 vs. limit=15.0 2023-09-28 22:41:56,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:41:56,476 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:41:56,684 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:41:58,601 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=165253.33333333334, ans=0.0 2023-09-28 22:41:59,818 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-28 22:42:01,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:42:03,339 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:42:03,698 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=165253.33333333334, ans=0.125 2023-09-28 22:42:04,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-28 22:42:06,416 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-28 22:42:07,719 INFO [train.py:1039] (2/4) Epoch 5, batch 3550, loss[loss=0.2299, simple_loss=0.299, pruned_loss=0.08041, over 24484.00 frames. ], tot_loss[loss=0.2465, simple_loss=0.305, pruned_loss=0.09403, over 4723776.44 frames. ], batch size: 63, lr: 1.96e-02, grad_scale: 16.0 2023-09-28 22:42:07,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:42:09,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:42:09,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:42:11,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:42:14,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:42:25,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:42:26,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 22:42:28,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:42:30,096 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:42:32,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:42:33,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:42:33,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 22:42:36,846 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:42:36,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:42:36,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:42:37,014 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-28 22:42:38,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 22:42:38,780 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=165386.66666666666, ans=0.125 2023-09-28 22:42:40,763 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=165453.33333333334, ans=0.0 2023-09-28 22:42:43,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:42:43,756 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=165453.33333333334, ans=0.125 2023-09-28 22:42:44,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:42:46,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:42:46,419 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:42:46,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:42:46,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-28 22:42:46,564 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:42:46,944 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=165453.33333333334, ans=0.125 2023-09-28 22:42:50,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:42:52,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 22:42:57,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:42:57,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:42:59,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:43:02,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-28 22:43:02,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:43:04,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-28 22:43:04,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:43:07,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:43:07,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:43:07,538 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=165520.0, ans=0.0 2023-09-28 22:43:10,548 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.66 vs. limit=15.0 2023-09-28 22:43:11,259 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-28 22:43:12,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:43:16,949 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.31 vs. limit=15.0 2023-09-28 22:43:17,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:43:17,896 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-28 22:43:19,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:43:20,839 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.780e+02 2.191e+02 2.567e+02 2.914e+02 4.741e+02, threshold=5.134e+02, percent-clipped=0.0 2023-09-28 22:43:24,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:43:28,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-28 22:43:33,473 INFO [train.py:1039] (2/4) Epoch 5, batch 3600, loss[loss=0.247, simple_loss=0.3209, pruned_loss=0.08655, over 24408.00 frames. ], tot_loss[loss=0.2461, simple_loss=0.3045, pruned_loss=0.09387, over 4719614.16 frames. ], batch size: 69, lr: 1.96e-02, grad_scale: 32.0 2023-09-28 22:43:35,174 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-28 22:43:35,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:43:36,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:43:36,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:43:38,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:43:38,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:43:43,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:43:45,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:43:46,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:43:46,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:43:48,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:43:48,314 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-28 22:43:51,433 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 22:43:54,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:43:56,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:43:57,726 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=165720.0, ans=0.0 2023-09-28 22:43:59,736 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:44:01,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 22:44:01,829 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:44:03,152 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-28 22:44:05,178 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:44:08,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:44:08,208 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:44:11,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:44:14,241 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:44:14,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:44:14,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-28 22:44:16,295 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=165786.66666666666, ans=0.0 2023-09-28 22:44:22,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:44:24,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 22:44:24,557 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=165853.33333333334, ans=0.125 2023-09-28 22:44:25,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-28 22:44:30,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:44:34,243 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=165853.33333333334, ans=0.1 2023-09-28 22:44:35,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:44:37,930 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:44:45,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-28 22:44:45,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 22:44:45,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-28 22:44:46,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-28 22:44:47,150 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-28 22:44:50,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:44:50,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:44:51,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-28 22:44:51,801 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:44:53,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:44:53,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:44:55,045 INFO [train.py:1039] (2/4) Epoch 5, batch 3650, loss[loss=0.2258, simple_loss=0.2852, pruned_loss=0.08321, over 24435.00 frames. ], tot_loss[loss=0.2464, simple_loss=0.3055, pruned_loss=0.09368, over 4709376.30 frames. ], batch size: 58, lr: 1.96e-02, grad_scale: 32.0 2023-09-28 22:44:55,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-28 22:44:55,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-28 22:44:58,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:44:58,567 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=165986.66666666666, ans=0.1 2023-09-28 22:44:59,824 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-28 22:45:04,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-28 22:45:05,928 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:45:09,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-28 22:45:10,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-28 22:45:15,583 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:45:15,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:45:15,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 22:45:18,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-28 22:45:19,078 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=166053.33333333334, ans=0.1 2023-09-28 22:45:20,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:45:20,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-28 22:45:21,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-28 22:45:21,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:45:23,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-28 22:45:25,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 22:45:25,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:45:25,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:45:28,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:45:28,703 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=166120.0, ans=0.125 2023-09-28 22:45:30,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-28 22:45:33,035 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-28 22:45:33,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:45:34,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-28 22:45:36,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:45:36,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:45:42,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:45:43,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:45:44,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-28 22:45:46,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:45:46,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:45:50,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:45:53,441 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:45:54,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:45:54,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:45:56,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 22:45:56,696 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:45:58,253 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:46:05,345 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=166253.33333333334, ans=0.0 2023-09-28 22:46:06,460 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.698e+02 2.312e+02 2.641e+02 2.987e+02 4.263e+02, threshold=5.283e+02, percent-clipped=0.0 2023-09-28 22:46:06,557 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-28 22:46:11,232 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:46:11,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:46:12,755 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-28 22:46:12,830 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:46:12,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:46:14,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:46:17,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-28 22:46:17,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:46:17,433 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=166320.0, ans=0.125 2023-09-28 22:46:18,656 INFO [train.py:1039] (2/4) Epoch 5, batch 3700, loss[loss=0.2152, simple_loss=0.2913, pruned_loss=0.06958, over 24326.00 frames. ], tot_loss[loss=0.2474, simple_loss=0.3065, pruned_loss=0.09414, over 4718931.94 frames. ], batch size: 61, lr: 1.96e-02, grad_scale: 32.0 2023-09-28 22:46:18,848 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 22:46:20,472 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=166320.0, ans=0.015 2023-09-28 22:46:20,950 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.91 vs. limit=10.0 2023-09-28 22:46:22,308 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:46:22,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:46:26,018 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:46:26,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-28 22:46:26,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:46:27,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 22:46:28,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 22:46:32,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 22:46:35,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:46:35,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:46:36,672 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=7.33 vs. limit=15.0 2023-09-28 22:46:37,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:46:37,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:46:38,793 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 22:46:40,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:46:41,960 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-28 22:46:43,954 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=166386.66666666666, ans=0.0 2023-09-28 22:46:50,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:46:51,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 22:46:51,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 22:46:53,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-28 22:46:53,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:46:55,690 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=166453.33333333334, ans=0.125 2023-09-28 22:46:56,175 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=7.82 vs. limit=15.0 2023-09-28 22:46:58,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:46:58,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-28 22:47:00,360 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:47:01,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:47:03,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:47:04,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 22:47:08,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 22:47:13,010 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:47:13,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-28 22:47:14,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:47:14,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-28 22:47:19,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:47:19,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:47:22,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:47:23,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-28 22:47:26,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:47:26,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-28 22:47:26,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:47:26,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:47:30,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:47:32,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-28 22:47:33,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-28 22:47:33,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:47:33,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:47:37,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:47:37,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:47:40,415 INFO [train.py:1039] (2/4) Epoch 5, batch 3750, loss[loss=0.2402, simple_loss=0.3178, pruned_loss=0.08132, over 24623.00 frames. ], tot_loss[loss=0.2495, simple_loss=0.3081, pruned_loss=0.09545, over 4718971.68 frames. ], batch size: 68, lr: 1.96e-02, grad_scale: 32.0 2023-09-28 22:47:40,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:47:42,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:47:45,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:47:47,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-28 22:47:47,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 22:47:50,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-28 22:47:50,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-28 22:47:50,998 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=166653.33333333334, ans=0.125 2023-09-28 22:47:52,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:47:53,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:47:55,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:47:55,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:47:55,608 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=166720.0, ans=0.0 2023-09-28 22:47:58,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:48:01,001 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=166720.0, ans=0.2 2023-09-28 22:48:02,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:48:03,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 22:48:04,149 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=166720.0, ans=0.125 2023-09-28 22:48:05,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:48:08,206 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=166720.0, ans=0.2 2023-09-28 22:48:10,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:48:11,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-28 22:48:11,243 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=166720.0, ans=0.1 2023-09-28 22:48:12,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:48:15,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:48:16,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:48:20,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-28 22:48:22,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-28 22:48:23,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:48:25,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:48:25,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:48:31,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:48:31,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-28 22:48:37,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-28 22:48:40,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:48:40,579 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=166853.33333333334, ans=0.125 2023-09-28 22:48:42,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:48:42,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:48:46,152 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:48:49,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 22:48:51,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-28 22:48:52,779 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.725e+02 2.387e+02 2.666e+02 3.325e+02 5.060e+02, threshold=5.333e+02, percent-clipped=0.0 2023-09-28 22:48:52,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 22:48:54,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:48:58,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-28 22:49:04,228 INFO [train.py:1039] (2/4) Epoch 5, batch 3800, loss[loss=0.2446, simple_loss=0.3196, pruned_loss=0.08485, over 24303.00 frames. ], tot_loss[loss=0.2476, simple_loss=0.3066, pruned_loss=0.09425, over 4728026.72 frames. ], batch size: 74, lr: 1.96e-02, grad_scale: 32.0 2023-09-28 22:49:04,584 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=166986.66666666666, ans=0.0 2023-09-28 22:49:05,422 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.50 vs. limit=15.0 2023-09-28 22:49:07,213 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:49:11,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:49:13,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 22:49:13,407 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-28 22:49:14,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:49:17,114 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:49:18,368 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-28 22:49:22,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 22:49:22,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:49:22,662 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 22:49:25,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:49:27,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:49:27,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:49:27,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-28 22:49:31,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-28 22:49:32,005 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:49:33,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:49:35,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:49:37,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 22:49:38,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-28 22:49:38,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:49:41,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:49:41,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:49:48,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 22:49:48,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-28 22:49:50,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:49:58,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:50:00,759 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 22:50:04,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:50:06,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-28 22:50:10,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-28 22:50:10,096 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:50:13,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:50:13,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:50:14,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-28 22:50:15,365 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.23 vs. limit=12.0 2023-09-28 22:50:19,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-28 22:50:19,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-28 22:50:20,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:50:20,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:50:25,458 INFO [train.py:1039] (2/4) Epoch 5, batch 3850, loss[loss=0.2077, simple_loss=0.2751, pruned_loss=0.07019, over 24364.00 frames. ], tot_loss[loss=0.2458, simple_loss=0.3048, pruned_loss=0.09344, over 4741239.94 frames. ], batch size: 56, lr: 1.95e-02, grad_scale: 32.0 2023-09-28 22:50:26,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:50:26,411 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 22:50:31,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:50:31,764 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=167320.0, ans=0.2 2023-09-28 22:50:32,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-28 22:50:34,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 22:50:34,502 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:50:37,823 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 22:50:38,144 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=167320.0, ans=0.125 2023-09-28 22:50:39,444 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:50:41,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-28 22:50:43,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-28 22:50:49,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:50:49,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=167386.66666666666, ans=0.0 2023-09-28 22:50:52,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:50:54,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:50:54,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 22:50:59,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:50:59,512 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:51:01,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:51:01,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:51:01,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:51:04,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:51:06,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:51:06,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:51:07,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-28 22:51:07,720 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-28 22:51:09,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:51:09,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:51:09,641 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=167453.33333333334, ans=0.0 2023-09-28 22:51:11,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:51:12,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:51:12,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-28 22:51:17,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-28 22:51:19,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:51:20,842 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-28 22:51:22,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-28 22:51:24,521 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=167520.0, ans=0.125 2023-09-28 22:51:28,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:51:30,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:51:35,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:51:35,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-28 22:51:37,561 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.762e+02 2.127e+02 2.578e+02 3.001e+02 5.626e+02, threshold=5.156e+02, percent-clipped=1.0 2023-09-28 22:51:37,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-28 22:51:39,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=167586.66666666666, ans=0.1 2023-09-28 22:51:40,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:51:40,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:51:45,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 22:51:45,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:51:45,408 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:51:46,857 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:51:46,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:51:46,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-28 22:51:47,707 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=23.78 vs. limit=22.5 2023-09-28 22:51:48,204 INFO [train.py:1039] (2/4) Epoch 5, batch 3900, loss[loss=0.2461, simple_loss=0.3034, pruned_loss=0.09441, over 23172.00 frames. ], tot_loss[loss=0.2445, simple_loss=0.3038, pruned_loss=0.0926, over 4734396.44 frames. ], batch size: 105, lr: 1.95e-02, grad_scale: 32.0 2023-09-28 22:51:48,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:51:48,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-28 22:51:50,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:51:50,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:51:52,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:51:52,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:51:53,940 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:51:54,150 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=167653.33333333334, ans=0.0 2023-09-28 22:51:55,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:51:55,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:51:56,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:51:56,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-28 22:51:56,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:51:59,876 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:52:01,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 22:52:01,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:52:02,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:52:04,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 22:52:04,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:52:08,196 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-28 22:52:10,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-28 22:52:10,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:52:11,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-28 22:52:13,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:52:13,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-28 22:52:16,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-28 22:52:21,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:52:21,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:52:21,365 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 22:52:22,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:52:23,214 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=167786.66666666666, ans=0.2 2023-09-28 22:52:26,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:52:29,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:52:31,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:52:31,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:52:32,749 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:52:38,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:52:38,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:52:47,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 22:52:49,662 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:52:51,576 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=167853.33333333334, ans=0.125 2023-09-28 22:52:59,028 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:53:01,173 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=14.88 vs. limit=15.0 2023-09-28 22:53:01,173 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten.whitening_limit, batch_count=167920.0, ans=15.0 2023-09-28 22:53:02,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:53:02,714 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-28 22:53:02,798 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-28 22:53:02,819 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:53:05,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-28 22:53:07,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:53:08,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-28 22:53:10,442 INFO [train.py:1039] (2/4) Epoch 5, batch 3950, loss[loss=0.2536, simple_loss=0.3013, pruned_loss=0.1029, over 23807.00 frames. ], tot_loss[loss=0.2454, simple_loss=0.3041, pruned_loss=0.09335, over 4729301.10 frames. ], batch size: 212, lr: 1.95e-02, grad_scale: 32.0 2023-09-28 22:53:16,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:53:17,758 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-28 22:53:17,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:53:19,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:53:21,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:53:25,960 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.10 vs. limit=15.0 2023-09-28 22:53:28,014 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-28 22:53:28,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 22:53:28,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-28 22:53:28,242 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-28 22:53:29,609 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:53:32,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:53:34,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-28 22:53:34,775 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:53:36,590 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-28 22:53:39,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:53:39,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 22:53:39,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 22:53:41,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 22:53:42,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:53:43,002 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=168120.0, ans=0.0 2023-09-28 22:53:55,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:53:55,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:54:00,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-28 22:54:01,104 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=168186.66666666666, ans=0.1 2023-09-28 22:54:04,592 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=168186.66666666666, ans=0.125 2023-09-28 22:54:07,415 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-28 22:54:07,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-28 22:54:08,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:54:08,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:54:13,479 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.54 vs. limit=22.5 2023-09-28 22:54:17,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:54:17,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-28 22:54:19,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:54:20,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:54:20,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-28 22:54:21,767 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.706e+02 2.253e+02 2.651e+02 3.133e+02 5.052e+02, threshold=5.303e+02, percent-clipped=0.0 2023-09-28 22:54:24,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:54:25,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:54:27,882 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=168253.33333333334, ans=0.1 2023-09-28 22:54:30,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-28 22:54:33,605 INFO [train.py:1039] (2/4) Epoch 5, batch 4000, loss[loss=0.2272, simple_loss=0.2865, pruned_loss=0.08394, over 17310.00 frames. ], tot_loss[loss=0.2457, simple_loss=0.3046, pruned_loss=0.09345, over 4727121.51 frames. ], batch size: 37, lr: 1.95e-02, grad_scale: 32.0 2023-09-28 22:54:40,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:54:47,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:54:51,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:54:53,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:54:54,523 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:54:54,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-28 22:54:54,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-28 22:54:56,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-28 22:54:56,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 22:54:56,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-28 22:54:58,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:54:58,625 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=168386.66666666666, ans=0.0 2023-09-28 22:55:01,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:55:03,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:55:03,245 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:55:03,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:55:03,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-28 22:55:03,612 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=168386.66666666666, ans=0.1 2023-09-28 22:55:04,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:55:06,477 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-28 22:55:07,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:55:08,478 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.52 vs. limit=6.0 2023-09-28 22:55:09,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:55:11,864 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=168453.33333333334, ans=0.125 2023-09-28 22:55:13,042 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-28 22:55:13,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 22:55:13,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:55:22,709 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-28 22:55:22,804 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:55:22,969 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=168520.0, ans=0.0 2023-09-28 22:55:24,930 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.66 vs. limit=22.5 2023-09-28 22:55:25,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:55:25,812 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-28 22:55:27,305 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 22:55:27,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-28 22:55:27,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:55:28,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:55:30,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:55:32,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:55:32,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-28 22:55:32,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:55:34,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-28 22:55:34,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:55:35,943 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-28 22:55:40,115 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=168586.66666666666, ans=0.07 2023-09-28 22:55:41,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 22:55:44,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 22:55:48,184 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=168586.66666666666, ans=0.0 2023-09-28 22:55:49,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:55:49,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:55:49,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:55:51,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:55:56,172 INFO [train.py:1039] (2/4) Epoch 5, batch 4050, loss[loss=0.2134, simple_loss=0.2816, pruned_loss=0.07263, over 24319.00 frames. ], tot_loss[loss=0.2448, simple_loss=0.3044, pruned_loss=0.0926, over 4728317.49 frames. ], batch size: 61, lr: 1.95e-02, grad_scale: 32.0 2023-09-28 22:55:56,282 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:55:56,621 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=168653.33333333334, ans=0.1 2023-09-28 22:55:57,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-28 22:55:59,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-28 22:55:59,704 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 22:56:01,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:56:02,534 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:56:04,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-28 22:56:06,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:56:06,751 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=168653.33333333334, ans=0.2 2023-09-28 22:56:10,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:56:13,200 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:56:13,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 22:56:16,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 22:56:16,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:56:20,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:56:21,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-28 22:56:24,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 22:56:27,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-28 22:56:27,339 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-28 22:56:30,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:56:32,525 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.87 vs. limit=15.0 2023-09-28 22:56:36,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-28 22:56:38,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:56:41,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:56:44,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:56:46,846 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:56:46,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:56:50,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:56:54,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-28 22:56:54,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 22:56:56,551 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:56:56,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-28 22:57:02,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:57:08,218 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.672e+02 2.148e+02 2.565e+02 3.123e+02 5.245e+02, threshold=5.130e+02, percent-clipped=0.0 2023-09-28 22:57:08,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-28 22:57:08,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:57:08,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 22:57:12,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-28 22:57:12,200 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-28 22:57:12,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:57:15,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:57:15,972 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.94 vs. limit=6.0 2023-09-28 22:57:16,612 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:57:16,663 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:57:20,078 INFO [train.py:1039] (2/4) Epoch 5, batch 4100, loss[loss=0.2553, simple_loss=0.3055, pruned_loss=0.1025, over 23692.00 frames. ], tot_loss[loss=0.247, simple_loss=0.3063, pruned_loss=0.09387, over 4726538.71 frames. ], batch size: 232, lr: 1.95e-02, grad_scale: 32.0 2023-09-28 22:57:23,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-28 22:57:24,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-28 22:57:27,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-28 22:57:27,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-28 22:57:29,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:57:29,904 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:57:29,959 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:57:29,979 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 22:57:31,556 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-28 22:57:35,426 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:57:35,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:57:35,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:57:37,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 22:57:40,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 22:57:41,637 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:57:41,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:57:41,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-28 22:57:43,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:57:43,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:57:43,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:57:45,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:57:45,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-28 22:57:48,395 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:57:48,525 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=169053.33333333334, ans=0.5 2023-09-28 22:57:51,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-28 22:57:53,371 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:57:53,629 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=169120.0, ans=0.0 2023-09-28 22:57:55,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:57:55,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-28 22:57:57,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:57:58,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:57:58,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:58:00,471 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.69 vs. limit=6.0 2023-09-28 22:58:01,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-28 22:58:01,337 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=169120.0, ans=0.0 2023-09-28 22:58:02,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-28 22:58:04,504 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:58:07,966 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-28 22:58:08,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:58:09,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:58:11,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:58:17,522 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:58:20,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:58:22,115 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:58:31,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:58:31,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:58:34,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:58:37,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:58:42,146 INFO [train.py:1039] (2/4) Epoch 5, batch 4150, loss[loss=0.217, simple_loss=0.2951, pruned_loss=0.06951, over 24485.00 frames. ], tot_loss[loss=0.2473, simple_loss=0.3064, pruned_loss=0.09411, over 4730998.33 frames. ], batch size: 69, lr: 1.94e-02, grad_scale: 32.0 2023-09-28 22:58:42,455 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=169320.0, ans=0.035 2023-09-28 22:58:43,796 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:58:43,947 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 22:58:46,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:58:47,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:58:49,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-28 22:58:49,407 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=169320.0, ans=0.0 2023-09-28 22:58:49,428 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=169320.0, ans=0.07 2023-09-28 22:58:50,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:58:50,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-28 22:58:50,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-28 22:58:52,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-28 22:58:53,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:58:57,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:58:57,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:58:58,083 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.91 vs. limit=6.0 2023-09-28 22:59:01,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:59:03,096 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:59:03,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-28 22:59:06,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 22:59:06,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:59:07,513 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-28 22:59:09,638 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=8.51 vs. limit=15.0 2023-09-28 22:59:13,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:59:17,796 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:59:17,984 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=169453.33333333334, ans=0.2 2023-09-28 22:59:19,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-28 22:59:22,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-28 22:59:22,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:59:23,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-28 22:59:23,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:59:23,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:59:26,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:59:28,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:59:30,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-28 22:59:33,813 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-28 22:59:34,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 22:59:35,567 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-28 22:59:36,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:59:37,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-28 22:59:40,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 22:59:41,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:59:44,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:59:44,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-28 22:59:44,557 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:59:45,967 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-28 22:59:46,562 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.62 vs. limit=15.0 2023-09-28 22:59:47,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 22:59:51,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-28 22:59:51,157 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:59:51,171 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:59:51,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 22:59:53,271 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-28 22:59:54,524 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.840e+02 2.448e+02 2.857e+02 3.478e+02 5.752e+02, threshold=5.715e+02, percent-clipped=2.0 2023-09-28 22:59:54,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:59:54,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 22:59:54,799 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:59:56,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:59:57,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-28 22:59:57,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-28 23:00:01,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:00:01,821 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.85 vs. limit=12.0 2023-09-28 23:00:04,576 INFO [train.py:1039] (2/4) Epoch 5, batch 4200, loss[loss=0.2645, simple_loss=0.3074, pruned_loss=0.1108, over 23769.00 frames. ], tot_loss[loss=0.2474, simple_loss=0.3053, pruned_loss=0.09472, over 4713118.53 frames. ], batch size: 164, lr: 1.94e-02, grad_scale: 16.0 2023-09-28 23:00:04,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-28 23:00:06,194 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 23:00:09,698 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:00:11,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:00:11,372 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:00:11,378 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:00:14,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-28 23:00:17,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-28 23:00:17,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:00:21,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:00:23,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:00:26,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-28 23:00:28,201 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:00:28,242 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:00:28,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-28 23:00:28,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:00:28,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:00:29,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:00:29,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 23:00:32,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:00:33,319 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=169720.0, ans=0.1 2023-09-28 23:00:35,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-28 23:00:35,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:00:35,454 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=169720.0, ans=0.07 2023-09-28 23:00:39,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-28 23:00:41,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 23:00:44,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:00:46,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:00:47,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:00:47,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-28 23:00:47,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:00:50,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:00:57,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:00:59,737 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:01:03,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-28 23:01:07,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-28 23:01:11,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:01:15,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 23:01:15,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:01:18,226 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.65 vs. limit=22.5 2023-09-28 23:01:19,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-28 23:01:26,901 INFO [train.py:1039] (2/4) Epoch 5, batch 4250, loss[loss=0.2124, simple_loss=0.2895, pruned_loss=0.0677, over 24505.00 frames. ], tot_loss[loss=0.2464, simple_loss=0.304, pruned_loss=0.09442, over 4709039.42 frames. ], batch size: 66, lr: 1.94e-02, grad_scale: 16.0 2023-09-28 23:01:26,980 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-28 23:01:28,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:01:28,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-28 23:01:31,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:01:38,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-28 23:01:38,774 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-28 23:01:38,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:01:43,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:01:45,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:01:50,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:01:50,403 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:01:53,333 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:01:53,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:01:54,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:01:57,006 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:01:58,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:02:01,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:02:01,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:02:03,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-28 23:02:06,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-28 23:02:06,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:02:07,094 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=170120.0, ans=0.1 2023-09-28 23:02:08,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:02:08,900 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:02:10,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:02:10,411 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:02:11,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:02:15,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-28 23:02:16,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-28 23:02:16,923 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=170186.66666666666, ans=0.0 2023-09-28 23:02:20,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:02:22,294 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=170186.66666666666, ans=0.0 2023-09-28 23:02:23,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:02:23,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-28 23:02:23,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:02:25,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-28 23:02:26,597 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-28 23:02:26,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:02:28,524 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=170186.66666666666, ans=0.2 2023-09-28 23:02:29,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:02:29,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:02:33,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-28 23:02:35,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 23:02:35,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-28 23:02:37,208 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=170253.33333333334, ans=0.1 2023-09-28 23:02:38,626 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=170253.33333333334, ans=0.0 2023-09-28 23:02:39,735 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.682e+02 2.222e+02 2.521e+02 2.962e+02 6.093e+02, threshold=5.043e+02, percent-clipped=1.0 2023-09-28 23:02:39,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:02:42,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:02:43,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:02:45,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:02:46,954 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:02:48,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:02:48,789 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=170320.0, ans=0.125 2023-09-28 23:02:49,894 INFO [train.py:1039] (2/4) Epoch 5, batch 4300, loss[loss=0.2082, simple_loss=0.277, pruned_loss=0.06973, over 18356.00 frames. ], tot_loss[loss=0.2458, simple_loss=0.3037, pruned_loss=0.09389, over 4711351.58 frames. ], batch size: 40, lr: 1.94e-02, grad_scale: 16.0 2023-09-28 23:02:49,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:02:49,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-28 23:02:51,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:02:55,441 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=170320.0, ans=0.125 2023-09-28 23:02:58,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:02:58,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:03:01,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:03:03,825 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=170320.0, ans=0.04949747468305833 2023-09-28 23:03:08,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:03:08,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-28 23:03:09,687 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:03:11,386 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:03:11,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 23:03:11,432 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-28 23:03:15,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 23:03:18,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 23:03:20,353 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-28 23:03:21,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:03:21,756 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-28 23:03:21,964 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=170453.33333333334, ans=0.125 2023-09-28 23:03:24,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 23:03:26,000 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.91 vs. limit=12.0 2023-09-28 23:03:26,578 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:03:28,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:03:28,656 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:03:28,994 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=170453.33333333334, ans=0.0 2023-09-28 23:03:30,182 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:03:31,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:03:31,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:03:31,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-28 23:03:33,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-28 23:03:35,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:03:37,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:03:37,595 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=170453.33333333334, ans=0.125 2023-09-28 23:03:38,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 23:03:38,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:03:38,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:03:38,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-28 23:03:38,812 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-28 23:03:40,286 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-28 23:03:41,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:03:41,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-28 23:03:41,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-28 23:03:46,380 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.71 vs. limit=15.0 2023-09-28 23:03:48,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:03:48,788 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=170520.0, ans=0.125 2023-09-28 23:03:50,141 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-28 23:03:52,323 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:03:52,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:03:52,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:03:55,608 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-28 23:03:57,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 23:03:57,016 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:03:57,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:03:57,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:03:57,247 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:04:00,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:04:03,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:04:03,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:04:04,015 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=170586.66666666666, ans=0.1 2023-09-28 23:04:05,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:04:07,027 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=170586.66666666666, ans=0.0 2023-09-28 23:04:10,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-28 23:04:10,104 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-28 23:04:10,323 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=170586.66666666666, ans=0.0 2023-09-28 23:04:13,724 INFO [train.py:1039] (2/4) Epoch 5, batch 4350, loss[loss=0.2604, simple_loss=0.3085, pruned_loss=0.1061, over 23681.00 frames. ], tot_loss[loss=0.2471, simple_loss=0.3056, pruned_loss=0.09424, over 4710108.21 frames. ], batch size: 232, lr: 1.94e-02, grad_scale: 16.0 2023-09-28 23:04:15,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:04:17,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:04:17,677 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=170653.33333333334, ans=0.125 2023-09-28 23:04:22,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:04:22,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:04:27,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:04:30,835 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:04:34,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 23:04:34,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:04:34,614 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=170720.0, ans=0.0 2023-09-28 23:04:37,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:04:39,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:04:40,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:04:45,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-28 23:04:48,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:04:50,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:04:52,715 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=170786.66666666666, ans=0.125 2023-09-28 23:04:55,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:04:58,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-28 23:05:00,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:05:01,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 23:05:07,951 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-28 23:05:10,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:05:10,812 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-28 23:05:12,248 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-28 23:05:12,370 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-28 23:05:12,379 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:05:12,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:05:13,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:05:15,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:05:15,441 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:05:16,805 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:05:18,414 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-28 23:05:18,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:05:18,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:05:18,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:05:20,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-28 23:05:20,721 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-28 23:05:22,139 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-28 23:05:22,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-28 23:05:22,610 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=170920.0, ans=0.09899494936611666 2023-09-28 23:05:25,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:05:25,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:05:25,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:05:25,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:05:27,200 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.519e+02 2.177e+02 2.511e+02 2.905e+02 5.033e+02, threshold=5.022e+02, percent-clipped=0.0 2023-09-28 23:05:28,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-28 23:05:31,959 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-28 23:05:31,982 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:05:35,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:05:35,140 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:05:36,621 INFO [train.py:1039] (2/4) Epoch 5, batch 4400, loss[loss=0.2095, simple_loss=0.2874, pruned_loss=0.06586, over 24478.00 frames. ], tot_loss[loss=0.2481, simple_loss=0.3063, pruned_loss=0.09494, over 4705660.91 frames. ], batch size: 63, lr: 1.93e-02, grad_scale: 32.0 2023-09-28 23:05:36,819 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:05:37,071 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=170986.66666666666, ans=0.125 2023-09-28 23:05:40,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-28 23:05:40,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-28 23:05:41,969 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-28 23:05:42,004 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-28 23:05:43,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 23:05:43,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:05:45,661 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-28 23:05:48,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:05:50,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:05:50,101 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-28 23:05:54,723 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:05:54,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-28 23:05:56,744 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-28 23:05:59,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-28 23:06:02,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-28 23:06:02,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-28 23:06:02,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:06:03,756 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:06:03,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:06:05,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:06:06,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-28 23:06:06,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-28 23:06:07,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:06:08,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:06:08,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:06:10,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:06:11,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:06:11,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-28 23:06:11,958 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-28 23:06:16,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:06:25,047 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:06:26,591 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-28 23:06:29,706 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:06:33,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:06:35,089 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:06:35,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-28 23:06:37,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:06:37,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:06:37,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 23:06:37,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-28 23:06:42,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-28 23:06:46,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-28 23:06:48,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-28 23:06:48,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:06:48,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-28 23:06:50,310 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:06:52,043 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=171253.33333333334, ans=0.125 2023-09-28 23:06:52,118 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=171253.33333333334, ans=0.125 2023-09-28 23:06:55,430 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:06:57,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-28 23:07:00,189 INFO [train.py:1039] (2/4) Epoch 5, batch 4450, loss[loss=0.2649, simple_loss=0.3181, pruned_loss=0.1059, over 19126.00 frames. ], tot_loss[loss=0.2496, simple_loss=0.3079, pruned_loss=0.0956, over 4715544.28 frames. ], batch size: 42, lr: 1.93e-02, grad_scale: 32.0 2023-09-28 23:07:01,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:07:03,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:07:04,999 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 23:07:11,762 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:07:11,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:07:15,134 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.40 vs. limit=8.0 2023-09-28 23:07:15,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:07:18,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:07:23,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:07:23,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:07:24,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-28 23:07:24,774 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:07:24,885 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:07:24,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:07:24,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:07:25,187 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=171386.66666666666, ans=0.125 2023-09-28 23:07:27,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 23:07:33,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:07:35,212 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:07:35,402 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:07:36,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:07:37,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:07:40,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 23:07:42,768 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-28 23:07:42,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-28 23:07:42,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:07:45,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:07:45,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-28 23:07:51,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-28 23:07:54,190 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:07:54,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-28 23:07:54,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:07:54,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:07:54,343 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:07:54,553 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=171520.0, ans=0.025 2023-09-28 23:07:55,741 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:07:58,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:08:03,598 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-28 23:08:03,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-28 23:08:05,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 23:08:08,908 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:08:10,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:08:11,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:08:11,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 23:08:12,224 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=171586.66666666666, ans=0.1 2023-09-28 23:08:13,318 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.679e+02 2.374e+02 2.783e+02 3.317e+02 5.756e+02, threshold=5.567e+02, percent-clipped=2.0 2023-09-28 23:08:15,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-28 23:08:15,449 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=171586.66666666666, ans=0.2 2023-09-28 23:08:16,039 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=14.27 vs. limit=15.0 2023-09-28 23:08:18,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-28 23:08:20,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:08:23,519 INFO [train.py:1039] (2/4) Epoch 5, batch 4500, loss[loss=0.2118, simple_loss=0.2734, pruned_loss=0.07509, over 24601.00 frames. ], tot_loss[loss=0.2487, simple_loss=0.3071, pruned_loss=0.0951, over 4710819.67 frames. ], batch size: 60, lr: 1.93e-02, grad_scale: 32.0 2023-09-28 23:08:25,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:08:26,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-28 23:08:26,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-28 23:08:28,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:08:32,532 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.51 vs. limit=12.0 2023-09-28 23:08:33,036 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:08:33,123 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:08:33,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 23:08:34,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:08:34,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:08:34,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:08:47,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:08:48,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:08:52,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:08:52,451 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:08:52,727 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=171720.0, ans=0.0 2023-09-28 23:08:55,384 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:08:56,133 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.17 vs. limit=10.0 2023-09-28 23:09:01,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 23:09:06,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:09:09,902 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=171786.66666666666, ans=0.125 2023-09-28 23:09:11,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 23:09:14,331 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:09:14,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-28 23:09:14,505 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:09:16,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:09:18,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:09:18,255 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:09:21,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:09:21,509 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-28 23:09:21,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 23:09:21,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:09:24,710 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=171853.33333333334, ans=0.5 2023-09-28 23:09:28,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:09:28,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:09:31,580 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:09:33,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:09:34,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:09:35,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-28 23:09:37,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-28 23:09:37,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-28 23:09:40,411 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.34 vs. limit=15.0 2023-09-28 23:09:41,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-28 23:09:44,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-28 23:09:44,831 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=171986.66666666666, ans=0.125 2023-09-28 23:09:46,069 INFO [train.py:1039] (2/4) Epoch 5, batch 4550, loss[loss=0.2367, simple_loss=0.3135, pruned_loss=0.07995, over 24610.00 frames. ], tot_loss[loss=0.2483, simple_loss=0.306, pruned_loss=0.09531, over 4700408.50 frames. ], batch size: 73, lr: 1.93e-02, grad_scale: 32.0 2023-09-28 23:09:46,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:09:49,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:09:51,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:09:54,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:09:59,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:10:02,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:10:02,804 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:10:02,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:10:02,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:10:05,133 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=172053.33333333334, ans=0.125 2023-09-28 23:10:06,577 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:10:08,014 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:10:10,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:10:12,732 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-28 23:10:14,804 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-28 23:10:14,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:10:15,775 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=13.15 vs. limit=15.0 2023-09-28 23:10:16,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-28 23:10:20,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-28 23:10:21,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:10:24,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-28 23:10:27,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 23:10:30,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:10:30,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:10:30,636 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-28 23:10:33,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-28 23:10:37,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:10:40,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:10:40,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:10:42,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:10:42,842 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=172186.66666666666, ans=0.1 2023-09-28 23:10:44,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-28 23:10:44,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-28 23:10:44,202 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:10:45,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-28 23:10:48,832 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-28 23:10:48,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:10:49,672 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:10:49,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:10:51,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:10:51,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:10:52,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 23:10:54,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-28 23:10:55,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:10:55,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 23:10:56,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-28 23:10:57,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:10:57,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-28 23:10:58,980 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.720e+02 2.022e+02 2.307e+02 2.730e+02 4.696e+02, threshold=4.615e+02, percent-clipped=0.0 2023-09-28 23:11:00,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:11:00,594 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:11:03,004 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=172253.33333333334, ans=0.0 2023-09-28 23:11:04,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:11:04,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:11:05,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-28 23:11:05,979 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:11:07,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-28 23:11:08,883 INFO [train.py:1039] (2/4) Epoch 5, batch 4600, loss[loss=0.2406, simple_loss=0.2998, pruned_loss=0.09067, over 23367.00 frames. ], tot_loss[loss=0.2466, simple_loss=0.3039, pruned_loss=0.09471, over 4687145.33 frames. ], batch size: 119, lr: 1.93e-02, grad_scale: 32.0 2023-09-28 23:11:11,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:11:12,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:11:16,554 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:11:16,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:11:16,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:11:16,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=172320.0, ans=0.2 2023-09-28 23:11:19,786 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-28 23:11:21,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:11:22,274 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=172320.0, ans=0.125 2023-09-28 23:11:25,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:11:25,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:11:27,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:11:34,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-28 23:11:36,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:11:39,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:11:42,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:11:45,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:11:51,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-28 23:11:51,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 23:11:53,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:11:58,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:11:59,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:12:01,219 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.67 vs. limit=15.0 2023-09-28 23:12:02,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:12:04,095 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=172520.0, ans=0.125 2023-09-28 23:12:05,527 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-28 23:12:05,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-28 23:12:10,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:12:11,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:12:13,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:12:13,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 23:12:14,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:12:15,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-28 23:12:15,429 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:12:15,686 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=172586.66666666666, ans=0.2 2023-09-28 23:12:16,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:12:18,377 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:12:18,501 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:12:18,689 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=172586.66666666666, ans=0.125 2023-09-28 23:12:20,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:12:22,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-28 23:12:23,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-28 23:12:23,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-28 23:12:23,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:12:25,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:12:25,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:12:27,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:12:32,204 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=172653.33333333334, ans=0.1 2023-09-28 23:12:33,417 INFO [train.py:1039] (2/4) Epoch 5, batch 4650, loss[loss=0.2317, simple_loss=0.3037, pruned_loss=0.07983, over 24470.00 frames. ], tot_loss[loss=0.2456, simple_loss=0.3038, pruned_loss=0.09369, over 4689414.66 frames. ], batch size: 66, lr: 1.93e-02, grad_scale: 32.0 2023-09-28 23:12:35,917 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=172653.33333333334, ans=0.125 2023-09-28 23:12:38,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:12:41,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:12:41,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:12:43,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:12:43,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:12:43,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:12:45,084 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:12:46,944 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=172653.33333333334, ans=0.0 2023-09-28 23:12:48,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-28 23:12:52,019 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 23:12:53,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:12:55,410 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-28 23:12:56,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:12:58,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-28 23:12:58,313 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:12:58,396 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-28 23:12:58,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-28 23:12:58,433 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:12:59,864 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:13:03,507 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 23:13:03,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:13:03,707 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-28 23:13:06,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:13:08,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-28 23:13:10,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:13:10,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:13:12,006 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-28 23:13:12,382 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=172786.66666666666, ans=0.125 2023-09-28 23:13:13,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:13:14,583 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.54 vs. limit=15.0 2023-09-28 23:13:18,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:13:18,620 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.50 vs. limit=15.0 2023-09-28 23:13:21,268 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:13:29,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:13:31,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:13:32,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:13:32,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:13:35,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-28 23:13:36,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-28 23:13:36,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 23:13:36,161 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-28 23:13:36,315 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 23:13:38,762 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=172920.0, ans=0.2 2023-09-28 23:13:39,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:13:45,873 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.666e+02 2.231e+02 2.488e+02 2.964e+02 5.544e+02, threshold=4.977e+02, percent-clipped=2.0 2023-09-28 23:13:48,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:13:48,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:13:48,210 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-28 23:13:48,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:13:49,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:13:49,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:13:51,414 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:13:53,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:13:53,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:13:54,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:13:56,013 INFO [train.py:1039] (2/4) Epoch 5, batch 4700, loss[loss=0.2601, simple_loss=0.3084, pruned_loss=0.1059, over 23588.00 frames. ], tot_loss[loss=0.2454, simple_loss=0.3042, pruned_loss=0.09333, over 4708748.67 frames. ], batch size: 256, lr: 1.92e-02, grad_scale: 32.0 2023-09-28 23:13:57,841 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=172986.66666666666, ans=0.0 2023-09-28 23:13:58,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:13:59,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:13:59,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 23:14:01,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-28 23:14:01,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-28 23:14:02,927 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-28 23:14:11,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:14:11,418 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:14:13,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:14:13,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:14:15,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 23:14:20,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-28 23:14:20,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-28 23:14:23,418 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:14:23,559 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:14:24,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:14:26,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:14:34,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 23:14:36,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 23:14:38,349 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=173120.0, ans=0.1 2023-09-28 23:14:40,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:14:46,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-28 23:14:48,503 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:14:51,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:14:54,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-28 23:14:55,470 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:14:59,937 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:15:01,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-28 23:15:01,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:15:02,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:15:05,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:15:06,053 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:15:08,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-28 23:15:08,139 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-28 23:15:08,227 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=173253.33333333334, ans=0.125 2023-09-28 23:15:11,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:15:12,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:15:12,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:15:12,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-28 23:15:14,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:15:19,206 INFO [train.py:1039] (2/4) Epoch 5, batch 4750, loss[loss=0.2192, simple_loss=0.2915, pruned_loss=0.0734, over 24503.00 frames. ], tot_loss[loss=0.2454, simple_loss=0.3045, pruned_loss=0.09315, over 4719899.88 frames. ], batch size: 66, lr: 1.92e-02, grad_scale: 32.0 2023-09-28 23:15:19,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-28 23:15:21,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:15:23,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:15:27,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:15:27,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:15:28,186 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.min_positive, batch_count=173320.0, ans=0.025 2023-09-28 23:15:30,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-28 23:15:30,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:15:34,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-28 23:15:37,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:15:37,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:15:39,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:15:44,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-28 23:15:49,006 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:15:51,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-28 23:15:53,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:15:55,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:15:55,562 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:15:55,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:15:57,722 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-28 23:15:57,726 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-28 23:16:02,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-28 23:16:05,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:16:07,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:16:09,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 23:16:09,079 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-28 23:16:09,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:16:12,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:16:14,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:16:17,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-28 23:16:17,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-28 23:16:19,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:16:19,052 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:16:19,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:16:20,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 23:16:20,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-28 23:16:24,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-28 23:16:25,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:16:27,657 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:16:27,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-28 23:16:29,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:16:31,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:16:32,598 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.700e+02 2.125e+02 2.370e+02 2.784e+02 4.798e+02, threshold=4.741e+02, percent-clipped=0.0 2023-09-28 23:16:32,875 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:16:34,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:16:34,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 23:16:39,514 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:16:39,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-28 23:16:41,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-28 23:16:42,590 INFO [train.py:1039] (2/4) Epoch 5, batch 4800, loss[loss=0.2777, simple_loss=0.3402, pruned_loss=0.1076, over 23846.00 frames. ], tot_loss[loss=0.2475, simple_loss=0.3062, pruned_loss=0.09435, over 4719445.73 frames. ], batch size: 85, lr: 1.92e-02, grad_scale: 32.0 2023-09-28 23:16:42,713 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-28 23:16:45,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-28 23:16:45,806 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:16:48,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-28 23:16:50,677 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=173653.33333333334, ans=0.125 2023-09-28 23:16:53,896 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:16:55,512 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:16:59,360 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 23:17:02,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:17:02,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:17:02,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-28 23:17:03,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:17:03,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:17:05,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:17:05,858 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=173720.0, ans=0.125 2023-09-28 23:17:11,634 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.90 vs. limit=15.0 2023-09-28 23:17:12,379 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:17:13,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:17:13,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:17:16,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:17:16,200 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 23:17:16,223 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:17:17,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:17:20,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:17:23,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:17:25,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:17:25,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-28 23:17:26,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 23:17:28,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:17:32,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-28 23:17:32,174 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-28 23:17:32,287 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:17:32,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:17:33,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:17:33,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:17:33,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-28 23:17:34,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:17:35,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:17:37,965 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:17:38,239 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=173853.33333333334, ans=0.95 2023-09-28 23:17:41,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:17:42,561 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:17:48,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-28 23:17:48,261 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:17:48,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:17:49,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 23:17:49,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:17:53,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:17:54,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:17:54,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:17:54,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:17:54,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:17:56,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 23:18:00,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:18:00,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:18:00,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:18:01,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-28 23:18:04,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-28 23:18:04,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:18:04,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:18:06,206 INFO [train.py:1039] (2/4) Epoch 5, batch 4850, loss[loss=0.2582, simple_loss=0.3027, pruned_loss=0.1069, over 23999.00 frames. ], tot_loss[loss=0.2487, simple_loss=0.3072, pruned_loss=0.09507, over 4710889.31 frames. ], batch size: 196, lr: 1.92e-02, grad_scale: 32.0 2023-09-28 23:18:06,361 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:18:06,363 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:18:08,799 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=173986.66666666666, ans=0.125 2023-09-28 23:18:09,975 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:18:19,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-28 23:18:21,935 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:18:27,809 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:18:29,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 23:18:29,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:18:32,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:18:32,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:18:34,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:18:34,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-28 23:18:39,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:18:42,539 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:18:42,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 23:18:44,045 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 23:18:44,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-28 23:18:46,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:18:46,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:18:49,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:18:49,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-28 23:18:50,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-28 23:18:53,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 23:19:01,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:19:02,595 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-28 23:19:02,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:19:02,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:19:03,487 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.04 vs. limit=22.5 2023-09-28 23:19:04,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-28 23:19:06,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-28 23:19:06,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:19:07,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-28 23:19:08,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:19:10,496 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:19:12,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-28 23:19:18,965 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.747e+02 2.375e+02 2.676e+02 3.229e+02 5.316e+02, threshold=5.352e+02, percent-clipped=3.0 2023-09-28 23:19:22,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:19:24,499 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=174253.33333333334, ans=0.125 2023-09-28 23:19:28,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:19:28,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:19:30,252 INFO [train.py:1039] (2/4) Epoch 5, batch 4900, loss[loss=0.254, simple_loss=0.2978, pruned_loss=0.1052, over 23742.00 frames. ], tot_loss[loss=0.2472, simple_loss=0.3058, pruned_loss=0.09436, over 4708152.63 frames. ], batch size: 179, lr: 1.92e-02, grad_scale: 32.0 2023-09-28 23:19:30,799 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=174320.0, ans=0.0 2023-09-28 23:19:33,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-28 23:19:33,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:19:35,216 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=174320.0, ans=0.125 2023-09-28 23:19:39,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:19:40,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:19:42,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-28 23:19:43,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-28 23:19:49,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-28 23:19:54,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-28 23:19:54,304 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=174386.66666666666, ans=0.125 2023-09-28 23:19:55,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-28 23:19:55,607 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-28 23:19:55,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:19:55,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:19:55,698 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:19:55,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:19:57,714 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-28 23:20:01,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-28 23:20:01,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 23:20:03,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:20:05,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-28 23:20:05,397 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=174453.33333333334, ans=0.1 2023-09-28 23:20:08,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:20:08,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:20:09,689 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:20:09,703 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-28 23:20:11,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 23:20:12,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:20:12,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-28 23:20:12,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-28 23:20:17,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-28 23:20:19,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-28 23:20:21,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:20:22,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:20:22,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:20:22,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 23:20:22,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:20:22,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-28 23:20:24,555 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:20:27,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-28 23:20:30,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:20:32,126 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 23:20:34,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-28 23:20:36,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:20:37,604 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-28 23:20:37,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-28 23:20:44,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:20:44,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:20:44,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-28 23:20:44,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 23:20:46,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:20:46,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:20:52,919 INFO [train.py:1039] (2/4) Epoch 5, batch 4950, loss[loss=0.2167, simple_loss=0.2424, pruned_loss=0.09554, over 19162.00 frames. ], tot_loss[loss=0.246, simple_loss=0.3043, pruned_loss=0.09385, over 4701856.44 frames. ], batch size: 389, lr: 1.92e-02, grad_scale: 32.0 2023-09-28 23:20:53,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:20:53,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-28 23:20:54,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:20:54,586 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-28 23:20:56,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 23:20:59,619 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:20:59,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 23:21:01,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-28 23:21:01,346 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-28 23:21:01,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-28 23:21:02,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-28 23:21:02,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:21:02,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:21:03,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:21:03,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:21:06,592 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:21:06,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:21:10,698 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:21:12,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:21:12,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:21:13,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:21:16,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 23:21:21,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:21:21,747 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=174720.0, ans=10.0 2023-09-28 23:21:25,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:21:26,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:21:28,204 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:21:28,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:21:31,503 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-28 23:21:31,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-28 23:21:33,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:21:34,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:21:34,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:21:34,959 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=174786.66666666666, ans=0.0 2023-09-28 23:21:37,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:21:37,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:21:37,956 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-28 23:21:41,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:21:43,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-28 23:21:44,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=174853.33333333334, ans=0.0 2023-09-28 23:21:44,594 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.60 vs. limit=15.0 2023-09-28 23:21:45,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:21:46,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:21:48,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:21:48,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-28 23:21:49,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:21:51,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 23:21:54,775 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=174853.33333333334, ans=0.125 2023-09-28 23:21:55,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:21:56,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:21:56,063 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:21:56,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:21:57,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:21:57,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:21:59,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:21:59,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:22:01,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:22:01,669 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=174920.0, ans=0.1 2023-09-28 23:22:02,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-28 23:22:03,589 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.40 vs. limit=22.5 2023-09-28 23:22:05,863 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.824e+02 2.215e+02 2.550e+02 3.115e+02 4.856e+02, threshold=5.099e+02, percent-clipped=0.0 2023-09-28 23:22:07,535 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:22:12,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-28 23:22:12,849 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-28 23:22:16,421 INFO [train.py:1039] (2/4) Epoch 5, batch 5000, loss[loss=0.2512, simple_loss=0.308, pruned_loss=0.0972, over 23576.00 frames. ], tot_loss[loss=0.2458, simple_loss=0.304, pruned_loss=0.09386, over 4700660.08 frames. ], batch size: 149, lr: 1.91e-02, grad_scale: 32.0 2023-09-28 23:22:18,554 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=174986.66666666666, ans=0.125 2023-09-28 23:22:18,631 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=174986.66666666666, ans=0.125 2023-09-28 23:22:21,840 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:22:21,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:22:22,055 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-28 23:22:23,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-28 23:22:26,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:22:28,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-28 23:22:29,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:22:29,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 23:22:29,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-28 23:22:31,197 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:22:31,303 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:22:33,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-28 23:22:33,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:22:33,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:22:34,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-28 23:22:35,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-28 23:22:36,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:22:36,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-28 23:22:36,613 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 23:22:37,182 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.31 vs. limit=15.0 2023-09-28 23:22:38,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:22:38,211 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 23:22:38,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-28 23:22:38,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-28 23:22:39,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-28 23:22:39,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:22:41,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:22:42,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-28 23:22:42,930 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:22:45,112 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:22:47,184 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:22:48,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-28 23:22:50,328 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-28 23:22:50,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:22:52,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:22:57,009 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-28 23:23:00,172 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:23:01,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:23:01,628 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:23:04,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-28 23:23:04,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:23:04,995 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:23:05,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:23:06,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-28 23:23:08,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:23:08,999 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=175186.66666666666, ans=0.125 2023-09-28 23:23:11,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:23:13,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:23:19,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-28 23:23:25,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:23:34,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:23:36,190 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:23:36,202 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:23:36,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:23:37,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 23:23:37,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:23:37,966 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:23:39,237 INFO [train.py:1039] (2/4) Epoch 5, batch 5050, loss[loss=0.2391, simple_loss=0.3117, pruned_loss=0.08325, over 23759.00 frames. ], tot_loss[loss=0.2472, simple_loss=0.3053, pruned_loss=0.0946, over 4691697.92 frames. ], batch size: 85, lr: 1.91e-02, grad_scale: 32.0 2023-09-28 23:23:42,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:23:44,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-28 23:23:44,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:23:47,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:23:49,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:23:49,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-28 23:23:50,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:23:50,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:23:53,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 23:23:55,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 23:23:55,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-28 23:24:01,662 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 23:24:04,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-28 23:24:06,150 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-28 23:24:08,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:24:08,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-28 23:24:09,756 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:24:11,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:24:11,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:24:11,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:24:11,389 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-28 23:24:12,864 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-28 23:24:14,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:24:17,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:24:21,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:24:21,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-28 23:24:22,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:24:25,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-28 23:24:27,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 23:24:29,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:24:29,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:24:31,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:24:32,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:24:34,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:24:35,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:24:35,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:24:35,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:24:36,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-28 23:24:37,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:24:39,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:24:42,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:24:42,645 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-28 23:24:42,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-28 23:24:46,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:24:46,388 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=175586.66666666666, ans=0.0 2023-09-28 23:24:47,416 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:24:47,455 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-28 23:24:50,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:24:50,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-28 23:24:50,505 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:24:50,826 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=175586.66666666666, ans=0.0 2023-09-28 23:24:51,914 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.754e+02 2.101e+02 2.649e+02 3.047e+02 4.508e+02, threshold=5.297e+02, percent-clipped=0.0 2023-09-28 23:24:55,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:24:56,023 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=175586.66666666666, ans=0.125 2023-09-28 23:24:57,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:24:57,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-28 23:24:57,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-28 23:25:00,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:25:00,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:25:01,953 INFO [train.py:1039] (2/4) Epoch 5, batch 5100, loss[loss=0.2409, simple_loss=0.3056, pruned_loss=0.08808, over 24351.00 frames. ], tot_loss[loss=0.2475, simple_loss=0.306, pruned_loss=0.09449, over 4705085.96 frames. ], batch size: 61, lr: 1.91e-02, grad_scale: 32.0 2023-09-28 23:25:02,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:25:04,254 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-28 23:25:07,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:25:08,133 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=175653.33333333334, ans=0.1 2023-09-28 23:25:09,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-28 23:25:11,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-28 23:25:11,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:25:12,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:25:13,485 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.55 vs. limit=15.0 2023-09-28 23:25:15,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:25:17,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-28 23:25:17,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-28 23:25:20,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:25:22,422 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:25:25,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:25:29,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-28 23:25:30,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:25:32,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:25:32,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-28 23:25:35,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:25:37,474 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:25:37,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-28 23:25:39,765 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-28 23:25:42,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:25:42,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-28 23:25:42,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-28 23:25:47,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:25:56,925 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:25:59,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-28 23:26:00,040 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-28 23:26:01,436 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-28 23:26:03,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-28 23:26:03,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:26:05,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-28 23:26:09,973 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-28 23:26:11,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 23:26:13,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:26:15,243 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-28 23:26:18,771 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-28 23:26:18,846 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-28 23:26:24,732 INFO [train.py:1039] (2/4) Epoch 5, batch 5150, loss[loss=0.2638, simple_loss=0.3184, pruned_loss=0.1046, over 23650.00 frames. ], tot_loss[loss=0.2491, simple_loss=0.3076, pruned_loss=0.09527, over 4708145.64 frames. ], batch size: 256, lr: 1.91e-02, grad_scale: 32.0 2023-09-28 23:26:24,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:26:24,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:26:24,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:26:25,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:26:25,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 23:26:26,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:26:26,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-28 23:26:26,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-28 23:26:28,433 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-28 23:26:28,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:26:28,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-28 23:26:30,089 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:26:31,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 23:26:33,581 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:26:35,109 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:26:40,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 23:26:40,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-28 23:26:41,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:26:41,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 23:26:42,178 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=176053.33333333334, ans=0.0 2023-09-28 23:26:43,528 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=176053.33333333334, ans=0.125 2023-09-28 23:26:44,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-28 23:26:44,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:26:44,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:26:45,024 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=176053.33333333334, ans=0.2 2023-09-28 23:26:46,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:26:46,374 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:26:46,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-28 23:26:47,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:26:48,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:26:50,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 23:26:52,023 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-28 23:26:53,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:26:59,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-28 23:27:01,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-28 23:27:05,260 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:27:07,070 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=176120.0, ans=0.125 2023-09-28 23:27:11,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:27:11,523 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=176120.0, ans=0.125 2023-09-28 23:27:13,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:27:16,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:27:16,368 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:27:19,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-28 23:27:24,585 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:27:26,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-28 23:27:26,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:27:29,760 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=176253.33333333334, ans=0.125 2023-09-28 23:27:30,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:27:31,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:27:32,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-28 23:27:34,616 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=176253.33333333334, ans=0.125 2023-09-28 23:27:37,268 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 2.252e+02 2.481e+02 2.857e+02 3.938e+02, threshold=4.962e+02, percent-clipped=0.0 2023-09-28 23:27:37,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:27:40,872 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 23:27:41,094 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=176253.33333333334, ans=0.125 2023-09-28 23:27:42,600 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:27:42,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:27:44,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-28 23:27:44,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-28 23:27:44,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:27:44,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:27:47,090 INFO [train.py:1039] (2/4) Epoch 5, batch 5200, loss[loss=0.2722, simple_loss=0.3354, pruned_loss=0.1045, over 24361.00 frames. ], tot_loss[loss=0.25, simple_loss=0.3086, pruned_loss=0.09564, over 4709579.45 frames. ], batch size: 77, lr: 1.91e-02, grad_scale: 32.0 2023-09-28 23:27:49,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:27:51,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:27:55,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:28:00,098 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=176320.0, ans=0.125 2023-09-28 23:28:01,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-28 23:28:01,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:28:02,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:28:03,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:28:04,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:28:06,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:28:06,468 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=176386.66666666666, ans=0.125 2023-09-28 23:28:07,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-28 23:28:09,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 23:28:09,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:28:12,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-28 23:28:16,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-28 23:28:17,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:28:17,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-28 23:28:17,807 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-28 23:28:18,391 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.43 vs. limit=15.0 2023-09-28 23:28:22,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-28 23:28:22,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:28:22,351 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-28 23:28:22,371 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:28:24,029 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=176453.33333333334, ans=0.1 2023-09-28 23:28:25,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:28:25,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:28:27,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-28 23:28:27,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:28:30,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:28:33,390 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=176453.33333333334, ans=0.125 2023-09-28 23:28:34,567 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-28 23:28:34,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-28 23:28:34,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-28 23:28:40,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-28 23:28:40,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 23:28:47,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:28:47,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:28:48,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-28 23:28:48,764 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:28:48,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-28 23:28:48,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:28:50,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 23:28:54,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:28:55,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:28:58,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:28:58,884 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=176586.66666666666, ans=0.125 2023-09-28 23:29:00,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:29:00,167 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:29:07,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:29:07,422 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-28 23:29:08,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:29:08,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:29:10,856 INFO [train.py:1039] (2/4) Epoch 5, batch 5250, loss[loss=0.2196, simple_loss=0.2989, pruned_loss=0.07019, over 24302.00 frames. ], tot_loss[loss=0.2485, simple_loss=0.3076, pruned_loss=0.09465, over 4720057.13 frames. ], batch size: 74, lr: 1.91e-02, grad_scale: 32.0 2023-09-28 23:29:10,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:29:11,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-28 23:29:11,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:29:13,090 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=176653.33333333334, ans=0.0 2023-09-28 23:29:14,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:29:17,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:29:17,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:29:18,866 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:29:23,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:29:25,868 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=176720.0, ans=0.0 2023-09-28 23:29:27,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:29:28,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:29:29,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 23:29:31,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-28 23:29:31,977 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:29:34,122 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:29:39,411 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=176720.0, ans=0.125 2023-09-28 23:29:42,027 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=176786.66666666666, ans=0.1 2023-09-28 23:29:52,313 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=176786.66666666666, ans=0.125 2023-09-28 23:29:52,660 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=24.20 vs. limit=15.0 2023-09-28 23:30:16,423 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.843e+02 2.340e+02 2.624e+02 3.163e+02 5.259e+02, threshold=5.248e+02, percent-clipped=2.0 2023-09-28 23:30:19,610 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=176920.0, ans=0.125 2023-09-28 23:30:24,702 INFO [train.py:1039] (2/4) Epoch 5, batch 5300, loss[loss=0.2387, simple_loss=0.2956, pruned_loss=0.09088, over 13496.00 frames. ], tot_loss[loss=0.2465, simple_loss=0.3049, pruned_loss=0.09409, over 4694367.41 frames. ], batch size: 28, lr: 1.90e-02, grad_scale: 32.0 2023-09-28 23:30:40,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:30:40,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-28 23:30:40,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 23:30:40,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:30:40,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:30:41,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:30:41,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:30:41,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:30:41,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:30:41,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:30:41,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 23:30:41,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:30:41,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 23:30:42,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-28 23:30:42,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 23:30:42,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-28 23:30:42,407 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 23:30:42,536 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 23:30:42,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:30:43,202 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:30:43,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:30:43,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:30:43,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:30:44,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:30:44,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:30:44,530 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:30:44,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:30:44,711 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:30:44,718 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:30:44,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:30:44,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:30:45,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 23:30:45,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:30:46,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:30:46,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 23:30:46,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 23:30:46,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:30:46,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:30:46,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 23:30:46,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-28 23:30:46,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 23:30:47,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:30:48,021 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:30:48,171 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 23:30:48,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 23:30:48,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 23:30:48,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:30:48,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 23:30:48,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 23:30:48,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-28 23:30:49,006 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-28 23:30:57,131 INFO [train.py:1039] (2/4) Epoch 6, batch 0, loss[loss=0.2357, simple_loss=0.3046, pruned_loss=0.08346, over 24656.00 frames. ], tot_loss[loss=0.2357, simple_loss=0.3046, pruned_loss=0.08346, over 24656.00 frames. ], batch size: 65, lr: 1.78e-02, grad_scale: 32.0 2023-09-28 23:30:57,132 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-28 23:31:12,847 INFO [train.py:1071] (2/4) Epoch 6, validation: loss=0.2892, simple_loss=0.2993, pruned_loss=0.1395, over 1125622.00 frames. 2023-09-28 23:31:12,848 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21046MB 2023-09-28 23:31:16,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-28 23:31:16,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:31:18,408 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:31:18,725 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=177066.66666666666, ans=0.125 2023-09-28 23:31:24,434 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:31:24,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:31:24,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:31:24,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-28 23:31:26,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-28 23:31:27,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:31:29,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:31:31,533 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=19.23 vs. limit=22.5 2023-09-28 23:31:32,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:31:32,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:31:34,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:31:34,083 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:31:35,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-28 23:31:38,607 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:31:45,563 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:31:45,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:31:49,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-28 23:31:53,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:31:53,720 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 23:31:55,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:31:55,538 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=177200.0, ans=0.1 2023-09-28 23:31:59,392 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.28 vs. limit=15.0 2023-09-28 23:32:00,041 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:32:04,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:32:04,777 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=177266.66666666666, ans=0.1 2023-09-28 23:32:10,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-28 23:32:12,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-28 23:32:13,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:32:13,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:32:15,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:32:17,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:32:18,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-28 23:32:19,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:32:19,782 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=177333.33333333334, ans=0.0 2023-09-28 23:32:23,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:32:24,101 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=177333.33333333334, ans=0.125 2023-09-28 23:32:27,426 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:32:31,997 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-28 23:32:33,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:32:34,992 INFO [train.py:1039] (2/4) Epoch 6, batch 50, loss[loss=0.2159, simple_loss=0.2736, pruned_loss=0.0791, over 24343.00 frames. ], tot_loss[loss=0.2447, simple_loss=0.3044, pruned_loss=0.09245, over 1065123.59 frames. ], batch size: 56, lr: 1.77e-02, grad_scale: 32.0 2023-09-28 23:32:38,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:32:38,405 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=177400.0, ans=0.125 2023-09-28 23:32:41,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:32:41,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-28 23:32:41,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 23:32:42,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:32:44,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:32:46,196 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:32:49,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:32:49,636 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=177466.66666666666, ans=0.1 2023-09-28 23:32:55,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-28 23:32:55,204 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:33:02,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-28 23:33:04,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-28 23:33:06,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-28 23:33:08,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:33:08,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:33:09,057 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.62 vs. limit=15.0 2023-09-28 23:33:09,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:33:10,210 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=177533.33333333334, ans=0.035 2023-09-28 23:33:11,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:33:11,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-28 23:33:13,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 23:33:13,025 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:33:13,639 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.04 vs. limit=15.0 2023-09-28 23:33:18,190 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=177533.33333333334, ans=0.1 2023-09-28 23:33:21,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:33:22,548 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:33:23,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 23:33:25,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-28 23:33:26,831 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.775e+02 2.186e+02 2.592e+02 3.142e+02 7.850e+02, threshold=5.184e+02, percent-clipped=2.0 2023-09-28 23:33:27,049 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 23:33:28,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 23:33:28,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-28 23:33:28,899 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=177600.0, ans=0.125 2023-09-28 23:33:29,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:33:30,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-28 23:33:39,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:33:39,886 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:33:41,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:33:43,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:33:43,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-28 23:33:47,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-28 23:33:47,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-28 23:33:47,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:33:48,025 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=14.47 vs. limit=15.0 2023-09-28 23:33:48,798 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-28 23:33:50,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:33:50,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:33:51,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-28 23:33:51,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-28 23:33:54,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-28 23:33:54,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:33:54,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:33:56,336 INFO [train.py:1039] (2/4) Epoch 6, batch 100, loss[loss=0.2424, simple_loss=0.3016, pruned_loss=0.09162, over 23645.00 frames. ], tot_loss[loss=0.2428, simple_loss=0.3037, pruned_loss=0.09096, over 1866534.18 frames. ], batch size: 135, lr: 1.77e-02, grad_scale: 32.0 2023-09-28 23:33:56,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-28 23:33:56,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-28 23:33:58,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:33:58,108 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:34:00,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-28 23:34:01,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:34:02,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:34:05,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:34:08,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:34:11,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-28 23:34:12,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:34:17,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:34:17,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:34:17,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:34:17,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:34:19,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:34:19,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-28 23:34:21,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-28 23:34:21,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:34:21,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:34:21,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:34:25,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-28 23:34:25,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:34:27,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:34:28,333 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.18 vs. limit=22.5 2023-09-28 23:34:28,926 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-28 23:34:29,796 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.34 vs. limit=22.5 2023-09-28 23:34:30,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 23:34:34,186 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-28 23:34:34,210 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-28 23:34:37,230 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:34:37,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:34:40,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-28 23:34:42,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:34:43,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:34:51,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:34:52,757 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-28 23:34:54,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-28 23:34:58,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:34:59,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:35:01,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:35:04,486 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=178000.0, ans=0.0 2023-09-28 23:35:05,579 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:35:07,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:35:08,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:35:10,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:35:11,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:35:14,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:35:15,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:35:15,043 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:35:16,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-28 23:35:16,974 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.95 vs. limit=12.0 2023-09-28 23:35:18,629 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-28 23:35:18,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:35:18,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:35:20,652 INFO [train.py:1039] (2/4) Epoch 6, batch 150, loss[loss=0.2308, simple_loss=0.2847, pruned_loss=0.08841, over 23640.00 frames. ], tot_loss[loss=0.2417, simple_loss=0.3029, pruned_loss=0.0903, over 2509624.29 frames. ], batch size: 135, lr: 1.77e-02, grad_scale: 32.0 2023-09-28 23:35:20,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:35:20,828 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:35:20,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 23:35:21,565 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=15.14 vs. limit=15.0 2023-09-28 23:35:22,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 23:35:22,391 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-28 23:35:22,400 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:35:22,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:35:24,570 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:35:24,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:35:24,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:35:27,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:35:31,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:35:31,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:35:31,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:35:33,388 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=178066.66666666666, ans=0.0 2023-09-28 23:35:33,684 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.96 vs. limit=22.5 2023-09-28 23:35:34,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:35:35,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:35:37,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:35:39,152 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:35:42,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-28 23:35:42,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-28 23:35:42,468 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-28 23:35:47,008 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:35:47,015 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 23:35:47,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:35:48,785 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:35:48,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:35:50,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:35:50,389 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:35:54,350 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-28 23:35:56,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:36:01,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:36:04,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:36:06,500 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-28 23:36:09,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:36:09,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:36:10,930 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:36:12,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:36:13,833 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.731e+02 2.160e+02 2.435e+02 3.119e+02 4.742e+02, threshold=4.869e+02, percent-clipped=0.0 2023-09-28 23:36:15,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:36:15,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:36:16,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:36:17,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-28 23:36:23,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:36:25,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:36:25,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:36:25,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:36:28,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:36:31,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 23:36:33,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:36:34,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:36:36,935 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:36:38,455 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-28 23:36:38,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-28 23:36:38,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:36:38,556 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-28 23:36:38,882 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=178333.33333333334, ans=0.0 2023-09-28 23:36:42,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:36:43,673 INFO [train.py:1039] (2/4) Epoch 6, batch 200, loss[loss=0.2431, simple_loss=0.3042, pruned_loss=0.09098, over 24463.00 frames. ], tot_loss[loss=0.2418, simple_loss=0.3032, pruned_loss=0.09019, over 3013011.91 frames. ], batch size: 63, lr: 1.77e-02, grad_scale: 32.0 2023-09-28 23:36:46,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:36:46,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:36:48,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-28 23:36:50,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:36:50,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:36:51,858 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-28 23:36:53,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-28 23:36:54,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:36:56,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:37:01,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:37:01,639 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:37:01,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:37:01,868 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=178466.66666666666, ans=0.125 2023-09-28 23:37:06,525 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=178466.66666666666, ans=0.125 2023-09-28 23:37:24,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:37:24,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:37:24,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:37:25,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:37:26,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 23:37:26,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 23:37:29,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:37:30,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 23:37:32,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:37:32,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:37:32,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-28 23:37:33,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 23:37:33,844 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:37:38,522 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.11 vs. limit=10.0 2023-09-28 23:37:39,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:37:41,170 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer_ff2.min_abs, batch_count=178600.0, ans=0.1 2023-09-28 23:37:42,751 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=178600.0, ans=0.5 2023-09-28 23:37:44,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:37:51,851 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:37:53,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:37:59,674 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:38:02,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-28 23:38:02,711 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:38:02,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:38:02,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:38:04,290 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:38:04,600 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=178733.33333333334, ans=0.0 2023-09-28 23:38:05,646 INFO [train.py:1039] (2/4) Epoch 6, batch 250, loss[loss=0.2357, simple_loss=0.3082, pruned_loss=0.08163, over 24680.00 frames. ], tot_loss[loss=0.2427, simple_loss=0.304, pruned_loss=0.09068, over 3381055.97 frames. ], batch size: 73, lr: 1.77e-02, grad_scale: 32.0 2023-09-28 23:38:07,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-28 23:38:07,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:38:07,844 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.44 vs. limit=15.0 2023-09-28 23:38:08,679 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-28 23:38:10,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:38:12,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:38:12,510 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:38:12,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:38:17,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:38:18,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:38:19,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:38:23,028 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=178800.0, ans=0.025 2023-09-28 23:38:24,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:38:26,234 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=178800.0, ans=0.0 2023-09-28 23:38:31,845 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=178800.0, ans=0.125 2023-09-28 23:38:35,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:38:37,630 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:38:39,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:38:46,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-28 23:38:46,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-28 23:38:47,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:38:47,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:38:49,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 23:38:49,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:38:49,396 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=178866.66666666666, ans=0.2 2023-09-28 23:38:51,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:38:52,814 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:38:56,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-28 23:38:56,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:38:57,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:38:59,148 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 2.163e+02 2.470e+02 2.985e+02 4.206e+02, threshold=4.941e+02, percent-clipped=0.0 2023-09-28 23:38:59,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-28 23:38:59,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:38:59,616 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=178933.33333333334, ans=0.125 2023-09-28 23:39:01,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:39:02,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 23:39:02,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 23:39:03,174 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=178933.33333333334, ans=0.125 2023-09-28 23:39:04,641 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:39:06,093 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:39:06,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:39:09,329 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:39:12,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:39:14,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:39:19,723 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=179000.0, ans=0.025 2023-09-28 23:39:19,739 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=179000.0, ans=0.0 2023-09-28 23:39:21,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:39:23,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:39:26,425 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-28 23:39:28,650 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:39:28,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:39:29,810 INFO [train.py:1039] (2/4) Epoch 6, batch 300, loss[loss=0.25, simple_loss=0.294, pruned_loss=0.103, over 23800.00 frames. ], tot_loss[loss=0.2414, simple_loss=0.3025, pruned_loss=0.09017, over 3670068.28 frames. ], batch size: 164, lr: 1.77e-02, grad_scale: 32.0 2023-09-28 23:39:30,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-28 23:39:30,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-28 23:39:31,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:39:31,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-28 23:39:36,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:39:38,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:39:40,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:39:41,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-28 23:39:41,744 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:39:43,607 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=179066.66666666666, ans=0.0 2023-09-28 23:39:44,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 23:39:44,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-28 23:39:44,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:39:46,641 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=179133.33333333334, ans=0.0 2023-09-28 23:39:49,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:39:54,548 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:39:54,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-28 23:39:58,586 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-28 23:39:58,663 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:40:02,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:40:03,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:40:03,635 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-28 23:40:03,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:40:06,191 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.47 vs. limit=15.0 2023-09-28 23:40:06,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:40:06,806 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=179200.0, ans=0.125 2023-09-28 23:40:10,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:40:10,117 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:40:14,812 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-28 23:40:14,819 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-28 23:40:16,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:40:17,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:40:19,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-28 23:40:21,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:40:23,410 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.83 vs. limit=15.0 2023-09-28 23:40:24,942 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:40:28,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:40:28,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-28 23:40:31,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:40:31,656 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 23:40:35,461 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:40:36,968 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:40:38,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-28 23:40:38,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 23:40:39,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:40:40,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-28 23:40:40,317 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=179333.33333333334, ans=0.125 2023-09-28 23:40:43,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:40:43,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:40:44,014 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=179333.33333333334, ans=0.125 2023-09-28 23:40:45,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:40:45,424 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=179333.33333333334, ans=0.2 2023-09-28 23:40:46,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:40:46,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:40:52,683 INFO [train.py:1039] (2/4) Epoch 6, batch 350, loss[loss=0.2313, simple_loss=0.3061, pruned_loss=0.07819, over 24418.00 frames. ], tot_loss[loss=0.2401, simple_loss=0.3013, pruned_loss=0.08945, over 3906263.33 frames. ], batch size: 69, lr: 1.77e-02, grad_scale: 32.0 2023-09-28 23:40:52,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:40:52,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 23:40:55,803 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:40:57,407 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=179400.0, ans=0.125 2023-09-28 23:41:03,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:41:07,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:41:07,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:41:10,797 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-28 23:41:10,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:41:11,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-28 23:41:14,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:41:16,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-28 23:41:16,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:41:21,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-28 23:41:21,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:41:24,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:41:24,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:41:24,515 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=179533.33333333334, ans=0.125 2023-09-28 23:41:25,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:41:25,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:41:27,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:41:27,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:41:28,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:41:30,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:41:30,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:41:39,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:41:39,391 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-28 23:41:40,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:41:40,897 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:41:45,461 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.694e+02 2.204e+02 2.490e+02 2.803e+02 5.345e+02, threshold=4.981e+02, percent-clipped=1.0 2023-09-28 23:41:47,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-28 23:41:47,125 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:41:54,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:41:54,509 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:41:54,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:41:56,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-28 23:41:56,498 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=179600.0, ans=0.0 2023-09-28 23:41:57,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:41:59,173 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-28 23:42:00,694 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-28 23:42:00,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:42:03,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:42:03,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-28 23:42:05,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:42:09,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:42:09,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:42:12,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:42:12,108 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:42:12,383 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=179666.66666666666, ans=0.125 2023-09-28 23:42:15,068 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.17 vs. limit=15.0 2023-09-28 23:42:15,711 INFO [train.py:1039] (2/4) Epoch 6, batch 400, loss[loss=0.2101, simple_loss=0.2761, pruned_loss=0.07207, over 24240.00 frames. ], tot_loss[loss=0.2394, simple_loss=0.3009, pruned_loss=0.08897, over 4102491.02 frames. ], batch size: 56, lr: 1.76e-02, grad_scale: 32.0 2023-09-28 23:42:15,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:42:17,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:42:19,325 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=179733.33333333334, ans=0.1 2023-09-28 23:42:21,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:42:21,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-28 23:42:21,933 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:42:23,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:42:25,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:42:27,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:42:29,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:42:29,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:42:31,066 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=179800.0, ans=0.125 2023-09-28 23:42:32,299 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-28 23:42:33,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-28 23:42:33,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:42:35,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-28 23:42:35,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:42:39,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:42:39,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:42:40,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-28 23:42:41,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:42:41,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:42:41,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:42:41,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:42:43,213 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-28 23:42:43,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-28 23:42:50,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:42:50,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:42:52,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-28 23:42:53,934 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-28 23:42:54,629 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.00 vs. limit=15.0 2023-09-28 23:42:57,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:42:59,390 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:43:05,010 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-28 23:43:08,062 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-28 23:43:10,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-28 23:43:12,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:43:14,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-28 23:43:15,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-28 23:43:19,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:43:19,568 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=179933.33333333334, ans=0.125 2023-09-28 23:43:21,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 23:43:23,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:43:24,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:43:24,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-28 23:43:29,100 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-28 23:43:31,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-28 23:43:32,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 23:43:32,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:43:36,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-28 23:43:38,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 23:43:38,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:43:38,316 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-28 23:43:39,730 INFO [train.py:1039] (2/4) Epoch 6, batch 450, loss[loss=0.2498, simple_loss=0.3196, pruned_loss=0.08999, over 24582.00 frames. ], tot_loss[loss=0.2404, simple_loss=0.3018, pruned_loss=0.08944, over 4240637.56 frames. ], batch size: 71, lr: 1.76e-02, grad_scale: 32.0 2023-09-28 23:43:41,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-28 23:43:41,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:43:41,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:43:42,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:43:42,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-28 23:43:43,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:43:44,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 23:43:46,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:43:47,932 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=180066.66666666666, ans=0.1 2023-09-28 23:43:59,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:43:59,721 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:44:01,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-28 23:44:02,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-28 23:44:06,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-28 23:44:10,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:44:12,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:44:17,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:44:17,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:44:19,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-28 23:44:20,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-28 23:44:22,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-28 23:44:22,294 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:44:23,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:44:25,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:44:28,909 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-28 23:44:28,922 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-28 23:44:28,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:44:30,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:44:31,953 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.626e+02 2.131e+02 2.453e+02 2.864e+02 4.653e+02, threshold=4.906e+02, percent-clipped=0.0 2023-09-28 23:44:32,169 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-28 23:44:37,499 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-28 23:44:37,565 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:44:38,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-28 23:44:39,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-28 23:44:42,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:44:44,269 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-28 23:44:45,702 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 23:44:45,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-28 23:44:46,168 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=180333.33333333334, ans=0.1 2023-09-28 23:44:50,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:44:50,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-28 23:44:52,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-28 23:44:53,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:44:55,744 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=180333.33333333334, ans=0.125 2023-09-28 23:44:57,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:44:59,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:45:00,140 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:45:00,185 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-28 23:45:02,014 INFO [train.py:1039] (2/4) Epoch 6, batch 500, loss[loss=0.2349, simple_loss=0.2952, pruned_loss=0.0873, over 23606.00 frames. ], tot_loss[loss=0.2411, simple_loss=0.3023, pruned_loss=0.08988, over 4352771.91 frames. ], batch size: 149, lr: 1.76e-02, grad_scale: 32.0 2023-09-28 23:45:05,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:45:06,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:45:06,842 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:45:06,858 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-28 23:45:08,591 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=180400.0, ans=0.0 2023-09-28 23:45:09,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-28 23:45:09,788 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:45:13,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 23:45:17,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 23:45:17,572 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:45:20,652 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:45:20,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:45:20,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:45:31,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:45:33,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-28 23:45:33,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-28 23:45:33,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:45:33,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-28 23:45:35,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:45:38,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:45:40,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:45:40,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:45:40,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:45:40,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-28 23:45:43,404 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-28 23:45:47,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:45:48,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:45:48,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:45:50,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:45:50,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-28 23:45:52,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-28 23:45:54,120 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=180600.0, ans=0.0 2023-09-28 23:45:55,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:45:57,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:46:01,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:46:06,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:46:14,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:46:16,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-28 23:46:16,271 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:46:16,291 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:46:17,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-28 23:46:19,442 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-28 23:46:19,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:46:24,612 INFO [train.py:1039] (2/4) Epoch 6, batch 550, loss[loss=0.2767, simple_loss=0.3186, pruned_loss=0.1174, over 23665.00 frames. ], tot_loss[loss=0.2412, simple_loss=0.3027, pruned_loss=0.08991, over 4434508.93 frames. ], batch size: 232, lr: 1.76e-02, grad_scale: 32.0 2023-09-28 23:46:28,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-28 23:46:28,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-28 23:46:28,500 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:46:29,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-28 23:46:30,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:46:30,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:46:30,205 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:46:32,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:46:33,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:46:33,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:46:36,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:46:37,117 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=180733.33333333334, ans=0.2 2023-09-28 23:46:38,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-28 23:46:38,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:46:43,896 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:46:43,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:46:45,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:46:47,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:46:47,956 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.69 vs. limit=15.0 2023-09-28 23:46:51,789 WARNING [train.py:1197] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-28 23:46:51,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-28 23:46:53,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:47:00,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:47:00,652 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 23:47:02,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-28 23:47:05,202 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:47:05,211 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-28 23:47:06,049 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.44 vs. limit=6.0 2023-09-28 23:47:07,351 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:47:08,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 23:47:13,251 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 23:47:13,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 23:47:13,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-28 23:47:13,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:47:15,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-28 23:47:17,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-28 23:47:17,388 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=180933.33333333334, ans=0.0 2023-09-28 23:47:18,451 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.759e+02 2.230e+02 2.579e+02 3.045e+02 5.000e+02, threshold=5.158e+02, percent-clipped=1.0 2023-09-28 23:47:18,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:47:18,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:47:20,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:47:20,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:47:23,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:47:24,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-28 23:47:26,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:47:27,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:47:29,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 23:47:29,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:47:33,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:47:34,533 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:47:34,624 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:47:36,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-28 23:47:36,808 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-28 23:47:41,549 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=181000.0, ans=0.125 2023-09-28 23:47:44,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-28 23:47:45,548 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.60 vs. limit=15.0 2023-09-28 23:47:47,846 INFO [train.py:1039] (2/4) Epoch 6, batch 600, loss[loss=0.2413, simple_loss=0.2886, pruned_loss=0.09701, over 23703.00 frames. ], tot_loss[loss=0.2412, simple_loss=0.3024, pruned_loss=0.09003, over 4499450.35 frames. ], batch size: 232, lr: 1.76e-02, grad_scale: 32.0 2023-09-28 23:47:49,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-28 23:47:50,050 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.92 vs. limit=10.0 2023-09-28 23:47:51,548 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:47:51,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 23:47:51,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:47:57,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:47:59,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 23:48:01,011 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-28 23:48:03,973 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-28 23:48:06,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:48:07,787 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:48:09,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-28 23:48:09,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:48:12,980 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=181133.33333333334, ans=0.0 2023-09-28 23:48:17,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-28 23:48:21,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:48:21,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:48:21,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:48:29,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:48:29,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:48:29,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:48:30,707 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.78 vs. limit=15.0 2023-09-28 23:48:34,420 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=181200.0, ans=0.035 2023-09-28 23:48:38,746 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 23:48:41,764 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.04 vs. limit=22.5 2023-09-28 23:48:42,731 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:48:42,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:48:42,762 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:48:48,464 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=181266.66666666666, ans=0.125 2023-09-28 23:48:50,251 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=181266.66666666666, ans=0.125 2023-09-28 23:48:51,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-28 23:48:56,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-28 23:48:56,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:49:03,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-28 23:49:03,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:49:06,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-28 23:49:06,309 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:49:06,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 23:49:10,994 INFO [train.py:1039] (2/4) Epoch 6, batch 650, loss[loss=0.2498, simple_loss=0.2888, pruned_loss=0.1054, over 23598.00 frames. ], tot_loss[loss=0.2394, simple_loss=0.3014, pruned_loss=0.08869, over 4564152.32 frames. ], batch size: 256, lr: 1.76e-02, grad_scale: 32.0 2023-09-28 23:49:13,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 23:49:13,483 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=181400.0, ans=0.125 2023-09-28 23:49:14,580 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:49:16,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:49:16,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:49:16,614 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=181400.0, ans=0.125 2023-09-28 23:49:19,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:49:23,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-28 23:49:23,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:49:30,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:49:30,085 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:49:35,149 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:49:38,405 WARNING [train.py:1197] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-28 23:49:39,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:49:40,048 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:49:40,840 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.whiten.whitening_limit, batch_count=181466.66666666666, ans=12.0 2023-09-28 23:49:43,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:49:45,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 23:49:45,952 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=181533.33333333334, ans=0.0 2023-09-28 23:49:46,378 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=16.48 vs. limit=15.0 2023-09-28 23:49:47,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:49:48,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:49:48,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 23:49:50,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:49:51,177 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 23:49:52,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:49:54,006 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-28 23:49:54,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:49:54,065 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:49:57,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:49:58,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:49:58,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:49:58,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-28 23:50:00,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-28 23:50:01,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:50:01,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:50:05,131 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.839e+02 2.251e+02 2.578e+02 2.975e+02 4.088e+02, threshold=5.156e+02, percent-clipped=0.0 2023-09-28 23:50:05,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-28 23:50:05,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:50:05,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 23:50:08,840 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-28 23:50:09,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-28 23:50:09,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:50:09,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:50:09,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:50:09,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:50:10,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:50:18,911 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:50:18,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:50:21,766 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:50:24,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:50:24,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 23:50:25,509 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:50:27,271 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=181666.66666666666, ans=0.125 2023-09-28 23:50:33,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 23:50:33,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:50:33,176 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:50:34,445 INFO [train.py:1039] (2/4) Epoch 6, batch 700, loss[loss=0.2104, simple_loss=0.2799, pruned_loss=0.07041, over 22145.00 frames. ], tot_loss[loss=0.2382, simple_loss=0.2993, pruned_loss=0.08859, over 4575887.77 frames. ], batch size: 48, lr: 1.76e-02, grad_scale: 32.0 2023-09-28 23:50:34,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:50:37,774 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-28 23:50:37,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-28 23:50:41,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-28 23:50:43,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:50:45,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:50:48,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-28 23:50:51,997 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:50:55,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:50:57,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:50:58,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-28 23:50:58,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:51:00,079 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=181800.0, ans=0.125 2023-09-28 23:51:02,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:51:05,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 23:51:05,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:51:08,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-28 23:51:12,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-28 23:51:15,687 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-28 23:51:15,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:51:17,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-28 23:51:22,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:51:22,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-28 23:51:29,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:51:29,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:51:29,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-28 23:51:34,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:51:34,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:51:36,643 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=181933.33333333334, ans=0.1 2023-09-28 23:51:37,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:51:42,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=182000.0, ans=0.125 2023-09-28 23:51:44,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:51:44,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-28 23:51:47,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-28 23:51:47,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-28 23:51:48,505 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.08 vs. limit=15.0 2023-09-28 23:51:51,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:51:52,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:51:54,401 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:51:56,066 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:51:56,075 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-28 23:51:57,417 INFO [train.py:1039] (2/4) Epoch 6, batch 750, loss[loss=0.2577, simple_loss=0.3288, pruned_loss=0.09325, over 24452.00 frames. ], tot_loss[loss=0.2379, simple_loss=0.2989, pruned_loss=0.08841, over 4618139.42 frames. ], batch size: 69, lr: 1.75e-02, grad_scale: 32.0 2023-09-28 23:52:02,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-28 23:52:02,148 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-28 23:52:03,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-28 23:52:03,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-28 23:52:05,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-28 23:52:05,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:52:07,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-28 23:52:09,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:52:09,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-28 23:52:12,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:52:14,053 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:52:15,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-28 23:52:15,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:52:17,286 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:52:18,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:52:21,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:52:22,850 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=182133.33333333334, ans=0.1 2023-09-28 23:52:24,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:52:25,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:52:25,501 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-28 23:52:26,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:52:28,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:52:30,766 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:52:32,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-28 23:52:32,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-28 23:52:32,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:52:35,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-28 23:52:35,508 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-28 23:52:36,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-28 23:52:37,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:52:37,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 23:52:39,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:52:44,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-28 23:52:44,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:52:44,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 23:52:47,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:52:49,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:52:51,234 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.788e+02 2.168e+02 2.495e+02 2.811e+02 4.815e+02, threshold=4.990e+02, percent-clipped=0.0 2023-09-28 23:52:51,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-28 23:52:51,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:52:52,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-28 23:52:53,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:52:56,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:52:56,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-28 23:52:57,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:53:03,022 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.89 vs. limit=22.5 2023-09-28 23:53:05,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:53:05,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:53:07,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:53:09,148 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=182333.33333333334, ans=0.1 2023-09-28 23:53:10,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:53:14,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-28 23:53:14,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:53:14,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:53:16,877 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:53:17,081 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=182333.33333333334, ans=0.07 2023-09-28 23:53:18,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:53:21,138 INFO [train.py:1039] (2/4) Epoch 6, batch 800, loss[loss=0.2657, simple_loss=0.305, pruned_loss=0.1132, over 19393.00 frames. ], tot_loss[loss=0.2377, simple_loss=0.2992, pruned_loss=0.08808, over 4643985.49 frames. ], batch size: 388, lr: 1.75e-02, grad_scale: 32.0 2023-09-28 23:53:21,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:53:21,306 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-28 23:53:29,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:53:29,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:53:31,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:53:31,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:53:34,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:53:34,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:53:35,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:53:40,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:53:40,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 23:53:44,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-28 23:53:44,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:53:45,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:53:47,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:53:47,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:53:47,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-28 23:53:47,271 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:53:47,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-28 23:53:49,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:53:51,692 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:53:54,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:53:54,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:53:56,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:53:58,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:53:58,688 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=182533.33333333334, ans=0.0 2023-09-28 23:54:02,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:54:03,131 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=182533.33333333334, ans=0.125 2023-09-28 23:54:04,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 23:54:04,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-28 23:54:07,676 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-28 23:54:09,188 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-28 23:54:09,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 23:54:09,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:54:12,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:54:12,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:54:12,871 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.54 vs. limit=15.0 2023-09-28 23:54:18,757 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-28 23:54:18,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-28 23:54:21,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:54:23,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:54:27,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:54:30,471 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:54:31,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-28 23:54:33,365 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:54:35,657 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.87 vs. limit=15.0 2023-09-28 23:54:37,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-28 23:54:39,105 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=182666.66666666666, ans=0.1 2023-09-28 23:54:42,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 23:54:43,407 INFO [train.py:1039] (2/4) Epoch 6, batch 850, loss[loss=0.2678, simple_loss=0.3282, pruned_loss=0.1037, over 24056.00 frames. ], tot_loss[loss=0.2391, simple_loss=0.3002, pruned_loss=0.08902, over 4667895.06 frames. ], batch size: 80, lr: 1.75e-02, grad_scale: 32.0 2023-09-28 23:54:45,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:54:45,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-28 23:54:45,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:54:48,178 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:54:48,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-28 23:54:48,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:54:49,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:54:52,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:54:53,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 23:54:54,656 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.57 vs. limit=5.0 2023-09-28 23:54:55,082 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:54:56,547 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-28 23:54:56,620 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-28 23:54:58,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-28 23:54:59,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 23:55:00,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:55:02,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:55:03,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:55:03,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:55:08,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:55:08,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:55:10,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-28 23:55:11,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-28 23:55:14,876 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:55:16,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-28 23:55:16,644 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=182866.66666666666, ans=0.2 2023-09-28 23:55:20,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-28 23:55:22,599 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-28 23:55:24,275 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-28 23:55:25,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:55:25,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:55:25,662 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 23:55:27,809 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:55:29,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:55:29,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-28 23:55:33,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:55:33,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:55:35,129 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:55:36,630 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-28 23:55:37,985 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.793e+02 2.156e+02 2.378e+02 2.757e+02 3.805e+02, threshold=4.755e+02, percent-clipped=0.0 2023-09-28 23:55:38,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:55:39,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-28 23:55:39,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-28 23:55:42,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:55:42,953 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:55:45,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:55:45,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:55:46,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:55:48,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:55:50,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:55:52,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-28 23:55:53,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:55:53,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-28 23:56:02,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-28 23:56:03,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:56:04,197 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.88 vs. limit=10.0 2023-09-28 23:56:05,004 INFO [train.py:1039] (2/4) Epoch 6, batch 900, loss[loss=0.2266, simple_loss=0.2948, pruned_loss=0.07922, over 24634.00 frames. ], tot_loss[loss=0.2404, simple_loss=0.3016, pruned_loss=0.08954, over 4693346.87 frames. ], batch size: 65, lr: 1.75e-02, grad_scale: 32.0 2023-09-28 23:56:05,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-28 23:56:05,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:56:05,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:56:07,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-28 23:56:13,696 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:56:14,270 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.37 vs. limit=22.5 2023-09-28 23:56:18,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:56:18,904 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=183066.66666666666, ans=0.0 2023-09-28 23:56:20,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-28 23:56:21,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 23:56:22,296 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.33 vs. limit=12.0 2023-09-28 23:56:23,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-28 23:56:23,431 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-28 23:56:23,587 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=183133.33333333334, ans=0.1 2023-09-28 23:56:23,639 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=183133.33333333334, ans=0.1 2023-09-28 23:56:24,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:56:24,948 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:56:25,013 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 23:56:25,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:56:28,301 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=183133.33333333334, ans=0.1 2023-09-28 23:56:38,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:56:38,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:56:38,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 23:56:40,778 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=183200.0, ans=0.0 2023-09-28 23:56:42,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:56:47,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-28 23:56:47,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:56:52,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-28 23:56:52,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-28 23:56:52,984 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-28 23:56:54,409 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-28 23:57:00,593 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-28 23:57:00,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:57:01,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 23:57:09,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:57:09,574 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:57:11,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-28 23:57:11,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:57:14,212 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-28 23:57:15,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:57:15,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:57:17,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:57:19,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:57:23,176 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-28 23:57:23,228 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-28 23:57:24,810 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-28 23:57:24,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-28 23:57:27,992 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=183400.0, ans=0.1 2023-09-28 23:57:29,115 INFO [train.py:1039] (2/4) Epoch 6, batch 950, loss[loss=0.2145, simple_loss=0.2843, pruned_loss=0.07235, over 24668.00 frames. ], tot_loss[loss=0.2415, simple_loss=0.3026, pruned_loss=0.09022, over 4701715.86 frames. ], batch size: 65, lr: 1.75e-02, grad_scale: 16.0 2023-09-28 23:57:29,180 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:57:32,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-28 23:57:38,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:57:39,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:57:39,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:57:41,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 23:57:42,028 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=183400.0, ans=0.125 2023-09-28 23:57:43,115 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-28 23:57:46,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:57:48,458 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:57:50,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:57:50,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:57:50,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-28 23:57:50,601 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-28 23:57:53,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:57:55,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-28 23:57:55,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:57:59,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:57:59,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:57:59,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:58:00,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-28 23:58:01,519 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.35 vs. limit=12.0 2023-09-28 23:58:02,112 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 23:58:05,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:58:06,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 23:58:11,452 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:58:11,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:58:14,704 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-28 23:58:18,469 WARNING [train.py:1197] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 23:58:18,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:58:18,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:58:20,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:58:20,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:58:25,029 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.674e+02 2.262e+02 2.553e+02 3.079e+02 4.621e+02, threshold=5.106e+02, percent-clipped=0.0 2023-09-28 23:58:25,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-28 23:58:26,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:58:29,727 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:58:29,819 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:58:29,859 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-28 23:58:32,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:58:32,036 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:58:32,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-28 23:58:34,121 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=183666.66666666666, ans=0.0 2023-09-28 23:58:36,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:58:38,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:58:43,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:58:43,591 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=183666.66666666666, ans=0.0 2023-09-28 23:58:44,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-28 23:58:44,872 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-28 23:58:49,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:58:49,791 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=183733.33333333334, ans=0.0 2023-09-28 23:58:51,488 INFO [train.py:1039] (2/4) Epoch 6, batch 1000, loss[loss=0.2246, simple_loss=0.2897, pruned_loss=0.07978, over 24641.00 frames. ], tot_loss[loss=0.2412, simple_loss=0.3013, pruned_loss=0.09054, over 4699892.09 frames. ], batch size: 60, lr: 1.75e-02, grad_scale: 16.0 2023-09-28 23:58:53,298 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-28 23:58:54,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:59:00,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:59:00,867 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=8.08 vs. limit=15.0 2023-09-28 23:59:01,605 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-28 23:59:01,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-28 23:59:08,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:59:08,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:59:08,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:59:08,929 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=183800.0, ans=0.025 2023-09-28 23:59:09,039 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=183800.0, ans=0.04949747468305833 2023-09-28 23:59:12,162 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=183800.0, ans=0.125 2023-09-28 23:59:13,278 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-28 23:59:16,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-28 23:59:19,036 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.30 vs. limit=15.0 2023-09-28 23:59:19,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-28 23:59:19,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:59:21,051 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-28 23:59:22,723 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-28 23:59:22,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-28 23:59:24,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:59:24,628 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=183866.66666666666, ans=0.2 2023-09-28 23:59:26,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:59:26,888 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=183866.66666666666, ans=0.125 2023-09-28 23:59:29,071 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=183866.66666666666, ans=0.125 2023-09-28 23:59:35,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:59:35,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:59:37,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:59:38,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:59:38,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-28 23:59:38,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:59:40,129 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:59:40,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:59:41,696 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-28 23:59:44,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-28 23:59:44,453 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=183933.33333333334, ans=0.1 2023-09-28 23:59:45,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-28 23:59:48,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-28 23:59:50,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:59:55,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:59:56,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-28 23:59:56,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:59:56,884 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=184000.0, ans=0.125 2023-09-28 23:59:58,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:00:00,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 00:00:03,858 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:00:03,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 00:00:04,512 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=6.27 vs. limit=12.0 2023-09-29 00:00:05,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 00:00:07,056 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:00:07,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:00:09,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:00:12,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 00:00:15,325 INFO [train.py:1039] (2/4) Epoch 6, batch 1050, loss[loss=0.2494, simple_loss=0.319, pruned_loss=0.08987, over 24335.00 frames. ], tot_loss[loss=0.2396, simple_loss=0.3003, pruned_loss=0.08951, over 4703253.15 frames. ], batch size: 77, lr: 1.74e-02, grad_scale: 16.0 2023-09-29 00:00:15,399 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:00:19,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:00:20,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:00:22,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 00:00:24,015 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:00:24,379 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=184066.66666666666, ans=0.05 2023-09-29 00:00:25,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:00:27,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:00:28,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:00:30,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:00:32,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:00:32,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 00:00:33,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:00:33,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 00:00:34,214 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=184133.33333333334, ans=0.2 2023-09-29 00:00:36,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:00:36,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 00:00:39,973 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:00:39,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 00:00:40,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 00:00:47,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:00:48,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:00:48,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:00:48,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=184200.0, ans=0.035 2023-09-29 00:00:50,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 00:00:52,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 00:00:52,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:00:53,219 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.31 vs. limit=6.0 2023-09-29 00:00:54,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 00:00:57,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 00:00:57,371 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=184200.0, ans=0.0 2023-09-29 00:00:58,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:01:00,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 00:01:01,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 00:01:01,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:01:02,052 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:01:02,300 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=184200.0, ans=0.0 2023-09-29 00:01:08,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:01:12,488 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 00:01:14,312 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 2.070e+02 2.251e+02 2.597e+02 4.023e+02, threshold=4.502e+02, percent-clipped=0.0 2023-09-29 00:01:14,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 00:01:14,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 00:01:16,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:01:16,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:01:17,688 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 00:01:22,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:01:24,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:01:24,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:01:24,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:01:24,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:01:30,080 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.89 vs. limit=15.0 2023-09-29 00:01:30,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:01:30,725 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 00:01:32,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:01:32,338 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 00:01:32,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 00:01:33,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:01:36,202 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.03 vs. limit=15.0 2023-09-29 00:01:38,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:01:39,990 INFO [train.py:1039] (2/4) Epoch 6, batch 1100, loss[loss=0.2226, simple_loss=0.2901, pruned_loss=0.07753, over 24627.00 frames. ], tot_loss[loss=0.2381, simple_loss=0.2992, pruned_loss=0.08851, over 4704309.06 frames. ], batch size: 65, lr: 1.74e-02, grad_scale: 16.0 2023-09-29 00:01:43,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:01:49,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 00:01:51,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:01:51,421 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:01:51,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 00:01:53,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:01:56,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 00:01:58,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:02:01,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:02:01,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 00:02:03,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 00:02:04,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:02:04,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:02:07,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:02:08,167 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=184466.66666666666, ans=0.0 2023-09-29 00:02:09,491 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 00:02:14,035 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:02:17,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 00:02:18,005 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 00:02:20,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:02:23,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:02:23,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 00:02:25,190 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:02:26,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 00:02:28,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:02:28,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:02:28,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:02:28,270 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:02:28,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 00:02:32,589 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=184600.0, ans=0.125 2023-09-29 00:02:35,368 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:02:35,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 00:02:36,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 00:02:41,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 00:02:44,838 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 00:02:44,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 00:02:46,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:02:48,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:02:49,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:02:51,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 00:02:53,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:02:53,388 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:02:55,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 00:02:55,689 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:02:57,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 00:02:58,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:02:58,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:03:00,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 00:03:04,731 INFO [train.py:1039] (2/4) Epoch 6, batch 1150, loss[loss=0.2639, simple_loss=0.3149, pruned_loss=0.1064, over 23440.00 frames. ], tot_loss[loss=0.239, simple_loss=0.2999, pruned_loss=0.08903, over 4706614.11 frames. ], batch size: 285, lr: 1.74e-02, grad_scale: 16.0 2023-09-29 00:03:06,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:03:10,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:03:11,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:03:11,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:03:11,813 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 00:03:11,983 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=184733.33333333334, ans=0.0 2023-09-29 00:03:13,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:03:14,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 00:03:16,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:03:16,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 00:03:21,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 00:03:24,600 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:03:29,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:03:30,533 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.18 vs. limit=22.5 2023-09-29 00:03:31,931 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:03:32,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 00:03:32,036 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:03:32,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:03:35,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 00:03:37,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:03:38,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:03:48,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:03:56,207 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:03:56,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 00:03:56,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:03:56,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:04:00,838 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.699e+02 2.083e+02 2.291e+02 2.736e+02 4.000e+02, threshold=4.583e+02, percent-clipped=0.0 2023-09-29 00:04:02,584 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 00:04:05,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:04:15,219 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 00:04:15,615 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 00:04:17,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer_na.min_abs, batch_count=185000.0, ans=0.02 2023-09-29 00:04:19,031 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:04:21,912 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:04:21,957 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 00:04:23,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:04:26,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:04:27,141 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.30 vs. limit=15.0 2023-09-29 00:04:27,752 INFO [train.py:1039] (2/4) Epoch 6, batch 1200, loss[loss=0.2175, simple_loss=0.2834, pruned_loss=0.07577, over 24631.00 frames. ], tot_loss[loss=0.2389, simple_loss=0.3003, pruned_loss=0.08875, over 4706351.71 frames. ], batch size: 60, lr: 1.74e-02, grad_scale: 32.0 2023-09-29 00:04:32,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:04:32,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 00:04:33,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:04:33,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:04:33,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:04:35,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:04:36,859 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 00:04:40,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:04:40,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:04:41,043 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.84 vs. limit=12.0 2023-09-29 00:04:43,772 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 00:04:47,250 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 00:04:51,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:04:54,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:04:58,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:04:59,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:04:59,557 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 00:05:01,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:05:01,371 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=185200.0, ans=0.05 2023-09-29 00:05:07,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 00:05:07,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:05:07,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 00:05:07,289 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:05:11,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 00:05:16,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 00:05:16,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:05:18,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:05:20,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:05:21,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 00:05:22,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:05:22,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 00:05:24,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:05:24,135 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 00:05:24,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 00:05:25,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:05:25,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 00:05:29,124 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:05:29,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:05:33,797 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 00:05:35,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:05:37,105 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=185333.33333333334, ans=0.0 2023-09-29 00:05:37,902 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.65 vs. limit=12.0 2023-09-29 00:05:39,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 00:05:42,984 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 00:05:44,477 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:05:45,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:05:47,248 INFO [train.py:1039] (2/4) Epoch 6, batch 1250, loss[loss=0.2894, simple_loss=0.3301, pruned_loss=0.1244, over 22728.00 frames. ], tot_loss[loss=0.2401, simple_loss=0.3018, pruned_loss=0.08924, over 4716139.22 frames. ], batch size: 322, lr: 1.74e-02, grad_scale: 32.0 2023-09-29 00:05:47,535 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=185400.0, ans=0.05 2023-09-29 00:05:48,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:05:51,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:05:54,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 00:05:57,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:05:58,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:05:59,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 00:06:00,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:06:02,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 00:06:02,429 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=185466.66666666666, ans=0.125 2023-09-29 00:06:03,125 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.74 vs. limit=22.5 2023-09-29 00:06:05,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 00:06:08,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:06:08,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:06:08,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:06:11,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 00:06:15,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 00:06:15,707 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 00:06:15,714 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:06:17,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:06:17,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:06:20,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:06:22,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 00:06:29,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 00:06:30,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:06:34,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:06:35,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 00:06:35,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:06:36,619 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 00:06:36,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:06:36,658 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:06:38,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:06:42,094 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.677e+02 2.172e+02 2.410e+02 2.804e+02 3.996e+02, threshold=4.819e+02, percent-clipped=0.0 2023-09-29 00:06:42,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:06:42,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:06:43,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 00:06:43,931 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 00:06:43,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 00:06:48,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:06:49,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 00:06:49,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:06:52,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 00:06:52,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:06:55,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 00:06:56,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 00:06:56,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:06:56,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 00:06:57,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:06:57,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 00:07:02,985 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:07:03,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:07:03,438 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=185666.66666666666, ans=0.125 2023-09-29 00:07:04,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 00:07:06,372 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 00:07:07,753 INFO [train.py:1039] (2/4) Epoch 6, batch 1300, loss[loss=0.2677, simple_loss=0.3212, pruned_loss=0.1071, over 23275.00 frames. ], tot_loss[loss=0.2399, simple_loss=0.3016, pruned_loss=0.08906, over 4718628.03 frames. ], batch size: 105, lr: 1.74e-02, grad_scale: 32.0 2023-09-29 00:07:11,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:07:11,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 00:07:14,352 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:07:14,538 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=185733.33333333334, ans=0.125 2023-09-29 00:07:15,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 00:07:17,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:07:17,712 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=185733.33333333334, ans=0.2 2023-09-29 00:07:18,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:07:20,416 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:07:21,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 00:07:25,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:07:27,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:07:29,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 00:07:33,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 00:07:37,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:07:37,302 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:07:38,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:07:38,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:07:40,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 00:07:40,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 00:07:41,264 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=185866.66666666666, ans=0.125 2023-09-29 00:07:42,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 00:07:45,910 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.71 vs. limit=12.0 2023-09-29 00:07:48,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:07:48,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 00:07:49,809 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 00:07:49,918 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 00:07:52,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:07:55,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:07:57,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 00:07:58,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:07:58,727 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 00:07:58,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:08:01,353 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=185933.33333333334, ans=0.2 2023-09-29 00:08:02,630 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:08:02,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:08:04,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 00:08:06,373 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 00:08:07,919 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 00:08:08,131 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=185933.33333333334, ans=0.125 2023-09-29 00:08:11,504 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:08:14,460 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 00:08:16,135 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=186000.0, ans=0.2 2023-09-29 00:08:17,261 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:08:24,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 00:08:27,632 INFO [train.py:1039] (2/4) Epoch 6, batch 1350, loss[loss=0.2236, simple_loss=0.292, pruned_loss=0.0776, over 24666.00 frames. ], tot_loss[loss=0.2385, simple_loss=0.2998, pruned_loss=0.08857, over 4718084.84 frames. ], batch size: 65, lr: 1.74e-02, grad_scale: 32.0 2023-09-29 00:08:27,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:08:29,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:08:32,027 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:08:33,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:08:36,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:08:36,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 00:08:39,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 00:08:40,043 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=186066.66666666666, ans=0.125 2023-09-29 00:08:41,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 00:08:43,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 00:08:44,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:08:47,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 00:08:49,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:08:50,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:08:50,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 00:08:50,864 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=186133.33333333334, ans=0.2 2023-09-29 00:08:53,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 00:08:55,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 00:08:58,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:08:58,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 00:09:08,694 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=186200.0, ans=0.125 2023-09-29 00:09:11,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:09:20,175 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=186266.66666666666, ans=0.0 2023-09-29 00:09:21,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:09:21,570 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=186266.66666666666, ans=0.1 2023-09-29 00:09:22,613 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.690e+02 2.119e+02 2.458e+02 2.800e+02 4.358e+02, threshold=4.916e+02, percent-clipped=0.0 2023-09-29 00:09:22,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:09:22,788 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 00:09:25,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:09:27,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 00:09:27,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 00:09:28,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:09:31,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:09:33,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 00:09:34,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:09:39,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 00:09:43,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 00:09:47,727 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=186400.0, ans=0.125 2023-09-29 00:09:48,755 INFO [train.py:1039] (2/4) Epoch 6, batch 1400, loss[loss=0.2141, simple_loss=0.2822, pruned_loss=0.073, over 24573.00 frames. ], tot_loss[loss=0.2377, simple_loss=0.2989, pruned_loss=0.08821, over 4723775.77 frames. ], batch size: 60, lr: 1.73e-02, grad_scale: 32.0 2023-09-29 00:09:48,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 00:09:50,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:09:53,957 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:09:55,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:09:58,526 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 00:10:00,183 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 00:10:10,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:10:12,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:10:12,948 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=186466.66666666666, ans=0.125 2023-09-29 00:10:16,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:10:17,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 00:10:21,004 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:10:22,434 WARNING [train.py:1197] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 00:10:30,814 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:10:30,917 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:10:34,153 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=186533.33333333334, ans=0.125 2023-09-29 00:10:35,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 00:10:36,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:10:36,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:10:39,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:10:39,873 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:10:41,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:10:42,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:10:42,864 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:10:44,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 00:10:45,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:10:48,180 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=186600.0, ans=0.125 2023-09-29 00:10:49,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:10:56,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:11:03,655 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 00:11:05,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 00:11:06,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:11:09,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 00:11:09,741 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=186666.66666666666, ans=0.125 2023-09-29 00:11:10,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:11:11,183 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:11:12,498 INFO [train.py:1039] (2/4) Epoch 6, batch 1450, loss[loss=0.2304, simple_loss=0.307, pruned_loss=0.07694, over 24507.00 frames. ], tot_loss[loss=0.2365, simple_loss=0.2981, pruned_loss=0.08744, over 4722131.16 frames. ], batch size: 66, lr: 1.73e-02, grad_scale: 32.0 2023-09-29 00:11:15,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:11:15,926 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:11:17,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:11:17,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 00:11:22,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:11:23,807 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 00:11:25,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:11:25,350 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 00:11:27,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 00:11:27,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 00:11:29,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:11:30,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:11:30,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 00:11:32,529 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:11:32,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 00:11:34,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 00:11:34,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:11:36,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:11:37,626 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:11:39,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:11:43,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:11:43,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:11:45,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:11:46,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:11:48,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:11:48,125 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:11:48,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:11:49,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:11:51,368 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=186866.66666666666, ans=0.125 2023-09-29 00:11:51,638 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.12 vs. limit=6.0 2023-09-29 00:11:53,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 00:11:56,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:11:59,301 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=186933.33333333334, ans=0.0 2023-09-29 00:12:00,700 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 00:12:02,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:12:04,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 00:12:06,442 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:12:08,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 00:12:09,774 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.836e+02 2.135e+02 2.497e+02 3.099e+02 5.077e+02, threshold=4.994e+02, percent-clipped=1.0 2023-09-29 00:12:12,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:12:14,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 00:12:14,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 00:12:15,973 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:12:17,687 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=187000.0, ans=0.05 2023-09-29 00:12:18,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:12:20,365 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:12:20,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 00:12:23,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 00:12:23,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 00:12:25,138 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:12:26,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 00:12:32,991 INFO [train.py:1039] (2/4) Epoch 6, batch 1500, loss[loss=0.2358, simple_loss=0.3048, pruned_loss=0.08345, over 24454.00 frames. ], tot_loss[loss=0.2371, simple_loss=0.2988, pruned_loss=0.08766, over 4731099.72 frames. ], batch size: 69, lr: 1.73e-02, grad_scale: 16.0 2023-09-29 00:12:33,372 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=187066.66666666666, ans=0.125 2023-09-29 00:12:38,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 00:12:38,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:12:38,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:12:40,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:12:41,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:12:42,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:12:43,580 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 00:12:45,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:12:45,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 00:12:45,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:12:46,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:12:48,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:12:49,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:12:52,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:12:54,172 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 00:12:54,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:12:55,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:12:55,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:12:58,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 00:12:58,962 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=187133.33333333334, ans=0.0 2023-09-29 00:13:03,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 00:13:06,797 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:13:06,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 00:13:11,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 00:13:13,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:13:14,758 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:13:14,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:13:16,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 00:13:16,405 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:13:16,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:13:17,823 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 00:13:17,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:13:19,497 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=187200.0, ans=0.0 2023-09-29 00:13:25,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:13:25,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 00:13:28,328 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 00:13:31,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 00:13:35,665 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 00:13:35,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:13:37,105 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 00:13:37,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:13:38,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:13:40,796 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 00:13:42,743 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:13:46,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 00:13:47,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:13:52,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:13:52,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:13:52,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:13:53,964 INFO [train.py:1039] (2/4) Epoch 6, batch 1550, loss[loss=0.249, simple_loss=0.3005, pruned_loss=0.09876, over 23322.00 frames. ], tot_loss[loss=0.2382, simple_loss=0.2997, pruned_loss=0.08831, over 4717764.50 frames. ], batch size: 105, lr: 1.73e-02, grad_scale: 16.0 2023-09-29 00:13:54,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:13:54,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:13:55,666 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 00:13:55,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 00:13:55,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:13:57,225 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 00:13:57,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 00:14:00,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:14:01,674 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:14:03,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:14:03,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:14:03,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:14:04,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:14:07,756 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 00:14:07,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:14:07,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:14:09,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 00:14:10,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 00:14:11,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 00:14:12,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:14:12,799 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=187466.66666666666, ans=0.0 2023-09-29 00:14:14,008 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 00:14:14,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 00:14:14,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 00:14:16,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:14:18,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:14:21,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:14:22,011 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=187466.66666666666, ans=0.1 2023-09-29 00:14:24,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 00:14:24,418 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 00:14:26,323 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=187533.33333333334, ans=0.125 2023-09-29 00:14:33,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:14:36,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:14:36,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 00:14:36,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:14:36,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 00:14:41,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 00:14:42,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:14:45,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:14:49,281 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.701e+02 2.090e+02 2.378e+02 2.713e+02 3.704e+02, threshold=4.756e+02, percent-clipped=0.0 2023-09-29 00:14:49,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:14:49,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:14:49,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 00:14:49,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:14:51,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:14:51,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:14:53,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 00:14:53,744 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 00:14:56,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:15:01,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 00:15:07,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:15:07,794 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=187666.66666666666, ans=0.0 2023-09-29 00:15:08,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:15:09,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 00:15:10,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:15:12,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:15:12,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:15:12,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:15:13,493 INFO [train.py:1039] (2/4) Epoch 6, batch 1600, loss[loss=0.2442, simple_loss=0.3029, pruned_loss=0.09276, over 23646.00 frames. ], tot_loss[loss=0.2398, simple_loss=0.3008, pruned_loss=0.08938, over 4712774.73 frames. ], batch size: 106, lr: 1.73e-02, grad_scale: 32.0 2023-09-29 00:15:13,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:15:16,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:15:17,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 00:15:18,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 00:15:21,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 00:15:25,394 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:15:25,775 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=187733.33333333334, ans=0.125 2023-09-29 00:15:26,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 00:15:28,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:15:30,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:15:36,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:15:40,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 00:15:42,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:15:43,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 00:15:45,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:15:45,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 00:15:50,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 00:15:58,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:15:59,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 00:15:59,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:16:01,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:16:01,465 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:16:01,661 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=187933.33333333334, ans=0.125 2023-09-29 00:16:02,254 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.68 vs. limit=10.0 2023-09-29 00:16:05,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 00:16:06,147 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=187933.33333333334, ans=0.07 2023-09-29 00:16:09,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 00:16:10,974 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:16:11,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:16:12,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:16:12,573 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:16:14,156 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:16:14,337 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer_ff3.min_abs, batch_count=187933.33333333334, ans=0.2 2023-09-29 00:16:17,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:16:18,824 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:16:20,786 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=188000.0, ans=0.0 2023-09-29 00:16:23,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:16:25,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:16:27,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 00:16:27,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:16:28,118 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 00:16:33,096 INFO [train.py:1039] (2/4) Epoch 6, batch 1650, loss[loss=0.2329, simple_loss=0.2818, pruned_loss=0.09197, over 23684.00 frames. ], tot_loss[loss=0.2417, simple_loss=0.3025, pruned_loss=0.09041, over 4719073.05 frames. ], batch size: 232, lr: 1.73e-02, grad_scale: 32.0 2023-09-29 00:16:36,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:16:37,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:16:37,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:16:39,194 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 00:16:39,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 00:16:39,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 00:16:39,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 00:16:42,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:16:43,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:16:44,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:16:44,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 00:16:47,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:16:48,985 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 00:16:51,899 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:16:51,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:16:51,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:16:51,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:16:53,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 00:16:53,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 00:16:58,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 00:16:59,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 00:17:08,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 00:17:10,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:17:12,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 00:17:16,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:17:19,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:17:19,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:17:20,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:17:21,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:17:21,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:17:25,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:17:25,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:17:27,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:17:27,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:17:28,445 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 2.178e+02 2.493e+02 2.802e+02 6.343e+02, threshold=4.987e+02, percent-clipped=2.0 2023-09-29 00:17:28,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:17:28,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:17:31,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:17:33,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 00:17:34,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:17:35,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 00:17:37,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 00:17:37,052 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 00:17:39,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:17:39,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:17:40,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:17:40,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:17:40,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 00:17:45,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:17:46,853 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:17:46,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:17:50,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 00:17:52,686 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=14.39 vs. limit=22.5 2023-09-29 00:17:53,482 INFO [train.py:1039] (2/4) Epoch 6, batch 1700, loss[loss=0.2299, simple_loss=0.308, pruned_loss=0.07586, over 24676.00 frames. ], tot_loss[loss=0.2408, simple_loss=0.3017, pruned_loss=0.08997, over 4718692.63 frames. ], batch size: 73, lr: 1.73e-02, grad_scale: 32.0 2023-09-29 00:17:55,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:17:55,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:17:55,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 00:17:55,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:17:56,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:17:56,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:17:58,631 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=188400.0, ans=0.125 2023-09-29 00:17:59,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:17:59,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:17:59,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 00:18:02,760 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 00:18:09,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:18:13,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:18:19,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 00:18:21,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:18:21,229 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:18:21,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:18:24,249 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 00:18:24,500 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:18:25,279 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=16.10 vs. limit=15.0 2023-09-29 00:18:25,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:18:26,277 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=188533.33333333334, ans=0.2 2023-09-29 00:18:27,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:18:27,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 00:18:29,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 00:18:30,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 00:18:32,234 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:18:32,506 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=188533.33333333334, ans=0.2 2023-09-29 00:18:33,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 00:18:35,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:18:44,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:18:46,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:18:47,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:18:49,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 00:18:49,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 00:18:49,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:18:51,056 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:18:51,057 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 00:18:53,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:18:53,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:18:53,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:18:53,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:18:56,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:18:56,287 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:18:57,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:18:57,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:18:57,994 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:18:59,755 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=188666.66666666666, ans=0.125 2023-09-29 00:19:02,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:19:05,515 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 00:19:05,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:19:05,929 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=188666.66666666666, ans=0.0 2023-09-29 00:19:07,224 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:19:08,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 00:19:16,067 INFO [train.py:1039] (2/4) Epoch 6, batch 1750, loss[loss=0.224, simple_loss=0.2781, pruned_loss=0.08497, over 23597.00 frames. ], tot_loss[loss=0.2388, simple_loss=0.2991, pruned_loss=0.08926, over 4713688.61 frames. ], batch size: 256, lr: 1.72e-02, grad_scale: 32.0 2023-09-29 00:19:17,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:19:21,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:19:21,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 00:19:22,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 00:19:22,577 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:19:26,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:19:26,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:19:28,361 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=188733.33333333334, ans=0.1 2023-09-29 00:19:29,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 00:19:32,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:19:35,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 00:19:35,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:19:37,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:19:38,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 00:19:39,089 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=188800.0, ans=0.1 2023-09-29 00:19:40,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 00:19:43,234 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:19:43,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 00:19:50,303 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=188866.66666666666, ans=0.0 2023-09-29 00:19:53,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:19:55,454 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=188866.66666666666, ans=0.125 2023-09-29 00:19:57,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:19:57,235 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:20:00,768 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:20:00,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:20:01,055 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=188866.66666666666, ans=0.125 2023-09-29 00:20:02,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=188866.66666666666, ans=0.1 2023-09-29 00:20:03,832 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:20:05,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:20:07,182 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:20:07,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:20:07,529 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=188933.33333333334, ans=0.07 2023-09-29 00:20:08,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 00:20:10,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:20:12,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 00:20:13,402 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.714e+02 2.157e+02 2.511e+02 2.908e+02 4.872e+02, threshold=5.023e+02, percent-clipped=0.0 2023-09-29 00:20:13,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:20:13,815 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=188933.33333333334, ans=0.0 2023-09-29 00:20:14,342 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.01 vs. limit=15.0 2023-09-29 00:20:16,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:20:18,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:20:21,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 00:20:23,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 00:20:23,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:20:23,647 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=189000.0, ans=0.125 2023-09-29 00:20:25,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:20:30,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:20:34,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:20:35,689 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:20:35,866 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=189000.0, ans=0.05 2023-09-29 00:20:37,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 00:20:37,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:20:38,617 INFO [train.py:1039] (2/4) Epoch 6, batch 1800, loss[loss=0.253, simple_loss=0.3005, pruned_loss=0.1027, over 23891.00 frames. ], tot_loss[loss=0.2367, simple_loss=0.2977, pruned_loss=0.08782, over 4724481.18 frames. ], batch size: 212, lr: 1.72e-02, grad_scale: 16.0 2023-09-29 00:20:38,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 00:20:38,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:20:38,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 00:20:38,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:20:40,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 00:20:41,130 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.55 vs. limit=6.0 2023-09-29 00:20:42,065 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:20:43,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:20:45,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 00:20:48,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:20:51,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 00:20:52,601 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:20:54,357 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 00:20:55,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:20:57,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:20:59,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:20:59,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:21:03,436 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:21:03,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 00:21:03,726 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=189133.33333333334, ans=0.1 2023-09-29 00:21:04,357 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=16.06 vs. limit=22.5 2023-09-29 00:21:04,907 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:21:08,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:21:12,876 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 00:21:15,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 00:21:15,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 00:21:15,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:21:17,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:21:17,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:21:19,056 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:21:23,916 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 00:21:25,537 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:21:28,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:21:29,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 00:21:31,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 00:21:32,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:21:33,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:21:35,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 00:21:39,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 00:21:45,739 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=189333.33333333334, ans=10.0 2023-09-29 00:21:48,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:21:48,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 00:21:48,516 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:21:48,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:21:49,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:21:50,069 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 00:21:53,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:21:53,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:21:53,409 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=189333.33333333334, ans=0.125 2023-09-29 00:21:56,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 00:21:56,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:21:59,216 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:21:59,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 00:21:59,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:21:59,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:22:00,771 INFO [train.py:1039] (2/4) Epoch 6, batch 1850, loss[loss=0.2427, simple_loss=0.3086, pruned_loss=0.0884, over 23951.00 frames. ], tot_loss[loss=0.2364, simple_loss=0.2981, pruned_loss=0.08733, over 4725992.72 frames. ], batch size: 86, lr: 1.72e-02, grad_scale: 16.0 2023-09-29 00:22:00,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:22:02,480 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:22:02,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:22:06,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:22:06,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:22:10,465 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.06 vs. limit=6.0 2023-09-29 00:22:14,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:22:16,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 00:22:18,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 00:22:22,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 00:22:25,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:22:27,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 00:22:27,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 00:22:28,122 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.31 vs. limit=22.5 2023-09-29 00:22:36,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:22:40,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 00:22:41,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:22:43,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:22:48,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 00:22:49,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:22:49,833 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 00:22:49,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:22:53,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:22:56,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:22:59,536 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 2.153e+02 2.382e+02 2.790e+02 3.964e+02, threshold=4.764e+02, percent-clipped=0.0 2023-09-29 00:22:59,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 00:22:59,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:22:59,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 00:23:01,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:23:02,738 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:23:02,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:23:06,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 00:23:07,472 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:23:11,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:23:11,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:23:11,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 00:23:11,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 00:23:14,261 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 00:23:14,398 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 00:23:17,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 00:23:17,322 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:23:17,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:23:17,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:23:19,332 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 00:23:19,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:23:19,409 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:23:19,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 00:23:21,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 00:23:22,514 INFO [train.py:1039] (2/4) Epoch 6, batch 1900, loss[loss=0.247, simple_loss=0.3032, pruned_loss=0.09541, over 23324.00 frames. ], tot_loss[loss=0.2375, simple_loss=0.2991, pruned_loss=0.08794, over 4716599.00 frames. ], batch size: 119, lr: 1.72e-02, grad_scale: 16.0 2023-09-29 00:23:22,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:23:22,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 00:23:26,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:23:26,221 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 00:23:26,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:23:28,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:23:31,606 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=189733.33333333334, ans=0.0 2023-09-29 00:23:32,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:23:35,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:23:37,390 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 00:23:37,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 00:23:39,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:23:40,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:23:40,642 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 00:23:40,693 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 00:23:45,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 00:23:47,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:23:50,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 00:23:53,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 00:24:04,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 00:24:07,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 00:24:07,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:24:07,240 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 00:24:07,246 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 00:24:07,288 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 00:24:07,634 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=189866.66666666666, ans=0.125 2023-09-29 00:24:08,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 00:24:08,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:24:13,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 00:24:17,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:24:18,175 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=189933.33333333334, ans=0.125 2023-09-29 00:24:19,851 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=189933.33333333334, ans=0.0 2023-09-29 00:24:20,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:24:20,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 00:24:24,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:24:27,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 00:24:27,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 00:24:33,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:24:33,927 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:24:33,947 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:24:34,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:24:36,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 00:24:37,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 00:24:37,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:24:39,253 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:24:39,255 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:24:42,238 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:24:42,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:24:42,308 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 00:24:45,067 INFO [train.py:1039] (2/4) Epoch 6, batch 1950, loss[loss=0.2346, simple_loss=0.3065, pruned_loss=0.08132, over 24667.00 frames. ], tot_loss[loss=0.2386, simple_loss=0.3005, pruned_loss=0.08837, over 4726295.67 frames. ], batch size: 73, lr: 1.72e-02, grad_scale: 8.0 2023-09-29 00:24:45,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:24:45,356 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=190066.66666666666, ans=0.125 2023-09-29 00:24:49,527 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:24:51,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:24:51,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:24:51,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 00:24:52,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 00:24:54,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 00:24:54,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:24:56,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:24:58,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:24:59,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:24:59,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:25:01,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:25:06,730 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:25:06,938 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=190133.33333333334, ans=0.0 2023-09-29 00:25:08,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 00:25:08,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:25:08,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:25:11,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:25:14,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:25:14,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:25:14,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 00:25:14,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 00:25:14,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 00:25:14,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:25:16,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:25:20,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:25:22,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:25:26,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 00:25:30,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:25:30,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 00:25:32,224 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 00:25:32,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:25:36,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:25:40,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:25:40,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 00:25:46,177 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.758e+02 2.214e+02 2.605e+02 2.904e+02 4.592e+02, threshold=5.209e+02, percent-clipped=0.0 2023-09-29 00:25:49,577 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:25:51,134 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:25:54,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:25:56,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:25:58,820 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=190333.33333333334, ans=0.125 2023-09-29 00:25:59,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:26:00,046 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:26:01,429 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 00:26:01,436 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 00:26:01,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:26:03,064 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 00:26:04,035 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=190333.33333333334, ans=0.2 2023-09-29 00:26:06,580 INFO [train.py:1039] (2/4) Epoch 6, batch 2000, loss[loss=0.2089, simple_loss=0.2803, pruned_loss=0.06879, over 24293.00 frames. ], tot_loss[loss=0.2407, simple_loss=0.302, pruned_loss=0.08965, over 4717238.75 frames. ], batch size: 61, lr: 1.72e-02, grad_scale: 16.0 2023-09-29 00:26:06,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:26:09,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 00:26:09,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:26:10,250 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=190400.0, ans=0.125 2023-09-29 00:26:11,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:26:13,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:26:13,890 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=190400.0, ans=0.05 2023-09-29 00:26:15,079 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:26:15,346 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=190400.0, ans=0.2 2023-09-29 00:26:18,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 00:26:18,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 00:26:23,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:26:25,406 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 00:26:26,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 00:26:26,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:26:29,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:26:30,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 00:26:32,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:26:35,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:26:35,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:26:36,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 00:26:36,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 00:26:38,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 00:26:38,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:26:41,283 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:26:41,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 00:26:41,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:26:42,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:26:44,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:26:44,596 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=190533.33333333334, ans=0.1 2023-09-29 00:26:45,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 00:26:49,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 00:26:49,281 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:26:50,664 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:26:55,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:26:56,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:26:57,961 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:26:58,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:26:58,375 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=190600.0, ans=0.2 2023-09-29 00:26:59,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:27:00,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:27:00,989 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:27:01,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:27:03,813 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:27:06,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:27:06,928 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=190600.0, ans=0.125 2023-09-29 00:27:08,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 00:27:15,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 00:27:15,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:27:18,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:27:18,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:27:21,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:27:23,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:27:23,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:27:24,053 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=190666.66666666666, ans=0.2 2023-09-29 00:27:25,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 00:27:25,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:27:29,355 INFO [train.py:1039] (2/4) Epoch 6, batch 2050, loss[loss=0.2161, simple_loss=0.2623, pruned_loss=0.08491, over 23533.00 frames. ], tot_loss[loss=0.2401, simple_loss=0.3009, pruned_loss=0.08963, over 4711359.87 frames. ], batch size: 256, lr: 1.72e-02, grad_scale: 16.0 2023-09-29 00:27:29,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:27:30,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:27:32,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:27:32,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:27:38,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:27:40,219 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:27:40,307 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:27:42,027 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:27:44,347 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=190800.0, ans=0.125 2023-09-29 00:27:45,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 00:27:45,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:27:47,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:27:47,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:27:51,947 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=190800.0, ans=0.1 2023-09-29 00:27:56,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:27:56,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:28:00,232 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 00:28:03,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:28:05,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 00:28:05,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:28:07,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:28:10,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:28:11,575 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 00:28:12,961 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:28:14,552 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:28:16,052 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:28:16,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:28:19,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:28:21,243 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:28:24,208 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 00:28:25,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:28:26,003 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=190933.33333333334, ans=0.125 2023-09-29 00:28:29,264 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.749e+02 2.164e+02 2.484e+02 2.839e+02 4.579e+02, threshold=4.968e+02, percent-clipped=0.0 2023-09-29 00:28:29,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:28:35,263 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:28:35,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 00:28:41,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:28:42,755 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:28:44,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:28:47,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 00:28:50,061 INFO [train.py:1039] (2/4) Epoch 6, batch 2100, loss[loss=0.2456, simple_loss=0.2965, pruned_loss=0.09738, over 23583.00 frames. ], tot_loss[loss=0.2387, simple_loss=0.2999, pruned_loss=0.08874, over 4727022.46 frames. ], batch size: 256, lr: 1.71e-02, grad_scale: 16.0 2023-09-29 00:28:50,299 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 00:28:50,300 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:28:50,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:28:51,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 00:28:53,930 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:28:53,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 00:28:54,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 00:28:56,883 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:28:59,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:28:59,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:29:03,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:29:05,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:29:05,279 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 00:29:05,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:29:06,035 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=6.93 vs. limit=12.0 2023-09-29 00:29:07,354 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 00:29:07,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 00:29:08,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:29:09,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:29:09,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 00:29:10,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 00:29:15,297 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 00:29:15,299 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 00:29:19,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:29:21,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:29:22,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:29:24,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 00:29:24,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:29:24,484 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 00:29:26,685 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.14 vs. limit=15.0 2023-09-29 00:29:28,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 00:29:29,447 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:29:29,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 00:29:29,537 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 00:29:29,598 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 00:29:32,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 00:29:34,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:29:34,479 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=191200.0, ans=0.125 2023-09-29 00:29:37,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 00:29:37,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 00:29:40,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:29:42,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:29:42,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 00:29:42,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:29:42,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:29:43,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:29:43,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 00:29:45,429 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 00:29:45,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 00:29:46,378 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.01 vs. limit=15.0 2023-09-29 00:29:48,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:29:52,882 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:29:52,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 00:29:59,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:30:01,764 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.72 vs. limit=10.0 2023-09-29 00:30:02,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:30:02,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:30:02,577 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:30:02,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 00:30:04,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 00:30:06,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:30:06,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:30:07,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:30:07,119 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:30:09,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 00:30:10,814 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 00:30:10,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:30:12,259 INFO [train.py:1039] (2/4) Epoch 6, batch 2150, loss[loss=0.2098, simple_loss=0.2786, pruned_loss=0.07045, over 24432.00 frames. ], tot_loss[loss=0.2366, simple_loss=0.2981, pruned_loss=0.08758, over 4721536.44 frames. ], batch size: 58, lr: 1.71e-02, grad_scale: 16.0 2023-09-29 00:30:14,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:30:14,619 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:30:14,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:30:14,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:30:21,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 00:30:24,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:30:24,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:30:24,560 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=191400.0, ans=0.125 2023-09-29 00:30:25,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:30:25,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:30:25,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:30:29,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:30:30,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:30:30,504 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:30:34,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:30:34,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 00:30:34,350 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=191466.66666666666, ans=0.125 2023-09-29 00:30:39,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:30:39,690 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=191466.66666666666, ans=0.1 2023-09-29 00:30:40,953 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:30:41,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:30:42,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:30:42,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:30:42,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:30:44,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:30:44,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:30:45,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:30:47,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 00:30:47,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:30:47,929 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=191533.33333333334, ans=0.125 2023-09-29 00:30:49,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:30:49,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:30:51,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 00:30:52,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:30:54,536 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:30:55,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 00:30:57,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:30:57,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 00:30:57,464 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 00:30:57,776 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=191533.33333333334, ans=0.125 2023-09-29 00:31:00,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:31:00,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:31:02,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:31:02,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 00:31:03,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:31:05,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:31:05,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 00:31:06,028 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=191600.0, ans=0.1 2023-09-29 00:31:07,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 00:31:07,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:31:08,631 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 00:31:08,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:31:10,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:31:12,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 00:31:12,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:31:12,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 00:31:12,149 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 00:31:12,149 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 00:31:13,511 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 2.175e+02 2.371e+02 2.778e+02 4.132e+02, threshold=4.742e+02, percent-clipped=0.0 2023-09-29 00:31:13,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 00:31:15,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:31:15,296 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:31:16,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:31:16,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:31:18,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 00:31:19,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:31:19,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:31:30,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:31:31,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 00:31:34,636 INFO [train.py:1039] (2/4) Epoch 6, batch 2200, loss[loss=0.2207, simple_loss=0.2982, pruned_loss=0.07164, over 24468.00 frames. ], tot_loss[loss=0.2366, simple_loss=0.2984, pruned_loss=0.08743, over 4727781.87 frames. ], batch size: 66, lr: 1.71e-02, grad_scale: 16.0 2023-09-29 00:31:34,775 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:31:35,600 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.58 vs. limit=22.5 2023-09-29 00:31:39,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:31:40,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:31:40,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:31:44,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:31:46,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:31:47,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:31:47,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 00:31:54,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 00:31:55,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 00:32:01,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 00:32:05,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:32:07,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:32:07,198 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:32:11,794 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:32:11,835 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 00:32:16,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:32:16,657 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:32:18,097 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 00:32:21,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 00:32:23,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:32:24,204 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=13.36 vs. limit=15.0 2023-09-29 00:32:25,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:32:26,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:32:28,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 00:32:29,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:32:30,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 00:32:30,319 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=191933.33333333334, ans=0.035 2023-09-29 00:32:32,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:32:32,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 00:32:32,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:32:36,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:32:37,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:32:37,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:32:37,729 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:32:39,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 00:32:40,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:32:42,413 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 00:32:45,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 00:32:45,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:32:47,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 00:32:48,730 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 00:32:49,085 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=192000.0, ans=0.125 2023-09-29 00:32:50,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:32:51,705 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 00:32:51,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 00:32:52,560 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 00:32:53,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:32:55,464 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 00:32:56,910 INFO [train.py:1039] (2/4) Epoch 6, batch 2250, loss[loss=0.2328, simple_loss=0.2925, pruned_loss=0.08655, over 23461.00 frames. ], tot_loss[loss=0.2366, simple_loss=0.2987, pruned_loss=0.08724, over 4732044.08 frames. ], batch size: 120, lr: 1.71e-02, grad_scale: 16.0 2023-09-29 00:32:57,332 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=192066.66666666666, ans=0.125 2023-09-29 00:32:59,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:32:59,185 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 00:33:00,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:33:04,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:33:09,854 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=192066.66666666666, ans=0.0 2023-09-29 00:33:11,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:33:11,303 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:33:11,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=192066.66666666666, ans=0.125 2023-09-29 00:33:14,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:33:15,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:33:17,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:33:20,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 00:33:20,415 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:33:20,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:33:23,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 00:33:23,417 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:33:24,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:33:26,338 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:33:31,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:33:33,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 00:33:33,677 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 00:33:35,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 00:33:36,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:33:40,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:33:44,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:33:45,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:33:47,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:33:47,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:33:48,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:33:50,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:33:54,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:33:56,395 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 00:33:56,616 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=192266.66666666666, ans=0.125 2023-09-29 00:33:57,684 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 2.089e+02 2.370e+02 2.766e+02 4.098e+02, threshold=4.740e+02, percent-clipped=0.0 2023-09-29 00:34:00,144 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.40 vs. limit=12.0 2023-09-29 00:34:02,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 00:34:04,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 00:34:04,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:34:06,160 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=192333.33333333334, ans=0.125 2023-09-29 00:34:09,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 00:34:13,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 00:34:13,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 00:34:13,273 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=192333.33333333334, ans=0.05 2023-09-29 00:34:14,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:34:14,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:34:18,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 00:34:19,575 INFO [train.py:1039] (2/4) Epoch 6, batch 2300, loss[loss=0.2458, simple_loss=0.2945, pruned_loss=0.09857, over 23748.00 frames. ], tot_loss[loss=0.2361, simple_loss=0.2986, pruned_loss=0.08683, over 4731991.98 frames. ], batch size: 179, lr: 1.71e-02, grad_scale: 16.0 2023-09-29 00:34:19,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:34:19,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:34:25,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:34:25,766 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:34:27,411 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 00:34:30,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:34:34,369 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.52 vs. limit=6.0 2023-09-29 00:34:38,534 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:34:38,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 00:34:38,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:34:40,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:34:40,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 00:34:41,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:34:46,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:34:46,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:34:50,514 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:34:54,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 00:34:57,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:35:01,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:35:03,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:35:06,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:35:06,592 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=192600.0, ans=0.125 2023-09-29 00:35:07,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:35:11,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:35:11,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 00:35:13,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:35:13,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 00:35:17,050 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 00:35:17,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:35:17,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:35:17,176 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:35:18,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:35:19,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 00:35:19,897 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 00:35:19,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 00:35:22,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:35:22,016 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:35:22,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 00:35:30,219 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:35:31,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:35:37,559 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:35:37,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:35:39,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 00:35:39,712 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.04 vs. limit=15.0 2023-09-29 00:35:40,481 INFO [train.py:1039] (2/4) Epoch 6, batch 2350, loss[loss=0.2347, simple_loss=0.2838, pruned_loss=0.09281, over 23427.00 frames. ], tot_loss[loss=0.2361, simple_loss=0.2987, pruned_loss=0.0868, over 4733047.99 frames. ], batch size: 285, lr: 1.71e-02, grad_scale: 16.0 2023-09-29 00:35:40,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 00:35:40,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:35:42,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 00:35:44,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 00:35:50,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:35:50,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 00:35:57,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 00:35:59,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:36:01,043 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:36:01,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:36:01,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:36:03,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:36:03,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 00:36:07,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:36:12,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 00:36:13,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:36:16,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:36:16,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:36:19,410 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:36:22,296 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 00:36:22,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:36:23,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:36:23,914 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:36:25,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:36:27,793 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=192866.66666666666, ans=0.1 2023-09-29 00:36:29,754 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.07 vs. limit=22.5 2023-09-29 00:36:30,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:36:32,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 00:36:33,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:36:36,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:36:37,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:36:39,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 00:36:39,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 00:36:40,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 00:36:41,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 00:36:42,239 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.670e+02 2.125e+02 2.359e+02 2.805e+02 3.859e+02, threshold=4.718e+02, percent-clipped=0.0 2023-09-29 00:36:46,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 00:36:49,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 00:36:50,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:36:50,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 00:36:50,836 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 00:36:52,190 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 00:36:52,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 00:36:56,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:37:01,875 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:37:02,470 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.59 vs. limit=10.0 2023-09-29 00:37:03,299 INFO [train.py:1039] (2/4) Epoch 6, batch 2400, loss[loss=0.2049, simple_loss=0.2715, pruned_loss=0.06912, over 24310.00 frames. ], tot_loss[loss=0.236, simple_loss=0.2982, pruned_loss=0.08694, over 4725573.05 frames. ], batch size: 56, lr: 1.71e-02, grad_scale: 32.0 2023-09-29 00:37:06,946 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:37:07,196 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=193066.66666666666, ans=0.125 2023-09-29 00:37:08,527 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:37:10,568 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 00:37:10,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 00:37:17,017 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 00:37:17,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:37:20,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 00:37:22,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:37:23,446 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:37:23,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 00:37:23,841 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=193133.33333333334, ans=0.2 2023-09-29 00:37:30,918 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:37:32,608 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 00:37:32,988 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=193133.33333333334, ans=0.125 2023-09-29 00:37:37,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 00:37:40,180 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 00:37:43,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:37:45,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:37:49,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:37:50,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 00:37:50,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:37:59,264 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:38:02,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:38:05,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:38:05,431 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:38:05,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 00:38:05,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:38:05,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:38:06,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:38:07,020 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 00:38:11,843 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.61 vs. limit=10.0 2023-09-29 00:38:12,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:38:12,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 00:38:12,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 00:38:14,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 00:38:18,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:38:18,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:38:18,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 00:38:19,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 00:38:19,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 00:38:19,790 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 00:38:21,358 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 00:38:21,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:38:23,097 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:38:24,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:38:25,943 INFO [train.py:1039] (2/4) Epoch 6, batch 2450, loss[loss=0.2178, simple_loss=0.2859, pruned_loss=0.07482, over 24315.00 frames. ], tot_loss[loss=0.2346, simple_loss=0.2968, pruned_loss=0.08622, over 4718336.69 frames. ], batch size: 61, lr: 1.70e-02, grad_scale: 16.0 2023-09-29 00:38:25,996 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 00:38:26,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:38:28,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 00:38:31,987 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.85 vs. limit=15.0 2023-09-29 00:38:32,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 00:38:32,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:38:35,723 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:38:37,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:38:37,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 00:38:38,107 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=13.41 vs. limit=22.5 2023-09-29 00:38:43,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:38:43,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:38:48,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:38:48,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:38:48,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:38:49,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 00:38:53,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:38:55,122 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 00:38:56,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:38:59,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 00:38:59,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:39:01,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:39:03,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:39:03,415 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=193533.33333333334, ans=0.125 2023-09-29 00:39:04,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 00:39:04,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:39:14,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:39:15,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:39:16,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:39:16,657 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:39:18,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:39:18,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:39:19,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 00:39:23,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:39:23,247 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:39:26,707 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.695e+02 2.234e+02 2.563e+02 3.066e+02 5.570e+02, threshold=5.125e+02, percent-clipped=5.0 2023-09-29 00:39:26,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:39:26,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:39:31,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 00:39:32,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 00:39:34,432 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:39:34,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:39:34,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 00:39:34,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:39:36,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:39:37,271 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=193666.66666666666, ans=0.0 2023-09-29 00:39:39,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:39:40,099 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=193666.66666666666, ans=0.125 2023-09-29 00:39:42,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:39:42,940 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:39:45,689 INFO [train.py:1039] (2/4) Epoch 6, batch 2500, loss[loss=0.2093, simple_loss=0.2767, pruned_loss=0.07089, over 24582.00 frames. ], tot_loss[loss=0.2335, simple_loss=0.296, pruned_loss=0.08557, over 4719081.33 frames. ], batch size: 60, lr: 1.70e-02, grad_scale: 16.0 2023-09-29 00:39:46,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 00:39:47,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:39:54,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:39:55,872 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=193733.33333333334, ans=0.125 2023-09-29 00:39:58,229 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=193733.33333333334, ans=0.125 2023-09-29 00:40:04,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:40:05,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:40:07,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:40:07,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 00:40:12,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 00:40:14,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:40:15,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 00:40:15,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 00:40:15,984 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 00:40:17,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:40:18,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:40:18,964 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 00:40:20,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:40:20,430 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 00:40:21,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:40:25,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:40:26,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:40:30,636 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 00:40:30,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 00:40:32,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:40:33,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:40:37,540 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:40:42,095 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:40:45,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:40:49,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 00:40:52,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 00:40:52,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:40:52,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 00:40:54,015 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=194000.0, ans=0.125 2023-09-29 00:40:55,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:40:55,122 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 00:40:56,671 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 00:40:56,672 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 00:40:56,680 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 00:41:00,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:41:00,644 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=194000.0, ans=0.0 2023-09-29 00:41:03,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 00:41:03,307 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 00:41:04,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:41:04,895 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 00:41:07,741 INFO [train.py:1039] (2/4) Epoch 6, batch 2550, loss[loss=0.232, simple_loss=0.3112, pruned_loss=0.07638, over 24466.00 frames. ], tot_loss[loss=0.2349, simple_loss=0.2972, pruned_loss=0.08635, over 4714598.87 frames. ], batch size: 69, lr: 1.70e-02, grad_scale: 16.0 2023-09-29 00:41:09,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 00:41:10,471 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.54 vs. limit=15.0 2023-09-29 00:41:11,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:41:13,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:41:13,563 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=194066.66666666666, ans=0.2 2023-09-29 00:41:14,694 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:41:16,259 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:41:16,371 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 00:41:18,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 00:41:18,747 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=194066.66666666666, ans=0.05 2023-09-29 00:41:21,525 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 00:41:21,759 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=194066.66666666666, ans=0.0 2023-09-29 00:41:23,015 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:41:25,930 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:41:26,495 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=194133.33333333334, ans=0.125 2023-09-29 00:41:27,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:41:27,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 00:41:27,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 00:41:27,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:41:29,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:41:30,972 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:41:31,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 00:41:32,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 00:41:32,903 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:41:32,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 00:41:35,319 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=194133.33333333334, ans=0.125 2023-09-29 00:41:44,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:41:51,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:41:51,539 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:41:51,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:41:53,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 00:42:00,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:42:02,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 00:42:02,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:42:02,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:42:04,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 00:42:04,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 00:42:09,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:42:09,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:42:10,625 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.774e+02 2.103e+02 2.352e+02 2.955e+02 4.902e+02, threshold=4.704e+02, percent-clipped=0.0 2023-09-29 00:42:14,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:42:14,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 00:42:14,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:42:15,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:42:17,147 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:42:18,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 00:42:18,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:42:19,147 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=194333.33333333334, ans=0.2 2023-09-29 00:42:26,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:42:27,688 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:42:30,613 INFO [train.py:1039] (2/4) Epoch 6, batch 2600, loss[loss=0.2421, simple_loss=0.3116, pruned_loss=0.08627, over 24324.00 frames. ], tot_loss[loss=0.2353, simple_loss=0.2974, pruned_loss=0.08663, over 4712655.20 frames. ], batch size: 77, lr: 1.70e-02, grad_scale: 16.0 2023-09-29 00:42:30,983 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=194400.0, ans=0.125 2023-09-29 00:42:32,239 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 00:42:35,192 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 00:42:35,218 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:42:35,283 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 00:42:35,963 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.77 vs. limit=6.0 2023-09-29 00:42:37,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 00:42:37,411 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 00:42:39,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:42:39,275 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 00:42:41,304 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 00:42:42,818 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 00:42:45,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:42:46,027 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=194466.66666666666, ans=0.125 2023-09-29 00:42:47,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 00:42:47,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 00:42:48,905 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:42:48,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 00:42:52,081 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 00:42:53,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 00:43:00,033 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 00:43:03,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:43:03,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:43:05,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:43:05,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 00:43:06,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:43:12,039 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 00:43:12,439 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 00:43:18,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:43:20,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:43:20,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 00:43:20,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:43:20,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:43:20,613 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=194600.0, ans=0.0 2023-09-29 00:43:21,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 00:43:25,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 00:43:25,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:43:27,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:43:31,052 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 00:43:32,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:43:32,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:43:40,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:43:41,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:43:41,756 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 00:43:41,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:43:42,089 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=194666.66666666666, ans=0.125 2023-09-29 00:43:44,802 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:43:44,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:43:52,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 00:43:52,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:43:53,558 INFO [train.py:1039] (2/4) Epoch 6, batch 2650, loss[loss=0.2165, simple_loss=0.2853, pruned_loss=0.07382, over 24578.00 frames. ], tot_loss[loss=0.2374, simple_loss=0.2995, pruned_loss=0.08763, over 4702189.66 frames. ], batch size: 60, lr: 1.70e-02, grad_scale: 16.0 2023-09-29 00:43:53,758 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 00:43:58,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 00:43:58,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:43:59,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 00:44:01,193 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 00:44:01,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:44:04,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:44:04,895 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.39 vs. limit=15.0 2023-09-29 00:44:08,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 00:44:10,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:44:12,417 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:44:12,769 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=194800.0, ans=0.125 2023-09-29 00:44:14,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 00:44:14,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 00:44:15,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:44:17,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 00:44:18,606 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 00:44:21,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:44:21,999 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 00:44:23,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:44:24,014 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 00:44:27,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:44:29,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 00:44:29,092 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:44:29,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:44:34,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 00:44:34,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 00:44:36,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 00:44:39,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 00:44:39,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:44:41,234 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:44:41,287 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 00:44:43,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:44:43,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:44:46,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:44:49,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:44:49,616 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:44:49,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 00:44:51,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:44:52,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:44:54,157 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.615e+02 2.176e+02 2.610e+02 3.276e+02 6.463e+02, threshold=5.220e+02, percent-clipped=8.0 2023-09-29 00:44:54,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:44:54,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:44:55,174 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten.whitening_limit, batch_count=194933.33333333334, ans=15.0 2023-09-29 00:44:56,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:44:56,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 00:45:03,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:45:03,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:45:03,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:45:03,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 00:45:06,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:45:08,280 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:45:09,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:45:11,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:45:12,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 00:45:12,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:45:14,219 INFO [train.py:1039] (2/4) Epoch 6, batch 2700, loss[loss=0.2583, simple_loss=0.307, pruned_loss=0.1049, over 23411.00 frames. ], tot_loss[loss=0.2368, simple_loss=0.2996, pruned_loss=0.08704, over 4704010.74 frames. ], batch size: 285, lr: 1.70e-02, grad_scale: 16.0 2023-09-29 00:45:15,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:45:15,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 00:45:18,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:45:21,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 00:45:22,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:45:22,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:45:22,878 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:45:24,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:45:24,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:45:24,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:45:24,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 00:45:24,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 00:45:24,622 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:45:26,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 00:45:28,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 00:45:29,699 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:45:32,686 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.39 vs. limit=22.5 2023-09-29 00:45:33,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 00:45:33,520 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=195133.33333333334, ans=0.0 2023-09-29 00:45:34,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 00:45:35,412 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.40 vs. limit=15.0 2023-09-29 00:45:35,500 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.43 vs. limit=15.0 2023-09-29 00:45:36,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:45:42,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:45:42,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:45:48,235 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:45:48,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:45:48,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:45:48,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 00:45:53,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:45:56,037 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=195200.0, ans=0.125 2023-09-29 00:45:57,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:45:57,225 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 00:45:57,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:46:01,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:46:01,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:46:10,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:46:12,150 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:46:12,978 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.30 vs. limit=10.0 2023-09-29 00:46:15,180 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:46:15,182 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:46:18,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:46:19,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:46:19,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:46:21,477 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:46:22,974 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:46:23,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:46:26,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:46:26,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:46:26,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:46:29,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 00:46:30,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:46:33,101 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:46:33,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 00:46:36,470 INFO [train.py:1039] (2/4) Epoch 6, batch 2750, loss[loss=0.235, simple_loss=0.3103, pruned_loss=0.07986, over 24689.00 frames. ], tot_loss[loss=0.2376, simple_loss=0.2993, pruned_loss=0.08794, over 4684909.00 frames. ], batch size: 73, lr: 1.70e-02, grad_scale: 16.0 2023-09-29 00:46:37,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 00:46:37,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:46:40,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:46:40,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:46:41,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:46:41,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 00:46:42,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:46:45,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:46:45,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 00:46:46,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:46:46,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:46:46,660 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 00:46:46,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:46:46,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:46:54,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 00:46:55,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:46:55,772 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:46:55,855 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:46:57,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 00:46:59,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:46:59,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:47:01,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:47:01,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:47:05,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 00:47:05,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 00:47:07,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:47:08,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:47:10,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 00:47:11,315 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=195533.33333333334, ans=0.0 2023-09-29 00:47:18,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:47:20,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 00:47:20,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:47:26,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:47:26,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 00:47:26,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:47:33,172 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:47:33,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:47:33,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 00:47:38,283 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.719e+02 2.212e+02 2.511e+02 3.083e+02 4.520e+02, threshold=5.022e+02, percent-clipped=0.0 2023-09-29 00:47:39,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:47:41,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 00:47:48,846 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 00:47:50,444 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:47:50,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 00:47:51,407 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.71 vs. limit=10.0 2023-09-29 00:47:52,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:47:53,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:47:53,702 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 00:47:53,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 00:47:56,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 00:47:58,246 INFO [train.py:1039] (2/4) Epoch 6, batch 2800, loss[loss=0.2328, simple_loss=0.2985, pruned_loss=0.08354, over 23389.00 frames. ], tot_loss[loss=0.2364, simple_loss=0.2983, pruned_loss=0.08718, over 4702190.77 frames. ], batch size: 93, lr: 1.70e-02, grad_scale: 32.0 2023-09-29 00:47:58,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:47:58,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:47:59,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 00:47:59,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:48:00,011 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:48:03,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:48:03,177 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 00:48:03,178 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 00:48:07,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:48:09,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 00:48:09,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:48:14,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:48:15,907 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 00:48:19,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 00:48:20,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 00:48:21,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:48:22,575 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:48:22,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:48:24,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:48:26,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:48:26,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 00:48:27,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:48:35,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:48:37,141 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:48:38,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:48:40,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:48:42,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:48:47,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:48:47,723 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 00:48:49,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:48:49,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:48:49,346 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:48:56,048 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:48:58,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:48:59,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:49:01,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:49:02,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:49:02,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 00:49:02,965 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 00:49:04,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 00:49:06,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:49:06,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 00:49:06,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:49:07,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:49:07,684 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:49:08,057 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=196000.0, ans=0.2 2023-09-29 00:49:09,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 00:49:10,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:49:10,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:49:12,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:49:13,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 00:49:19,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:49:19,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 00:49:19,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:49:21,322 INFO [train.py:1039] (2/4) Epoch 6, batch 2850, loss[loss=0.2336, simple_loss=0.3012, pruned_loss=0.08299, over 24423.00 frames. ], tot_loss[loss=0.2364, simple_loss=0.2986, pruned_loss=0.08711, over 4710756.57 frames. ], batch size: 77, lr: 1.69e-02, grad_scale: 32.0 2023-09-29 00:49:23,086 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:49:26,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:49:26,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:49:26,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:49:30,433 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:49:30,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:49:32,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 00:49:33,414 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 00:49:39,490 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 00:49:39,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:49:41,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 00:49:41,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:49:44,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 00:49:44,249 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 00:49:47,736 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:49:59,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:49:59,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:50:01,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:50:01,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 00:50:01,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 00:50:01,226 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 00:50:02,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 00:50:02,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 00:50:07,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 00:50:07,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:50:07,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:50:08,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:50:12,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:50:12,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:50:12,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:50:15,539 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:50:15,760 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:50:17,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:50:17,654 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=196266.66666666666, ans=0.1 2023-09-29 00:50:18,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:50:21,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:50:21,181 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=196266.66666666666, ans=0.2 2023-09-29 00:50:24,246 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.740e+02 2.055e+02 2.329e+02 2.690e+02 4.548e+02, threshold=4.658e+02, percent-clipped=0.0 2023-09-29 00:50:27,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:50:31,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 00:50:31,192 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 00:50:32,789 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 00:50:32,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:50:32,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 00:50:33,255 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=196333.33333333334, ans=0.04949747468305833 2023-09-29 00:50:34,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:50:34,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:50:34,521 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:50:34,553 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 00:50:34,553 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 00:50:34,623 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 00:50:34,631 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:50:36,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:50:42,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 00:50:42,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:50:42,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:50:43,054 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=196400.0, ans=0.125 2023-09-29 00:50:44,193 INFO [train.py:1039] (2/4) Epoch 6, batch 2900, loss[loss=0.2002, simple_loss=0.2734, pruned_loss=0.06352, over 24325.00 frames. ], tot_loss[loss=0.2365, simple_loss=0.2988, pruned_loss=0.0871, over 4710508.08 frames. ], batch size: 61, lr: 1.69e-02, grad_scale: 32.0 2023-09-29 00:50:44,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 00:50:47,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:50:47,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 00:50:47,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 00:50:48,647 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.40 vs. limit=15.0 2023-09-29 00:50:50,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:50:50,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 00:50:52,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:50:55,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:50:56,342 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=196400.0, ans=0.125 2023-09-29 00:50:59,577 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:50:59,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:51:02,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 00:51:02,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 00:51:04,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 00:51:04,470 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=196466.66666666666, ans=0.125 2023-09-29 00:51:06,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:51:09,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 00:51:10,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 00:51:14,599 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:51:14,612 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 00:51:14,636 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:51:17,701 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:51:17,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 00:51:20,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:51:20,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:51:23,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:51:25,561 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:51:27,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 00:51:27,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 00:51:27,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:51:32,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:51:36,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 00:51:36,158 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:51:37,835 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=196600.0, ans=0.1 2023-09-29 00:51:41,476 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer_ff2.min_abs, batch_count=196600.0, ans=0.1 2023-09-29 00:51:43,426 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:51:47,130 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.15 vs. limit=15.0 2023-09-29 00:51:52,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:51:52,706 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 00:51:54,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 00:51:57,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:51:57,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 00:51:58,920 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:52:00,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:52:04,444 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.19 vs. limit=6.0 2023-09-29 00:52:05,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:52:07,276 INFO [train.py:1039] (2/4) Epoch 6, batch 2950, loss[loss=0.2591, simple_loss=0.3106, pruned_loss=0.1038, over 23508.00 frames. ], tot_loss[loss=0.2391, simple_loss=0.3006, pruned_loss=0.08875, over 4700363.16 frames. ], batch size: 134, lr: 1.69e-02, grad_scale: 32.0 2023-09-29 00:52:07,505 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 00:52:07,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:52:07,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:52:11,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:52:12,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:52:14,702 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 00:52:14,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 00:52:14,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 00:52:14,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:52:21,808 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:52:23,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:52:24,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:52:24,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:52:28,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:52:29,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:52:30,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:52:32,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:52:32,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:52:33,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 00:52:40,913 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 00:52:41,573 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.90 vs. limit=15.0 2023-09-29 00:52:42,917 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 00:52:43,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:52:45,975 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 00:52:46,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 00:52:46,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:52:47,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:52:47,553 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 00:52:47,573 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 00:52:49,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 00:52:51,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:52:51,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:52:54,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:52:56,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:52:56,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:52:58,045 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 00:52:58,139 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:52:59,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 00:53:05,622 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:53:07,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:53:07,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 00:53:07,270 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:53:09,462 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.19 vs. limit=6.0 2023-09-29 00:53:10,061 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.711e+02 2.213e+02 2.464e+02 2.740e+02 4.622e+02, threshold=4.928e+02, percent-clipped=0.0 2023-09-29 00:53:10,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 00:53:11,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:53:15,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:53:15,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:53:15,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:53:15,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 00:53:17,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:53:17,770 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.76 vs. limit=15.0 2023-09-29 00:53:19,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:53:19,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:53:19,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 00:53:20,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:53:20,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:53:22,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:53:22,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 00:53:22,640 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=197000.0, ans=0.2 2023-09-29 00:53:25,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:53:26,431 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:53:27,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 00:53:30,686 INFO [train.py:1039] (2/4) Epoch 6, batch 3000, loss[loss=0.2447, simple_loss=0.3188, pruned_loss=0.0853, over 24455.00 frames. ], tot_loss[loss=0.2389, simple_loss=0.3008, pruned_loss=0.08847, over 4720438.32 frames. ], batch size: 69, lr: 1.69e-02, grad_scale: 32.0 2023-09-29 00:53:30,687 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-29 00:53:45,520 INFO [train.py:1071] (2/4) Epoch 6, validation: loss=0.3825, simple_loss=0.3275, pruned_loss=0.2187, over 1125622.00 frames. 2023-09-29 00:53:45,521 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21046MB 2023-09-29 00:53:46,518 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.48 vs. limit=22.5 2023-09-29 00:53:47,206 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 00:53:47,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 00:53:50,273 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:53:50,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:53:51,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 00:53:51,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:54:00,729 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 00:54:07,487 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=197133.33333333334, ans=0.0 2023-09-29 00:54:08,839 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:54:11,242 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.81 vs. limit=15.0 2023-09-29 00:54:14,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 00:54:17,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 00:54:19,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 00:54:19,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:54:21,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:54:23,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:54:23,414 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 00:54:26,852 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 00:54:28,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:54:28,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 00:54:30,240 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:54:30,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:54:32,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:54:32,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:54:37,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 00:54:37,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:54:37,751 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:54:39,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:54:42,528 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 00:54:42,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:54:42,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:54:44,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:54:48,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:54:48,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:54:50,339 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 00:54:50,399 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 00:54:50,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:54:50,496 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 00:54:50,569 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:54:53,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 00:54:57,271 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 00:54:57,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 00:54:57,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 00:54:59,402 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 00:54:59,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 00:55:00,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:55:02,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:55:02,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 00:55:02,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:55:03,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:55:07,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 00:55:09,407 INFO [train.py:1039] (2/4) Epoch 6, batch 3050, loss[loss=0.2386, simple_loss=0.2907, pruned_loss=0.0932, over 23427.00 frames. ], tot_loss[loss=0.2399, simple_loss=0.3017, pruned_loss=0.08908, over 4709317.92 frames. ], batch size: 285, lr: 1.69e-02, grad_scale: 32.0 2023-09-29 00:55:09,524 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:55:12,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:55:12,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:55:17,237 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:55:20,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 00:55:25,398 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=197466.66666666666, ans=0.125 2023-09-29 00:55:26,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 00:55:28,061 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 00:55:28,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:55:31,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:55:34,857 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:55:34,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:55:36,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:55:40,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:55:41,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 00:55:42,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:55:42,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:55:42,168 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:55:43,705 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:55:44,294 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.48 vs. limit=10.0 2023-09-29 00:55:47,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:55:47,591 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=197533.33333333334, ans=0.1 2023-09-29 00:55:48,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:55:49,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 00:55:50,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:55:50,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 00:55:53,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:55:55,128 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:55:56,575 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:55:56,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:56:02,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:56:04,016 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:56:10,218 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 2.122e+02 2.325e+02 2.738e+02 3.532e+02, threshold=4.649e+02, percent-clipped=0.0 2023-09-29 00:56:10,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:56:10,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:56:10,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:56:12,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:56:12,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 00:56:14,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:56:14,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 00:56:14,602 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=197666.66666666666, ans=10.0 2023-09-29 00:56:17,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:56:17,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:56:19,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 00:56:23,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:56:29,553 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:56:31,031 INFO [train.py:1039] (2/4) Epoch 6, batch 3100, loss[loss=0.2134, simple_loss=0.2869, pruned_loss=0.06989, over 24505.00 frames. ], tot_loss[loss=0.2385, simple_loss=0.3007, pruned_loss=0.08814, over 4721692.67 frames. ], batch size: 63, lr: 1.69e-02, grad_scale: 32.0 2023-09-29 00:56:31,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:56:34,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 00:56:35,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 00:56:37,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 00:56:40,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 00:56:43,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:56:44,922 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:56:44,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:56:45,200 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=197800.0, ans=0.0 2023-09-29 00:56:48,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 00:56:52,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:57:00,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 00:57:04,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 00:57:04,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:57:05,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:57:07,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:57:07,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 00:57:09,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:57:10,013 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 00:57:10,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:57:11,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:57:13,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 00:57:13,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:57:17,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:57:17,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 00:57:20,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 00:57:20,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:57:20,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:57:25,154 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:57:25,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:57:25,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:57:26,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 00:57:26,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:57:27,041 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=197933.33333333334, ans=0.2 2023-09-29 00:57:29,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:57:30,524 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:57:30,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:57:30,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 00:57:34,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:57:36,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 00:57:39,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:57:39,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 00:57:39,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:57:41,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:57:41,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 00:57:41,284 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=198000.0, ans=0.125 2023-09-29 00:57:49,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 00:57:51,983 INFO [train.py:1039] (2/4) Epoch 6, batch 3150, loss[loss=0.2411, simple_loss=0.2891, pruned_loss=0.09657, over 23775.00 frames. ], tot_loss[loss=0.2366, simple_loss=0.2987, pruned_loss=0.0872, over 4715922.06 frames. ], batch size: 179, lr: 1.69e-02, grad_scale: 16.0 2023-09-29 00:57:52,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:57:54,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:57:55,907 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:57:55,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:57:56,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 00:57:59,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:58:00,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 00:58:02,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 00:58:03,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:58:05,747 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 00:58:09,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 00:58:09,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:58:10,971 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 00:58:11,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 00:58:11,305 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=198133.33333333334, ans=0.0 2023-09-29 00:58:12,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 00:58:13,356 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.18 vs. limit=15.0 2023-09-29 00:58:14,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 00:58:14,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 00:58:14,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:58:14,117 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:58:15,661 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:58:17,217 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 00:58:20,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:58:20,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:58:20,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:58:23,190 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:58:26,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 00:58:28,449 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:58:28,720 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=198200.0, ans=0.125 2023-09-29 00:58:31,522 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:58:31,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:58:33,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 00:58:33,902 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=198200.0, ans=0.0 2023-09-29 00:58:35,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 00:58:36,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:58:36,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 00:58:37,410 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 00:58:37,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:58:37,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:58:39,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 00:58:39,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 00:58:40,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 00:58:42,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 00:58:42,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:58:43,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:58:43,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:58:45,230 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 00:58:46,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:58:48,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 00:58:48,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:58:48,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 00:58:49,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 00:58:52,781 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:58:52,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:58:54,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 00:58:54,541 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=198266.66666666666, ans=0.125 2023-09-29 00:58:55,606 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 2.072e+02 2.389e+02 2.889e+02 3.902e+02, threshold=4.779e+02, percent-clipped=0.0 2023-09-29 00:58:55,806 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 00:58:55,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:59:00,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:59:01,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:59:01,987 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:59:07,787 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:59:07,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:59:12,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 00:59:14,263 INFO [train.py:1039] (2/4) Epoch 6, batch 3200, loss[loss=0.2203, simple_loss=0.2781, pruned_loss=0.08124, over 23861.00 frames. ], tot_loss[loss=0.2358, simple_loss=0.2977, pruned_loss=0.08692, over 4711225.42 frames. ], batch size: 164, lr: 1.68e-02, grad_scale: 32.0 2023-09-29 00:59:18,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:59:18,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 00:59:21,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:59:23,045 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:59:23,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 00:59:24,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:59:29,419 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 00:59:29,545 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=198466.66666666666, ans=0.0 2023-09-29 00:59:31,059 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 00:59:35,262 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:59:44,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 00:59:55,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 00:59:56,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:00:00,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 01:00:00,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 01:00:04,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:00:04,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 01:00:05,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:00:07,751 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=198600.0, ans=0.0 2023-09-29 01:00:09,088 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 01:00:10,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 01:00:10,998 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=198600.0, ans=0.09899494936611666 2023-09-29 01:00:12,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 01:00:17,858 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 01:00:18,326 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=198666.66666666666, ans=0.0 2023-09-29 01:00:19,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:00:23,894 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=198666.66666666666, ans=0.125 2023-09-29 01:00:26,597 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:00:26,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 01:00:26,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:00:26,761 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 01:00:26,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 01:00:31,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:00:33,111 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 01:00:34,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 01:00:36,062 INFO [train.py:1039] (2/4) Epoch 6, batch 3250, loss[loss=0.2029, simple_loss=0.2789, pruned_loss=0.06341, over 24649.00 frames. ], tot_loss[loss=0.2356, simple_loss=0.298, pruned_loss=0.08664, over 4723222.40 frames. ], batch size: 60, lr: 1.68e-02, grad_scale: 32.0 2023-09-29 01:00:36,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 01:00:37,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 01:00:39,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:00:42,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 01:00:42,475 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 01:00:42,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:00:42,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:00:42,689 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 01:00:47,050 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=198733.33333333334, ans=0.125 2023-09-29 01:00:48,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:00:51,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:01:00,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:01:00,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 01:01:02,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:01:02,288 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:01:02,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:01:05,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 01:01:05,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:01:08,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:01:08,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 01:01:08,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:01:09,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:01:09,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:01:09,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:01:13,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:01:16,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 01:01:17,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:01:17,647 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:01:19,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:01:19,313 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:01:19,328 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:01:23,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 01:01:24,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:01:24,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:01:26,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:01:27,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:01:28,362 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.01 vs. limit=15.0 2023-09-29 01:01:32,129 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=198933.33333333334, ans=0.1 2023-09-29 01:01:34,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:01:41,424 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.784e+02 2.144e+02 2.444e+02 2.943e+02 3.918e+02, threshold=4.889e+02, percent-clipped=0.0 2023-09-29 01:01:41,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:01:42,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:01:42,994 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 01:01:43,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:01:43,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 01:01:44,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:01:46,844 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.72 vs. limit=22.5 2023-09-29 01:01:47,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 01:01:47,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 01:01:47,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:01:47,974 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=199000.0, ans=0.125 2023-09-29 01:01:49,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:01:50,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:01:52,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 01:01:52,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:01:53,233 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.25 vs. limit=15.0 2023-09-29 01:01:56,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:01:56,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:01:57,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 01:01:57,654 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:01:59,120 INFO [train.py:1039] (2/4) Epoch 6, batch 3300, loss[loss=0.2121, simple_loss=0.2753, pruned_loss=0.07447, over 23337.00 frames. ], tot_loss[loss=0.2358, simple_loss=0.2984, pruned_loss=0.08657, over 4736558.14 frames. ], batch size: 120, lr: 1.68e-02, grad_scale: 32.0 2023-09-29 01:02:00,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 01:02:00,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 01:02:02,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:02:03,437 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.64 vs. limit=15.0 2023-09-29 01:02:04,466 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 01:02:07,563 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 01:02:07,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 01:02:07,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:02:12,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:02:14,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:02:14,304 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:02:17,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 01:02:17,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 01:02:19,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:02:20,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:02:25,027 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 01:02:25,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:02:25,159 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:02:27,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:02:28,763 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 01:02:30,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:02:31,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 01:02:31,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:02:31,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:02:31,913 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 01:02:36,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:02:36,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 01:02:38,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:02:40,069 WARNING [train.py:1197] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 01:02:41,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 01:02:41,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:02:41,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:02:43,774 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 01:02:47,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 01:02:47,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:02:49,158 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=199266.66666666666, ans=0.125 2023-09-29 01:02:50,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 01:02:51,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:02:55,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 01:02:56,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:02:58,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:02:58,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:02:58,294 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:02:58,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 01:03:02,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:03:02,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:03:03,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:03:04,997 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 01:03:06,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 01:03:09,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 01:03:09,569 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:03:09,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:03:13,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:03:13,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:03:14,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 01:03:14,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:03:14,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 01:03:15,061 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=199333.33333333334, ans=0.125 2023-09-29 01:03:16,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:03:17,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 01:03:21,888 INFO [train.py:1039] (2/4) Epoch 6, batch 3350, loss[loss=0.2605, simple_loss=0.3074, pruned_loss=0.1068, over 23542.00 frames. ], tot_loss[loss=0.2367, simple_loss=0.2988, pruned_loss=0.08724, over 4735383.29 frames. ], batch size: 256, lr: 1.68e-02, grad_scale: 16.0 2023-09-29 01:03:21,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 01:03:22,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:03:23,516 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:03:25,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:03:25,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:03:27,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:03:29,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:03:29,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:03:32,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:03:34,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:03:34,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:03:37,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:03:39,426 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=199466.66666666666, ans=0.125 2023-09-29 01:03:40,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:03:40,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:03:42,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:03:43,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 01:03:45,992 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 01:03:46,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:03:49,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 01:03:49,054 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 01:03:50,558 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 01:03:50,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:03:52,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:03:52,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 01:03:52,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:03:52,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:03:55,594 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:03:57,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:03:57,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:04:00,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:04:03,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:04:05,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:04:06,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:04:08,503 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=199600.0, ans=0.2 2023-09-29 01:04:09,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:04:11,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:04:13,319 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:04:13,333 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:04:16,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:04:18,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 01:04:18,335 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 01:04:18,397 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 01:04:19,812 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:04:19,944 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 01:04:20,236 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=199600.0, ans=0.125 2023-09-29 01:04:21,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:04:23,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:04:25,672 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.657e+02 2.082e+02 2.289e+02 2.624e+02 4.671e+02, threshold=4.577e+02, percent-clipped=0.0 2023-09-29 01:04:31,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:04:32,954 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 01:04:33,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 01:04:34,548 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:04:34,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:04:35,046 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 01:04:39,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:04:42,779 INFO [train.py:1039] (2/4) Epoch 6, batch 3400, loss[loss=0.2678, simple_loss=0.3161, pruned_loss=0.1098, over 23407.00 frames. ], tot_loss[loss=0.2375, simple_loss=0.2996, pruned_loss=0.08772, over 4731467.93 frames. ], batch size: 285, lr: 1.68e-02, grad_scale: 16.0 2023-09-29 01:04:42,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 01:04:42,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 01:04:43,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:04:45,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:04:45,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 01:04:47,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:04:47,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 01:04:48,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:04:48,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:04:49,024 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 01:04:49,194 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=199733.33333333334, ans=0.0 2023-09-29 01:04:51,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:04:51,028 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 01:04:54,555 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=199733.33333333334, ans=0.125 2023-09-29 01:04:55,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 01:04:55,725 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 01:04:55,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:05:00,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:05:00,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 01:05:00,520 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:05:02,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 01:05:03,460 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=199800.0, ans=0.125 2023-09-29 01:05:07,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:05:09,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 01:05:13,966 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 01:05:14,532 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.81 vs. limit=15.0 2023-09-29 01:05:16,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:05:17,717 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:05:17,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 01:05:24,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:05:28,163 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.25 vs. limit=10.0 2023-09-29 01:05:29,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 01:05:32,470 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=199933.33333333334, ans=0.125 2023-09-29 01:05:35,496 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:05:37,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:05:38,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 01:05:38,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:05:39,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:05:39,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:05:41,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:05:44,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:05:48,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 01:05:48,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:05:56,991 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:05:58,672 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 01:06:02,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 01:06:06,476 INFO [train.py:1039] (2/4) Epoch 6, batch 3450, loss[loss=0.2789, simple_loss=0.3156, pruned_loss=0.1211, over 19824.00 frames. ], tot_loss[loss=0.2362, simple_loss=0.2985, pruned_loss=0.08697, over 4737590.79 frames. ], batch size: 388, lr: 1.68e-02, grad_scale: 16.0 2023-09-29 01:06:06,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 01:06:09,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 01:06:11,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:06:13,376 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:06:13,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 01:06:14,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:06:17,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 01:06:22,045 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=200133.33333333334, ans=0.1 2023-09-29 01:06:23,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:06:24,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:06:26,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:06:26,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:06:28,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:06:34,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 01:06:36,332 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=200133.33333333334, ans=0.125 2023-09-29 01:06:40,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 01:06:40,612 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 01:06:40,680 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:06:42,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:06:48,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 01:06:50,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 01:06:54,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:06:54,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:06:56,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 01:06:58,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:06:59,035 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.52 vs. limit=6.0 2023-09-29 01:06:59,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 01:06:59,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:07:01,404 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=200266.66666666666, ans=10.0 2023-09-29 01:07:03,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:07:04,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:07:06,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 01:07:11,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:07:12,878 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 2.101e+02 2.630e+02 3.255e+02 5.395e+02, threshold=5.260e+02, percent-clipped=4.0 2023-09-29 01:07:14,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:07:16,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:07:19,742 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:07:23,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:07:23,613 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:07:25,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:07:25,748 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:07:30,104 INFO [train.py:1039] (2/4) Epoch 6, batch 3500, loss[loss=0.2226, simple_loss=0.2548, pruned_loss=0.09521, over 19244.00 frames. ], tot_loss[loss=0.2353, simple_loss=0.2971, pruned_loss=0.08674, over 4725662.30 frames. ], batch size: 389, lr: 1.68e-02, grad_scale: 16.0 2023-09-29 01:07:30,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:07:35,200 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:07:35,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 01:07:36,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 01:07:39,866 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 01:07:41,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:07:41,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 01:07:44,872 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:07:47,922 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:07:48,157 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=200466.66666666666, ans=0.2 2023-09-29 01:07:49,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:07:49,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:07:49,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 01:07:49,864 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=200466.66666666666, ans=0.125 2023-09-29 01:07:51,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:07:51,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:07:51,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 01:07:55,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:07:55,261 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 01:07:58,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:08:01,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:08:03,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 01:08:03,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:08:07,188 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:08:07,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:08:08,898 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:08:09,395 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=200533.33333333334, ans=0.1 2023-09-29 01:08:10,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:08:10,507 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:08:12,070 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 01:08:12,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 01:08:13,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 01:08:13,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:08:15,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:08:16,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:08:16,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 01:08:19,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 01:08:21,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 01:08:21,523 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=200600.0, ans=0.125 2023-09-29 01:08:27,208 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:08:28,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 01:08:28,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 01:08:28,664 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:08:31,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:08:33,883 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:08:34,359 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=200600.0, ans=0.1 2023-09-29 01:08:35,487 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:08:35,813 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=200666.66666666666, ans=0.0 2023-09-29 01:08:39,208 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 01:08:40,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:08:42,345 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:08:43,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 01:08:45,391 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 01:08:45,736 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=200666.66666666666, ans=10.0 2023-09-29 01:08:47,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:08:48,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:08:48,586 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:08:48,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:08:48,762 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=200666.66666666666, ans=0.2 2023-09-29 01:08:52,026 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=200733.33333333334, ans=0.1 2023-09-29 01:08:53,068 INFO [train.py:1039] (2/4) Epoch 6, batch 3550, loss[loss=0.2656, simple_loss=0.3107, pruned_loss=0.1103, over 23848.00 frames. ], tot_loss[loss=0.2341, simple_loss=0.296, pruned_loss=0.08611, over 4726911.67 frames. ], batch size: 179, lr: 1.68e-02, grad_scale: 8.0 2023-09-29 01:08:53,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:08:56,622 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=200733.33333333334, ans=0.125 2023-09-29 01:09:02,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:09:03,561 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=9.26 vs. limit=15.0 2023-09-29 01:09:05,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 01:09:06,100 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=200733.33333333334, ans=0.125 2023-09-29 01:09:07,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:09:09,565 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:09:13,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:09:14,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:09:14,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 01:09:16,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:09:16,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:09:17,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:09:17,843 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 01:09:19,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 01:09:21,348 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=200800.0, ans=0.0 2023-09-29 01:09:24,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:09:24,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:09:27,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:09:27,048 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:09:28,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:09:28,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 01:09:28,477 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:09:30,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:09:31,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 01:09:37,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:09:37,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:09:39,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:09:40,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 01:09:40,992 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_abs, batch_count=200933.33333333334, ans=0.5 2023-09-29 01:09:42,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 01:09:42,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 01:09:44,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:09:45,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:09:46,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:09:50,430 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 01:09:51,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:09:59,416 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.687e+02 2.123e+02 2.427e+02 3.037e+02 5.186e+02, threshold=4.854e+02, percent-clipped=0.0 2023-09-29 01:09:59,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:10:01,134 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 01:10:01,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:10:04,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:10:04,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 01:10:11,291 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 01:10:12,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:10:12,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:10:14,579 INFO [train.py:1039] (2/4) Epoch 6, batch 3600, loss[loss=0.2536, simple_loss=0.3064, pruned_loss=0.1004, over 23793.00 frames. ], tot_loss[loss=0.2336, simple_loss=0.2963, pruned_loss=0.08543, over 4734777.60 frames. ], batch size: 212, lr: 1.67e-02, grad_scale: 16.0 2023-09-29 01:10:16,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:10:16,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:10:18,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:10:23,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:10:25,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:10:25,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:10:25,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:10:26,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:10:26,824 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 01:10:31,358 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 01:10:32,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:10:37,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:10:37,707 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:10:39,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 01:10:40,690 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:10:40,740 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 01:10:42,186 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:10:45,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:10:47,335 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:10:49,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:10:51,239 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:10:51,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:10:53,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 01:10:57,584 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=201200.0, ans=0.0 2023-09-29 01:11:02,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:11:03,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 01:11:03,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 01:11:08,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:11:14,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:11:16,356 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:11:24,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 01:11:25,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:11:25,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 01:11:27,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 01:11:27,203 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 01:11:30,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:11:32,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:11:34,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 01:11:34,331 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:11:34,541 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=201333.33333333334, ans=0.0 2023-09-29 01:11:35,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:11:35,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:11:35,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 01:11:35,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 01:11:38,733 INFO [train.py:1039] (2/4) Epoch 6, batch 3650, loss[loss=0.2448, simple_loss=0.3042, pruned_loss=0.09264, over 23454.00 frames. ], tot_loss[loss=0.2346, simple_loss=0.2971, pruned_loss=0.08605, over 4733922.02 frames. ], batch size: 120, lr: 1.67e-02, grad_scale: 16.0 2023-09-29 01:11:40,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:11:41,812 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 01:11:43,752 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=201400.0, ans=0.125 2023-09-29 01:11:46,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 01:11:46,733 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:11:51,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 01:11:51,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 01:11:55,081 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=201466.66666666666, ans=0.125 2023-09-29 01:11:56,452 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:11:56,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 01:11:58,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:12:01,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 01:12:01,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:12:01,823 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=201466.66666666666, ans=0.125 2023-09-29 01:12:03,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 01:12:03,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:12:03,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:12:05,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 01:12:05,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 01:12:07,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:12:07,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:12:09,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:12:12,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 01:12:12,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 01:12:14,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:12:15,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 01:12:17,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:12:17,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:12:17,517 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=201533.33333333334, ans=0.0 2023-09-29 01:12:21,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 01:12:25,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:12:25,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 01:12:25,801 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=201533.33333333334, ans=0.0 2023-09-29 01:12:26,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 01:12:27,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:12:30,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:12:32,399 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:12:33,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:12:33,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:12:35,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 01:12:36,907 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:12:36,999 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:12:44,494 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 01:12:47,280 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.749e+02 2.133e+02 2.417e+02 2.802e+02 4.868e+02, threshold=4.835e+02, percent-clipped=1.0 2023-09-29 01:12:50,211 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:12:50,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:12:51,812 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 01:12:51,879 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:12:51,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 01:12:52,338 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=201666.66666666666, ans=0.125 2023-09-29 01:12:53,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:12:55,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 01:12:55,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:12:58,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 01:12:59,935 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:13:01,760 INFO [train.py:1039] (2/4) Epoch 6, batch 3700, loss[loss=0.2245, simple_loss=0.2868, pruned_loss=0.08115, over 18595.00 frames. ], tot_loss[loss=0.2357, simple_loss=0.2977, pruned_loss=0.08681, over 4710338.29 frames. ], batch size: 40, lr: 1.67e-02, grad_scale: 16.0 2023-09-29 01:13:01,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:13:04,111 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:13:04,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 01:13:04,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:13:05,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 01:13:07,020 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 01:13:10,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 01:13:12,328 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.67 vs. limit=22.5 2023-09-29 01:13:13,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:13:13,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:13:15,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:13:17,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:13:17,477 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 01:13:17,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:13:19,246 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 01:13:19,494 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=201800.0, ans=0.125 2023-09-29 01:13:28,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:13:28,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 01:13:29,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:13:29,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 01:13:29,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:13:35,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:13:36,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 01:13:38,139 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:13:39,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:13:42,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:13:42,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 01:13:44,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 01:13:48,142 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:13:48,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 01:13:48,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:13:48,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 01:13:53,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:13:54,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:13:56,615 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=201933.33333333334, ans=10.0 2023-09-29 01:13:59,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:13:59,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 01:14:02,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:14:02,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 01:14:02,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:14:02,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:14:07,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:14:07,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 01:14:09,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 01:14:09,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:14:10,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:14:11,408 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=202000.0, ans=0.125 2023-09-29 01:14:12,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:14:14,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 01:14:15,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:14:17,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:14:18,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:14:20,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 01:14:23,837 INFO [train.py:1039] (2/4) Epoch 6, batch 3750, loss[loss=0.2382, simple_loss=0.2854, pruned_loss=0.09554, over 23782.00 frames. ], tot_loss[loss=0.2397, simple_loss=0.301, pruned_loss=0.08913, over 4700699.24 frames. ], batch size: 232, lr: 1.67e-02, grad_scale: 16.0 2023-09-29 01:14:23,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 01:14:25,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 01:14:27,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 01:14:27,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:14:29,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:14:30,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:14:30,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:14:35,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:14:38,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:14:40,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 01:14:44,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:14:48,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:14:50,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 01:14:50,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:14:51,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:14:53,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:14:55,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 01:15:00,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 01:15:01,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:15:01,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:15:03,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:15:08,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:15:10,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 01:15:16,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 01:15:17,491 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.23 vs. limit=12.0 2023-09-29 01:15:19,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:15:22,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:15:22,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:15:26,655 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 01:15:26,909 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=202266.66666666666, ans=10.0 2023-09-29 01:15:30,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 01:15:31,939 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.739e+02 2.301e+02 2.601e+02 3.130e+02 4.781e+02, threshold=5.202e+02, percent-clipped=0.0 2023-09-29 01:15:32,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 01:15:35,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 01:15:36,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:15:39,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 01:15:45,103 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=202400.0, ans=0.125 2023-09-29 01:15:46,255 INFO [train.py:1039] (2/4) Epoch 6, batch 3800, loss[loss=0.2393, simple_loss=0.3139, pruned_loss=0.08236, over 24682.00 frames. ], tot_loss[loss=0.2382, simple_loss=0.3003, pruned_loss=0.08807, over 4702481.04 frames. ], batch size: 73, lr: 1.67e-02, grad_scale: 16.0 2023-09-29 01:15:48,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:15:53,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:15:55,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 01:15:55,379 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 01:15:58,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:15:58,477 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:16:00,016 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 01:16:01,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 01:16:01,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:16:03,324 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:16:04,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:16:05,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:16:05,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:16:07,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 01:16:10,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 01:16:10,526 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:16:13,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:16:15,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:16:17,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 01:16:19,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 01:16:19,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:16:20,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:16:22,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:16:24,923 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=202533.33333333334, ans=0.05 2023-09-29 01:16:27,846 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=202533.33333333334, ans=0.0 2023-09-29 01:16:28,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 01:16:28,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 01:16:30,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:16:38,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:16:42,707 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=17.88 vs. limit=22.5 2023-09-29 01:16:43,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:16:46,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 01:16:48,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 01:16:48,323 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:16:51,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:16:51,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:16:55,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 01:16:57,385 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 01:16:59,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 01:17:00,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 01:17:00,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:17:02,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:17:07,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:17:08,794 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 01:17:10,312 INFO [train.py:1039] (2/4) Epoch 6, batch 3850, loss[loss=0.2296, simple_loss=0.2962, pruned_loss=0.0815, over 24640.00 frames. ], tot_loss[loss=0.2366, simple_loss=0.2984, pruned_loss=0.08743, over 4685296.84 frames. ], batch size: 65, lr: 1.67e-02, grad_scale: 16.0 2023-09-29 01:17:14,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:17:15,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 01:17:18,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:17:18,600 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:17:23,096 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 01:17:24,861 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:17:27,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 01:17:27,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 01:17:35,412 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:17:37,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:17:40,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:17:40,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 01:17:43,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:17:43,309 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:17:44,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:17:44,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 01:17:46,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:17:49,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:17:51,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:17:51,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:17:51,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 01:17:51,260 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 01:17:51,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:17:51,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:17:55,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:17:55,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:17:57,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 01:17:58,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 01:18:00,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:18:02,118 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 01:18:02,372 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=202933.33333333334, ans=0.0 2023-09-29 01:18:05,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 01:18:09,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:18:11,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:18:17,503 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.828e+02 2.249e+02 2.602e+02 3.151e+02 5.214e+02, threshold=5.203e+02, percent-clipped=1.0 2023-09-29 01:18:17,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:18:17,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 01:18:19,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 01:18:23,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:18:23,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:18:26,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 01:18:26,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 01:18:27,528 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:18:27,654 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:18:27,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:18:29,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 01:18:29,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:18:30,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 01:18:30,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:18:30,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:18:31,157 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=203066.66666666666, ans=0.125 2023-09-29 01:18:32,221 INFO [train.py:1039] (2/4) Epoch 6, batch 3900, loss[loss=0.2361, simple_loss=0.2906, pruned_loss=0.09078, over 23845.00 frames. ], tot_loss[loss=0.2347, simple_loss=0.2965, pruned_loss=0.08643, over 4699383.68 frames. ], batch size: 195, lr: 1.67e-02, grad_scale: 16.0 2023-09-29 01:18:32,590 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=203066.66666666666, ans=0.035 2023-09-29 01:18:33,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:18:33,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:18:36,840 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:18:36,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:18:36,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:18:38,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:18:38,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 01:18:38,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:18:44,331 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:18:44,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 01:18:44,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:18:44,688 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=203066.66666666666, ans=0.125 2023-09-29 01:18:46,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:18:49,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 01:18:49,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:18:51,225 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 01:18:52,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 01:18:52,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:18:54,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 01:18:54,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:18:55,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 01:18:58,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 01:18:58,351 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=203133.33333333334, ans=0.125 2023-09-29 01:18:59,933 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=203133.33333333334, ans=0.0 2023-09-29 01:19:02,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:19:02,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:19:04,075 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:19:04,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:19:08,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:19:11,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:19:13,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:19:13,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:19:13,795 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:19:21,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:19:21,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:19:30,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 01:19:33,658 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:19:35,551 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=203266.66666666666, ans=0.125 2023-09-29 01:19:38,779 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=203333.33333333334, ans=0.125 2023-09-29 01:19:41,736 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:19:44,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:19:44,896 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 01:19:46,349 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 01:19:46,370 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:19:46,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 01:19:50,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:19:50,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 01:19:55,234 INFO [train.py:1039] (2/4) Epoch 6, batch 3950, loss[loss=0.2133, simple_loss=0.2771, pruned_loss=0.07473, over 21620.00 frames. ], tot_loss[loss=0.2341, simple_loss=0.2955, pruned_loss=0.08633, over 4678484.60 frames. ], batch size: 47, lr: 1.66e-02, grad_scale: 16.0 2023-09-29 01:19:58,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:20:01,883 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 01:20:01,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:20:05,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:20:07,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:20:13,270 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 01:20:13,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:20:14,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 01:20:14,870 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 01:20:14,907 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:20:17,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:20:17,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 01:20:17,942 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:20:20,993 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 01:20:24,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:20:25,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:20:25,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:20:25,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:20:26,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:20:26,794 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=203533.33333333334, ans=0.1 2023-09-29 01:20:26,817 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=203533.33333333334, ans=0.125 2023-09-29 01:20:35,011 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=203533.33333333334, ans=0.07 2023-09-29 01:20:36,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:20:38,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:20:42,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 01:20:47,625 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 01:20:47,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 01:20:47,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:20:47,964 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=203600.0, ans=0.2 2023-09-29 01:20:49,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:20:59,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:20:59,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 01:21:00,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:21:00,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:21:00,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 01:21:02,045 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.704e+02 2.135e+02 2.350e+02 2.654e+02 4.554e+02, threshold=4.701e+02, percent-clipped=0.0 2023-09-29 01:21:06,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:21:08,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:21:12,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 01:21:17,451 INFO [train.py:1039] (2/4) Epoch 6, batch 4000, loss[loss=0.2882, simple_loss=0.3281, pruned_loss=0.1241, over 22729.00 frames. ], tot_loss[loss=0.2341, simple_loss=0.296, pruned_loss=0.08613, over 4689960.36 frames. ], batch size: 322, lr: 1.66e-02, grad_scale: 32.0 2023-09-29 01:21:22,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:21:32,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:21:33,336 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 01:21:37,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:21:37,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:21:37,369 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:21:37,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 01:21:37,955 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=14.89 vs. limit=15.0 2023-09-29 01:21:38,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 01:21:40,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 01:21:40,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 01:21:40,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 01:21:41,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:21:47,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:21:47,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:21:47,229 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:21:47,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:21:47,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 01:21:48,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:21:52,321 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 01:21:53,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:21:53,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:21:56,955 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 01:21:57,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 01:21:57,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:22:04,672 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 01:22:06,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:22:07,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:22:09,293 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 01:22:10,852 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:22:12,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 01:22:12,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:22:13,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:22:13,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:22:15,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:22:15,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 01:22:16,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:22:18,731 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 01:22:19,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 01:22:19,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:22:22,203 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 01:22:27,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:22:30,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 01:22:31,041 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=204000.0, ans=0.1 2023-09-29 01:22:33,842 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=204000.0, ans=0.125 2023-09-29 01:22:34,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:22:35,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:22:35,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:22:36,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:22:39,597 INFO [train.py:1039] (2/4) Epoch 6, batch 4050, loss[loss=0.2131, simple_loss=0.284, pruned_loss=0.07112, over 24337.00 frames. ], tot_loss[loss=0.2361, simple_loss=0.2973, pruned_loss=0.08748, over 4693808.26 frames. ], batch size: 61, lr: 1.66e-02, grad_scale: 32.0 2023-09-29 01:22:45,215 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:22:46,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 01:22:47,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 01:22:49,959 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 01:22:49,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:22:51,468 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:22:52,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 01:22:54,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:22:58,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:22:59,951 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:23:00,046 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 01:23:00,300 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=204133.33333333334, ans=0.125 2023-09-29 01:23:01,715 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=204133.33333333334, ans=0.125 2023-09-29 01:23:02,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:23:03,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:23:08,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:23:09,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 01:23:12,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 01:23:15,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 01:23:15,153 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 01:23:16,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:23:23,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 01:23:24,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:23:29,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:23:33,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:23:33,105 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:23:33,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:23:34,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:23:38,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 01:23:38,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 01:23:41,567 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:23:43,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 01:23:48,276 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.708e+02 2.126e+02 2.466e+02 2.913e+02 5.658e+02, threshold=4.933e+02, percent-clipped=1.0 2023-09-29 01:23:48,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:23:56,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 01:23:56,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:23:56,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 01:23:58,455 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=204333.33333333334, ans=0.125 2023-09-29 01:23:59,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 01:23:59,765 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 01:23:59,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:24:02,554 INFO [train.py:1039] (2/4) Epoch 6, batch 4100, loss[loss=0.233, simple_loss=0.3061, pruned_loss=0.07993, over 24597.00 frames. ], tot_loss[loss=0.2368, simple_loss=0.2984, pruned_loss=0.08767, over 4694002.84 frames. ], batch size: 71, lr: 1.66e-02, grad_scale: 32.0 2023-09-29 01:24:02,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:24:04,302 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:24:04,329 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:24:11,735 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.44 vs. limit=22.5 2023-09-29 01:24:12,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 01:24:14,693 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 01:24:16,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 01:24:17,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 01:24:17,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:24:17,837 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:24:17,890 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:24:19,449 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 01:24:19,574 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 01:24:23,164 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:24:24,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:24:24,678 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:24:24,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 01:24:29,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 01:24:31,389 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:24:31,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:24:31,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 01:24:31,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:24:32,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:24:32,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:24:32,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:24:33,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 01:24:34,857 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:24:36,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 01:24:37,972 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:24:38,756 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.31 vs. limit=6.0 2023-09-29 01:24:41,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:24:41,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 01:24:44,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:24:44,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:24:44,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:24:46,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 01:24:47,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 01:24:47,195 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=204533.33333333334, ans=0.2 2023-09-29 01:24:47,205 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=204533.33333333334, ans=0.0 2023-09-29 01:24:48,579 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 01:24:50,268 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 01:24:50,463 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=204533.33333333334, ans=0.1 2023-09-29 01:24:50,514 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=204533.33333333334, ans=0.0 2023-09-29 01:24:51,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:24:51,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 01:24:54,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:25:00,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:25:05,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:25:06,654 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:25:15,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:25:15,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:25:17,123 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys.whitening_limit, batch_count=204666.66666666666, ans=6.0 2023-09-29 01:25:21,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:25:23,129 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=204666.66666666666, ans=0.125 2023-09-29 01:25:24,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:25:25,897 INFO [train.py:1039] (2/4) Epoch 6, batch 4150, loss[loss=0.2083, simple_loss=0.2773, pruned_loss=0.06963, over 24577.00 frames. ], tot_loss[loss=0.2365, simple_loss=0.2984, pruned_loss=0.0873, over 4708295.82 frames. ], batch size: 60, lr: 1.66e-02, grad_scale: 32.0 2023-09-29 01:25:27,710 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 01:25:29,246 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 01:25:29,497 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=204733.33333333334, ans=0.0 2023-09-29 01:25:30,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:25:30,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:25:34,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 01:25:34,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:25:35,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 01:25:37,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 01:25:37,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 01:25:39,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:25:44,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:25:44,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:25:48,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:25:49,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:25:50,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 01:25:52,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 01:25:52,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:25:54,133 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 01:25:58,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:26:02,709 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:26:02,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 01:26:07,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 01:26:07,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:26:07,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 01:26:07,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:26:07,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:26:07,656 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=204866.66666666666, ans=0.0 2023-09-29 01:26:10,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:26:11,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:26:17,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 01:26:17,934 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 01:26:21,667 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 01:26:23,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:26:24,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 01:26:24,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:26:26,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 01:26:29,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 01:26:29,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:26:31,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:26:33,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 01:26:33,155 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:26:33,158 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 01:26:33,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 01:26:34,518 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.777e+02 2.193e+02 2.474e+02 2.867e+02 4.434e+02, threshold=4.949e+02, percent-clipped=0.0 2023-09-29 01:26:36,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 01:26:36,404 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:26:36,411 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 01:26:36,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 01:26:37,885 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 01:26:37,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:26:37,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 01:26:39,504 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:26:42,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:26:42,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 01:26:43,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 01:26:49,014 INFO [train.py:1039] (2/4) Epoch 6, batch 4200, loss[loss=0.2329, simple_loss=0.3117, pruned_loss=0.07698, over 24387.00 frames. ], tot_loss[loss=0.2348, simple_loss=0.2969, pruned_loss=0.08632, over 4711305.41 frames. ], batch size: 77, lr: 1.66e-02, grad_scale: 32.0 2023-09-29 01:26:49,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:26:50,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 01:26:54,645 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 01:26:56,218 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:26:57,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:26:57,865 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:26:57,868 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:26:59,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 01:27:04,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 01:27:04,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:27:07,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:27:07,927 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=205133.33333333334, ans=0.125 2023-09-29 01:27:09,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:27:12,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 01:27:13,731 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:27:13,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:27:14,330 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.66 vs. limit=15.0 2023-09-29 01:27:15,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 01:27:15,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:27:17,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:27:17,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:27:17,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 01:27:19,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 01:27:21,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 01:27:22,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:27:27,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 01:27:29,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:27:31,116 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 01:27:32,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:27:33,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:27:36,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:27:36,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 01:27:36,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:27:36,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:27:39,528 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=205266.66666666666, ans=0.0 2023-09-29 01:27:42,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 01:27:45,341 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:27:48,703 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=205266.66666666666, ans=0.125 2023-09-29 01:27:52,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:27:55,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 01:27:57,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:28:02,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 01:28:04,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:28:05,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 01:28:11,989 INFO [train.py:1039] (2/4) Epoch 6, batch 4250, loss[loss=0.229, simple_loss=0.3038, pruned_loss=0.07709, over 24457.00 frames. ], tot_loss[loss=0.2328, simple_loss=0.2955, pruned_loss=0.08509, over 4724564.39 frames. ], batch size: 69, lr: 1.66e-02, grad_scale: 16.0 2023-09-29 01:28:12,105 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 01:28:16,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:28:16,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 01:28:18,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:28:23,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 01:28:23,472 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 01:28:23,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:28:23,692 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=205400.0, ans=0.2 2023-09-29 01:28:27,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:28:29,509 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=205466.66666666666, ans=0.125 2023-09-29 01:28:30,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:28:31,062 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer_ff3.min_abs, batch_count=205466.66666666666, ans=0.2 2023-09-29 01:28:35,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:28:37,368 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:28:38,910 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:28:38,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:28:40,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:28:42,019 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:28:42,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:28:43,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:28:45,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:28:47,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 01:28:49,051 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=205533.33333333334, ans=0.0 2023-09-29 01:28:51,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 01:28:51,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:28:53,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:28:53,370 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:28:53,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:28:53,528 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:28:55,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:28:58,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 01:28:58,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:29:03,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:29:05,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:29:07,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 01:29:07,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:29:08,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 01:29:08,997 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=205600.0, ans=0.125 2023-09-29 01:29:10,715 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:29:12,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 01:29:13,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:29:13,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:29:16,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 01:29:17,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 01:29:18,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 01:29:21,778 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 2.147e+02 2.416e+02 2.924e+02 5.280e+02, threshold=4.831e+02, percent-clipped=2.0 2023-09-29 01:29:22,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:29:25,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:29:26,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:29:26,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:29:29,038 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:29:30,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:29:33,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:29:33,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 01:29:35,411 INFO [train.py:1039] (2/4) Epoch 6, batch 4300, loss[loss=0.2474, simple_loss=0.2994, pruned_loss=0.0977, over 23287.00 frames. ], tot_loss[loss=0.2322, simple_loss=0.2948, pruned_loss=0.08484, over 4725359.18 frames. ], batch size: 119, lr: 1.66e-02, grad_scale: 16.0 2023-09-29 01:29:35,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:29:41,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:29:41,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:29:45,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:29:56,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:29:56,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 01:29:56,313 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:29:59,298 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 01:29:59,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 01:29:59,380 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 01:30:02,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 01:30:06,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 01:30:09,140 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 01:30:09,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:30:09,231 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 01:30:12,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 01:30:13,353 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=205866.66666666666, ans=0.2 2023-09-29 01:30:14,334 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:30:17,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:30:17,383 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:30:17,521 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 01:30:19,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:30:19,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:30:21,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 01:30:21,402 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 01:30:24,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:30:28,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:30:28,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 01:30:28,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:30:28,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:30:28,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 01:30:28,822 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 01:30:28,912 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 01:30:31,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:30:31,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 01:30:31,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 01:30:35,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:30:35,939 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 01:30:37,358 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:30:39,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:30:39,440 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:30:41,057 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 01:30:43,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 01:30:43,149 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:30:44,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:30:44,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:30:44,665 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:30:46,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:30:49,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:30:50,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:30:52,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:30:55,383 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.98 vs. limit=15.0 2023-09-29 01:30:57,154 INFO [train.py:1039] (2/4) Epoch 6, batch 4350, loss[loss=0.2447, simple_loss=0.308, pruned_loss=0.0907, over 23261.00 frames. ], tot_loss[loss=0.2345, simple_loss=0.2966, pruned_loss=0.08622, over 4716895.76 frames. ], batch size: 105, lr: 1.65e-02, grad_scale: 16.0 2023-09-29 01:30:58,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 01:30:58,871 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 01:30:59,200 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=206066.66666666666, ans=0.125 2023-09-29 01:31:04,927 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:31:06,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:31:11,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:31:11,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:31:17,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:31:20,254 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:31:23,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:31:23,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:31:27,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 01:31:30,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:31:32,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 01:31:37,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 01:31:37,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:31:39,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:31:43,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:31:44,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 01:31:50,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:31:50,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 01:31:54,691 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 01:31:56,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:31:56,353 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 01:31:57,801 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 01:31:59,240 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 01:31:59,258 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:32:00,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:32:02,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:32:03,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:32:04,647 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:32:04,716 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:32:06,029 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.725e+02 2.197e+02 2.443e+02 2.898e+02 4.711e+02, threshold=4.887e+02, percent-clipped=0.0 2023-09-29 01:32:07,732 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 01:32:07,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:32:07,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:32:07,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:32:07,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 01:32:09,432 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 01:32:09,439 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 01:32:09,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 01:32:12,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:32:12,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:32:13,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:32:15,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:32:15,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 01:32:18,820 INFO [train.py:1039] (2/4) Epoch 6, batch 4400, loss[loss=0.231, simple_loss=0.2852, pruned_loss=0.08841, over 23454.00 frames. ], tot_loss[loss=0.2353, simple_loss=0.2977, pruned_loss=0.08646, over 4725675.19 frames. ], batch size: 120, lr: 1.65e-02, grad_scale: 32.0 2023-09-29 01:32:19,021 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 01:32:19,033 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:32:23,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:32:23,471 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:32:25,832 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:32:27,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 01:32:28,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 01:32:28,863 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 01:32:28,893 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 01:32:30,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 01:32:30,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:32:32,120 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 01:32:33,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:32:34,339 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.61 vs. limit=10.0 2023-09-29 01:32:35,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:32:35,314 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 01:32:38,935 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:32:38,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 01:32:40,303 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 01:32:43,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 01:32:43,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 01:32:43,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 01:32:43,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:32:45,303 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:32:45,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:32:46,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:32:49,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 01:32:49,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 01:32:51,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:32:53,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:32:53,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:32:54,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:32:54,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:32:54,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 01:32:57,930 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 01:33:00,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:33:06,821 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:33:10,180 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 01:33:14,757 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 01:33:19,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:33:20,889 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:33:21,065 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=206600.0, ans=0.125 2023-09-29 01:33:22,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 01:33:22,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:33:22,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 01:33:22,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:33:22,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 01:33:29,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 01:33:33,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 01:33:34,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 01:33:34,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:33:34,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 01:33:36,833 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:33:40,035 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:33:41,386 INFO [train.py:1039] (2/4) Epoch 6, batch 4450, loss[loss=0.2572, simple_loss=0.3049, pruned_loss=0.1047, over 23688.00 frames. ], tot_loss[loss=0.2355, simple_loss=0.2981, pruned_loss=0.08648, over 4722619.71 frames. ], batch size: 164, lr: 1.65e-02, grad_scale: 16.0 2023-09-29 01:33:41,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 01:33:44,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:33:48,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:33:49,521 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 01:33:54,460 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:33:54,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:33:59,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:34:00,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:34:04,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:34:05,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:34:05,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 01:34:05,658 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:34:07,782 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:34:07,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:34:07,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 01:34:11,510 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 01:34:16,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:34:17,552 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:34:19,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:34:20,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:34:20,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:34:26,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 01:34:26,169 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 01:34:26,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 01:34:26,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:34:29,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:34:29,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 01:34:32,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 01:34:36,490 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:34:37,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 01:34:37,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:34:37,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:34:37,927 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:34:37,948 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:34:40,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:34:41,015 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=206933.33333333334, ans=15.0 2023-09-29 01:34:45,222 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 01:34:46,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 01:34:46,988 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=207000.0, ans=0.125 2023-09-29 01:34:48,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 01:34:49,812 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:34:49,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:34:52,743 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 2.158e+02 2.416e+02 2.828e+02 3.801e+02, threshold=4.831e+02, percent-clipped=0.0 2023-09-29 01:34:52,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:34:52,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 01:34:54,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 01:34:54,888 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=207000.0, ans=0.0 2023-09-29 01:34:58,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 01:35:01,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 01:35:04,023 INFO [train.py:1039] (2/4) Epoch 6, batch 4500, loss[loss=0.243, simple_loss=0.3193, pruned_loss=0.08328, over 24642.00 frames. ], tot_loss[loss=0.2364, simple_loss=0.2992, pruned_loss=0.0868, over 4720750.38 frames. ], batch size: 73, lr: 1.65e-02, grad_scale: 16.0 2023-09-29 01:35:05,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:35:07,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 01:35:07,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 01:35:08,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:35:16,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:35:16,160 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:35:16,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 01:35:18,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:35:18,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:35:18,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:35:32,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:35:34,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:35:36,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:35:36,096 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:35:37,616 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:35:39,972 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=15.59 vs. limit=15.0 2023-09-29 01:35:43,727 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 01:35:47,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:35:51,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 01:35:54,933 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:35:54,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 01:35:57,853 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:35:57,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:35:59,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:35:59,562 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:36:01,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:36:01,892 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 01:36:01,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 01:36:01,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:36:06,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:36:06,639 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 01:36:06,905 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=207266.66666666666, ans=0.125 2023-09-29 01:36:09,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:36:11,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 01:36:11,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:36:14,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 01:36:15,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 01:36:15,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 01:36:20,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 01:36:23,992 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 01:36:25,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 01:36:26,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:36:27,958 INFO [train.py:1039] (2/4) Epoch 6, batch 4550, loss[loss=0.2278, simple_loss=0.2983, pruned_loss=0.07866, over 24515.00 frames. ], tot_loss[loss=0.2348, simple_loss=0.2973, pruned_loss=0.08618, over 4716531.50 frames. ], batch size: 63, lr: 1.65e-02, grad_scale: 16.0 2023-09-29 01:36:29,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:36:29,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:36:31,619 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=207400.0, ans=0.125 2023-09-29 01:36:34,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:36:36,565 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=207400.0, ans=0.035 2023-09-29 01:36:39,547 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=207400.0, ans=0.1 2023-09-29 01:36:40,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:36:42,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:36:45,452 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:36:45,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:36:45,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:36:47,264 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:36:47,325 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:36:47,499 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=207466.66666666666, ans=0.04949747468305833 2023-09-29 01:36:51,311 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.48 vs. limit=15.0 2023-09-29 01:36:51,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:36:55,627 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 01:36:55,725 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 01:36:55,774 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=207466.66666666666, ans=0.1 2023-09-29 01:36:57,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:36:58,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 01:37:02,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 01:37:02,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:37:04,941 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.55 vs. limit=10.0 2023-09-29 01:37:05,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 01:37:08,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 01:37:10,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:37:12,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:37:12,626 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:37:14,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 01:37:16,113 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=207600.0, ans=0.1 2023-09-29 01:37:17,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:37:20,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:37:20,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:37:20,728 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=207600.0, ans=0.0 2023-09-29 01:37:21,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:37:23,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 01:37:24,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 01:37:25,019 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:37:25,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 01:37:26,812 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 01:37:26,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:37:27,442 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=11.81 vs. limit=15.0 2023-09-29 01:37:28,845 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:37:28,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:37:32,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:37:32,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:37:33,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 01:37:33,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 01:37:37,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:37:37,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 01:37:37,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 01:37:37,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:37:37,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 01:37:39,083 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.597e+02 2.083e+02 2.307e+02 2.767e+02 3.692e+02, threshold=4.614e+02, percent-clipped=0.0 2023-09-29 01:37:42,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:37:42,246 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:37:42,569 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=207666.66666666666, ans=0.125 2023-09-29 01:37:44,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:37:45,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:37:46,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 01:37:47,549 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:37:50,410 INFO [train.py:1039] (2/4) Epoch 6, batch 4600, loss[loss=0.2363, simple_loss=0.2903, pruned_loss=0.09111, over 23456.00 frames. ], tot_loss[loss=0.2327, simple_loss=0.2948, pruned_loss=0.08537, over 4701913.69 frames. ], batch size: 134, lr: 1.65e-02, grad_scale: 16.0 2023-09-29 01:37:50,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 01:37:52,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:37:53,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:37:53,795 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=207733.33333333334, ans=0.125 2023-09-29 01:37:56,846 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:37:56,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 01:37:58,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:37:59,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 01:38:00,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:38:05,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:38:05,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:38:08,195 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.11 vs. limit=12.0 2023-09-29 01:38:08,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:38:12,808 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=207800.0, ans=0.125 2023-09-29 01:38:17,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 01:38:18,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:38:20,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:38:21,513 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.13 vs. limit=6.0 2023-09-29 01:38:22,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:38:22,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:38:24,350 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.min_positive, batch_count=207866.66666666666, ans=0.025 2023-09-29 01:38:28,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 01:38:28,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 01:38:28,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:38:31,836 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=207866.66666666666, ans=0.1 2023-09-29 01:38:34,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:38:34,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 01:38:36,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:38:37,056 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=207866.66666666666, ans=0.125 2023-09-29 01:38:43,969 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 01:38:44,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 01:38:48,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:38:50,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:38:50,612 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=207933.33333333334, ans=0.09899494936611666 2023-09-29 01:38:53,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:38:53,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 01:38:53,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:38:55,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 01:38:55,448 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:38:55,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:38:58,767 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:38:58,882 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:39:00,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:39:00,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 01:39:00,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 01:39:00,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 01:39:00,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:39:02,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:39:03,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:39:05,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:39:06,976 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=208000.0, ans=0.2 2023-09-29 01:39:13,056 INFO [train.py:1039] (2/4) Epoch 6, batch 4650, loss[loss=0.2337, simple_loss=0.2952, pruned_loss=0.08607, over 23185.00 frames. ], tot_loss[loss=0.2318, simple_loss=0.2943, pruned_loss=0.08461, over 4704452.47 frames. ], batch size: 119, lr: 1.65e-02, grad_scale: 16.0 2023-09-29 01:39:16,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:39:18,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:39:20,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:39:20,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:39:21,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:39:21,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:39:21,698 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:39:24,827 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=208066.66666666666, ans=0.0 2023-09-29 01:39:26,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 01:39:31,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:39:32,795 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 01:39:32,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:39:32,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 01:39:34,272 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:39:34,362 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 01:39:34,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 01:39:34,414 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:39:36,051 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:39:39,185 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 01:39:40,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:39:40,692 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 01:39:42,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:39:42,709 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.max_positive, batch_count=208133.33333333334, ans=0.95 2023-09-29 01:39:44,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 01:39:47,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:39:47,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:39:47,864 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 01:39:50,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:39:55,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:39:55,436 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=208200.0, ans=0.2 2023-09-29 01:39:58,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:40:00,623 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=208200.0, ans=0.2 2023-09-29 01:40:04,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:40:06,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:40:08,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:40:08,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:40:11,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 01:40:12,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 01:40:12,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 01:40:12,813 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 01:40:15,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:40:23,227 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 2.129e+02 2.354e+02 2.649e+02 3.887e+02, threshold=4.707e+02, percent-clipped=0.0 2023-09-29 01:40:23,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:40:23,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:40:23,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 01:40:25,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:40:27,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:40:27,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:40:29,510 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:40:32,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:40:32,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:40:34,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:40:36,313 INFO [train.py:1039] (2/4) Epoch 6, batch 4700, loss[loss=0.2563, simple_loss=0.3054, pruned_loss=0.1036, over 23317.00 frames. ], tot_loss[loss=0.2333, simple_loss=0.2958, pruned_loss=0.08544, over 4710100.58 frames. ], batch size: 119, lr: 1.65e-02, grad_scale: 16.0 2023-09-29 01:40:36,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:40:37,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:40:38,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 01:40:38,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 01:40:39,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 01:40:41,207 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 01:40:49,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:40:49,152 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:40:50,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:40:51,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:40:53,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 01:40:56,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 01:40:58,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 01:41:02,632 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:41:02,773 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:41:02,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:41:07,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:41:13,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 01:41:15,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 01:41:18,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:41:25,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 01:41:26,131 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:41:27,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:41:31,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 01:41:34,496 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:41:40,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:41:40,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 01:41:43,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:41:43,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:41:46,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:41:47,028 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:41:47,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 01:41:48,651 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 01:41:50,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:41:53,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:41:53,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:41:53,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 01:41:54,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:41:57,802 INFO [train.py:1039] (2/4) Epoch 6, batch 4750, loss[loss=0.2074, simple_loss=0.2824, pruned_loss=0.06616, over 24683.00 frames. ], tot_loss[loss=0.2338, simple_loss=0.2963, pruned_loss=0.08571, over 4715892.43 frames. ], batch size: 65, lr: 1.64e-02, grad_scale: 16.0 2023-09-29 01:41:58,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 01:41:59,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:42:01,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:42:04,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:42:06,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:42:07,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 01:42:07,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:42:11,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 01:42:13,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:42:15,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:42:15,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:42:19,614 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.84 vs. limit=15.0 2023-09-29 01:42:20,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 01:42:24,925 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:42:26,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 01:42:26,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:42:31,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:42:31,264 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:42:31,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:42:32,907 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 01:42:32,912 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 01:42:35,227 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.96 vs. limit=15.0 2023-09-29 01:42:37,719 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=208866.66666666666, ans=0.125 2023-09-29 01:42:39,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 01:42:40,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:42:42,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:42:46,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 01:42:46,596 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 01:42:46,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:42:50,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:42:54,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:42:55,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 01:42:55,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 01:42:57,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:42:57,203 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:42:57,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:42:57,552 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=208933.33333333334, ans=0.0 2023-09-29 01:42:58,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 01:42:58,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 01:43:00,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 01:43:03,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:43:06,627 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:43:06,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 01:43:06,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:43:09,479 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.730e+02 2.147e+02 2.364e+02 2.785e+02 5.281e+02, threshold=4.728e+02, percent-clipped=1.0 2023-09-29 01:43:09,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:43:09,823 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 01:43:11,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:43:11,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 01:43:15,902 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:43:15,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 01:43:16,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 01:43:18,271 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 01:43:18,386 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=209000.0, ans=0.0 2023-09-29 01:43:21,131 INFO [train.py:1039] (2/4) Epoch 6, batch 4800, loss[loss=0.2321, simple_loss=0.3089, pruned_loss=0.07768, over 24641.00 frames. ], tot_loss[loss=0.2356, simple_loss=0.2976, pruned_loss=0.08673, over 4706134.16 frames. ], batch size: 73, lr: 1.64e-02, grad_scale: 32.0 2023-09-29 01:43:21,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 01:43:21,624 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=209066.66666666666, ans=0.1 2023-09-29 01:43:23,300 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:43:23,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 01:43:30,147 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:43:30,221 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:43:31,180 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.57 vs. limit=22.5 2023-09-29 01:43:36,176 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:43:37,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:43:37,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:43:39,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 01:43:40,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:43:40,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:43:42,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:43:48,419 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:43:49,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:43:50,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:43:51,655 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=209200.0, ans=0.125 2023-09-29 01:43:52,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:43:52,917 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 01:43:52,948 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:43:53,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:43:53,270 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=209200.0, ans=0.125 2023-09-29 01:43:56,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:43:58,353 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=209200.0, ans=0.125 2023-09-29 01:43:58,479 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=209200.0, ans=0.04949747468305833 2023-09-29 01:43:59,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:44:03,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:44:03,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 01:44:04,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 01:44:04,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:44:05,154 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=209200.0, ans=0.0 2023-09-29 01:44:07,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 01:44:07,850 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 01:44:07,971 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:44:07,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:44:09,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:44:09,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:44:09,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:44:11,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:44:11,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:44:13,003 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=209266.66666666666, ans=0.125 2023-09-29 01:44:13,636 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.44 vs. limit=12.0 2023-09-29 01:44:14,414 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:44:14,595 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=209266.66666666666, ans=0.0 2023-09-29 01:44:16,061 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=209266.66666666666, ans=0.125 2023-09-29 01:44:16,118 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=209266.66666666666, ans=0.125 2023-09-29 01:44:17,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:44:18,914 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:44:23,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 01:44:25,220 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:44:25,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:44:25,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 01:44:25,497 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=209333.33333333334, ans=0.125 2023-09-29 01:44:27,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:44:30,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:44:32,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 01:44:32,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:44:32,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:44:34,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:44:34,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 01:44:37,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:44:39,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:44:39,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:44:40,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 01:44:42,326 INFO [train.py:1039] (2/4) Epoch 6, batch 4850, loss[loss=0.2008, simple_loss=0.2666, pruned_loss=0.06753, over 24651.00 frames. ], tot_loss[loss=0.2353, simple_loss=0.2975, pruned_loss=0.08653, over 4707610.51 frames. ], batch size: 60, lr: 1.64e-02, grad_scale: 32.0 2023-09-29 01:44:42,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 01:44:42,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:44:42,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:44:42,584 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:44:42,586 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:44:45,848 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:44:46,283 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=209400.0, ans=0.2 2023-09-29 01:44:52,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 01:44:53,520 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:44:57,543 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:44:59,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 01:44:59,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:45:02,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:45:05,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:45:06,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:45:06,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 01:45:07,391 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.72 vs. limit=15.0 2023-09-29 01:45:12,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:45:14,284 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:45:14,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 01:45:15,795 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 01:45:15,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 01:45:18,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:45:19,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:45:23,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:45:24,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 01:45:24,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 01:45:27,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 01:45:34,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:45:34,948 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 01:45:36,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:45:36,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:45:38,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 01:45:42,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 01:45:42,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:45:42,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 01:45:44,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:45:44,317 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:45:45,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 01:45:53,386 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.753e+02 2.040e+02 2.308e+02 2.752e+02 3.700e+02, threshold=4.617e+02, percent-clipped=0.0 2023-09-29 01:45:55,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:46:00,012 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:46:00,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:46:01,926 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=209666.66666666666, ans=0.125 2023-09-29 01:46:04,392 INFO [train.py:1039] (2/4) Epoch 6, batch 4900, loss[loss=0.2341, simple_loss=0.2724, pruned_loss=0.0979, over 22624.00 frames. ], tot_loss[loss=0.2346, simple_loss=0.2966, pruned_loss=0.08634, over 4704028.14 frames. ], batch size: 322, lr: 1.64e-02, grad_scale: 32.0 2023-09-29 01:46:06,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 01:46:06,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:46:11,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:46:13,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:46:13,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:46:17,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 01:46:23,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 01:46:26,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 01:46:28,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 01:46:28,274 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 01:46:29,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:46:29,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:46:29,816 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:46:29,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:46:29,938 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 01:46:33,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 01:46:33,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 01:46:34,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 01:46:36,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 01:46:36,520 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=209866.66666666666, ans=0.0 2023-09-29 01:46:39,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:46:39,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:46:42,197 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:46:42,221 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 01:46:44,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 01:46:44,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:46:44,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 01:46:44,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 01:46:45,044 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=209866.66666666666, ans=0.125 2023-09-29 01:46:51,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 01:46:53,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:46:54,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:46:54,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:46:55,637 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=13.85 vs. limit=15.0 2023-09-29 01:46:56,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:46:56,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 01:46:56,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:46:56,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 01:46:59,757 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:47:01,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 01:47:01,591 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=209933.33333333334, ans=0.125 2023-09-29 01:47:02,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:47:05,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 01:47:06,263 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=209933.33333333334, ans=0.0 2023-09-29 01:47:07,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:47:07,562 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 01:47:07,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 01:47:13,176 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.21 vs. limit=15.0 2023-09-29 01:47:14,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:47:15,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:47:18,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 01:47:18,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 01:47:18,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:47:19,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:47:22,507 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.20 vs. limit=12.0 2023-09-29 01:47:26,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:47:26,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:47:26,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:47:27,002 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 01:47:28,391 INFO [train.py:1039] (2/4) Epoch 6, batch 4950, loss[loss=0.231, simple_loss=0.3096, pruned_loss=0.07621, over 24282.00 frames. ], tot_loss[loss=0.2329, simple_loss=0.295, pruned_loss=0.08538, over 4715577.10 frames. ], batch size: 74, lr: 1.64e-02, grad_scale: 16.0 2023-09-29 01:47:28,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 01:47:31,737 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:47:31,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 01:47:36,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 01:47:36,286 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 01:47:37,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 01:47:37,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 01:47:37,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:47:39,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:47:39,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:47:39,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:47:41,182 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:47:42,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:47:44,262 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:47:44,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:47:46,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:47:46,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:47:48,398 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=210133.33333333334, ans=0.2 2023-09-29 01:47:51,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 01:47:57,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:47:59,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:48:02,101 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:48:03,503 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:48:03,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:48:05,170 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 01:48:05,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 01:48:08,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:48:10,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:48:10,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:48:11,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:48:11,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:48:13,194 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:48:14,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:48:17,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 01:48:20,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:48:24,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:48:24,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:48:25,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 01:48:25,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 01:48:26,302 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=210266.66666666666, ans=0.0 2023-09-29 01:48:27,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 01:48:31,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:48:32,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:48:32,874 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:48:35,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:48:35,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:48:35,412 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=210333.33333333334, ans=0.0 2023-09-29 01:48:36,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:48:38,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:48:38,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:48:39,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:48:41,011 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 2.165e+02 2.494e+02 2.935e+02 4.100e+02, threshold=4.988e+02, percent-clipped=0.0 2023-09-29 01:48:41,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 01:48:44,357 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:48:49,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 01:48:49,072 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 01:48:50,382 INFO [train.py:1039] (2/4) Epoch 6, batch 5000, loss[loss=0.2497, simple_loss=0.3039, pruned_loss=0.09772, over 23772.00 frames. ], tot_loss[loss=0.2319, simple_loss=0.2945, pruned_loss=0.08466, over 4722646.99 frames. ], batch size: 179, lr: 1.64e-02, grad_scale: 16.0 2023-09-29 01:48:57,169 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:48:57,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 01:48:58,779 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 01:49:00,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 01:49:00,555 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=210400.0, ans=0.125 2023-09-29 01:49:01,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:49:05,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 01:49:06,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:49:06,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 01:49:08,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 01:49:08,668 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:49:08,752 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:49:08,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 01:49:08,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:49:08,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:49:09,256 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=210466.66666666666, ans=0.1 2023-09-29 01:49:11,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 01:49:12,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 01:49:13,024 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=210466.66666666666, ans=0.125 2023-09-29 01:49:14,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:49:14,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 01:49:14,215 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 01:49:14,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:49:14,472 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=210466.66666666666, ans=0.1 2023-09-29 01:49:15,682 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 01:49:15,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 01:49:15,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 01:49:18,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 01:49:18,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:49:18,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:49:20,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 01:49:20,177 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 01:49:21,686 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:49:23,100 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:49:23,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 01:49:24,938 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 01:49:26,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:49:28,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:49:31,884 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 01:49:35,018 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:49:36,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:49:36,960 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:49:38,629 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=210600.0, ans=0.015 2023-09-29 01:49:41,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 01:49:41,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:49:41,989 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:49:42,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:49:45,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 01:49:45,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:49:47,588 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=210600.0, ans=0.125 2023-09-29 01:49:48,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:49:50,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:49:56,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 01:50:01,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:50:08,442 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=210666.66666666666, ans=0.05 2023-09-29 01:50:08,480 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=210666.66666666666, ans=0.125 2023-09-29 01:50:09,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:50:11,291 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:50:11,302 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 01:50:11,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:50:13,243 INFO [train.py:1039] (2/4) Epoch 6, batch 5050, loss[loss=0.2468, simple_loss=0.3165, pruned_loss=0.08852, over 23923.00 frames. ], tot_loss[loss=0.2315, simple_loss=0.2942, pruned_loss=0.08439, over 4716332.84 frames. ], batch size: 86, lr: 1.64e-02, grad_scale: 16.0 2023-09-29 01:50:13,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 01:50:13,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 01:50:13,433 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:50:18,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:50:18,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 01:50:20,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:50:21,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:50:21,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:50:23,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 01:50:23,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:50:23,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:50:26,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 01:50:27,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 01:50:29,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 01:50:34,733 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.61 vs. limit=15.0 2023-09-29 01:50:35,838 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=210800.0, ans=0.125 2023-09-29 01:50:40,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 01:50:42,032 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 01:50:42,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:50:42,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 01:50:42,275 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 01:50:43,078 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.19 vs. limit=15.0 2023-09-29 01:50:43,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:50:45,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:50:46,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:50:46,751 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 01:50:46,869 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 01:50:49,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:50:52,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:50:54,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:50:54,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 01:50:55,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:50:58,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 01:51:00,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 01:51:00,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:51:02,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:51:03,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:51:05,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:51:08,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:51:08,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:51:09,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:51:09,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:51:09,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 01:51:11,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:51:13,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 01:51:17,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:51:17,895 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 01:51:17,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 01:51:19,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:51:19,541 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:51:21,323 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 01:51:24,333 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.752e+02 2.235e+02 2.528e+02 3.059e+02 5.158e+02, threshold=5.056e+02, percent-clipped=2.0 2023-09-29 01:51:24,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:51:24,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 01:51:24,551 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:51:28,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:51:29,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:51:29,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 01:51:30,567 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.34 vs. limit=22.5 2023-09-29 01:51:31,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 01:51:32,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:51:32,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:51:33,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:51:34,396 INFO [train.py:1039] (2/4) Epoch 6, batch 5100, loss[loss=0.3463, simple_loss=0.3696, pruned_loss=0.1615, over 19224.00 frames. ], tot_loss[loss=0.2325, simple_loss=0.2955, pruned_loss=0.08476, over 4719368.88 frames. ], batch size: 388, lr: 1.64e-02, grad_scale: 16.0 2023-09-29 01:51:35,387 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.05 vs. limit=12.0 2023-09-29 01:51:36,181 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 01:51:39,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:51:43,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 01:51:43,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 01:51:44,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:51:46,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:51:50,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:51:50,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 01:51:50,216 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 01:51:54,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:51:56,401 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:52:00,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:52:04,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 01:52:05,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:52:06,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:52:06,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 01:52:09,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:52:09,879 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:52:09,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 01:52:11,384 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 01:52:12,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:52:12,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 01:52:14,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 01:52:16,057 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=211200.0, ans=0.125 2023-09-29 01:52:18,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:52:30,117 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:52:31,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 01:52:31,874 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 01:52:33,963 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 01:52:36,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 01:52:36,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:52:39,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 01:52:43,012 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 01:52:43,337 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=211333.33333333334, ans=0.125 2023-09-29 01:52:44,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 01:52:47,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:52:47,880 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=211333.33333333334, ans=0.2 2023-09-29 01:52:49,046 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 01:52:49,298 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 01:52:50,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 01:52:55,331 INFO [train.py:1039] (2/4) Epoch 6, batch 5150, loss[loss=0.2997, simple_loss=0.3317, pruned_loss=0.1339, over 19544.00 frames. ], tot_loss[loss=0.2338, simple_loss=0.297, pruned_loss=0.08529, over 4713837.87 frames. ], batch size: 388, lr: 1.63e-02, grad_scale: 16.0 2023-09-29 01:52:55,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:52:55,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:52:55,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:52:57,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:52:59,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 01:52:59,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:53:00,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 01:53:00,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 01:53:00,657 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 01:53:00,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:53:00,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 01:53:03,476 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:53:03,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 01:53:05,055 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:53:06,617 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:53:12,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 01:53:12,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 01:53:14,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:53:15,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 01:53:17,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 01:53:17,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:53:17,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:53:17,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:53:17,315 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:53:18,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 01:53:20,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:53:20,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:53:23,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 01:53:25,855 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 01:53:27,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:53:33,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:53:35,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 01:53:39,757 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:53:46,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:53:49,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:53:52,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:53:52,362 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:53:55,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 01:53:59,273 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:53:59,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:53:59,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:54:03,310 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=12.80 vs. limit=15.0 2023-09-29 01:54:04,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:54:05,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:54:05,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 01:54:07,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=211666.66666666666, ans=0.125 2023-09-29 01:54:08,369 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.647e+02 2.054e+02 2.300e+02 2.668e+02 5.365e+02, threshold=4.600e+02, percent-clipped=1.0 2023-09-29 01:54:11,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:54:13,152 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 01:54:14,656 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:54:14,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:54:14,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 01:54:16,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 01:54:16,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:54:16,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:54:17,260 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=211733.33333333334, ans=0.1 2023-09-29 01:54:18,892 INFO [train.py:1039] (2/4) Epoch 6, batch 5200, loss[loss=0.3368, simple_loss=0.3709, pruned_loss=0.1513, over 19588.00 frames. ], tot_loss[loss=0.234, simple_loss=0.2974, pruned_loss=0.08528, over 4708884.69 frames. ], batch size: 388, lr: 1.63e-02, grad_scale: 32.0 2023-09-29 01:54:22,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:54:24,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 01:54:27,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:54:28,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 01:54:30,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:54:30,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:54:32,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:54:34,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:54:34,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:54:37,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 01:54:40,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 01:54:41,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:54:43,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 01:54:46,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 01:54:46,817 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=211800.0, ans=0.0 2023-09-29 01:54:48,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:54:48,096 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 01:54:48,169 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 01:54:51,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 01:54:51,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:54:51,930 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 01:54:51,941 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:54:56,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:54:56,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:54:57,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 01:54:57,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:55:01,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:55:03,586 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 01:55:05,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 01:55:05,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 01:55:10,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 01:55:11,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 01:55:16,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 01:55:16,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:55:17,208 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.56 vs. limit=15.0 2023-09-29 01:55:18,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 01:55:19,428 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:55:19,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 01:55:19,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:55:19,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 01:55:24,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:55:24,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:55:30,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:55:32,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:55:32,104 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:55:33,996 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=212000.0, ans=0.125 2023-09-29 01:55:36,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:55:38,288 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 01:55:39,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:55:39,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:55:41,379 INFO [train.py:1039] (2/4) Epoch 6, batch 5250, loss[loss=0.2273, simple_loss=0.2897, pruned_loss=0.08249, over 19915.00 frames. ], tot_loss[loss=0.2333, simple_loss=0.2961, pruned_loss=0.08518, over 4713878.45 frames. ], batch size: 43, lr: 1.63e-02, grad_scale: 32.0 2023-09-29 01:55:41,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:55:41,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 01:55:41,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 01:55:45,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:55:48,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:55:48,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:55:49,877 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 01:55:56,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:55:57,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:56:00,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:56:01,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 01:56:05,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 01:56:05,118 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:56:06,735 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:56:21,416 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=14.42 vs. limit=22.5 2023-09-29 01:56:29,283 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=212266.66666666666, ans=0.125 2023-09-29 01:56:30,638 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=212266.66666666666, ans=0.0 2023-09-29 01:56:41,703 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=212333.33333333334, ans=0.125 2023-09-29 01:56:46,950 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.781e+02 2.177e+02 2.529e+02 3.121e+02 4.794e+02, threshold=5.058e+02, percent-clipped=1.0 2023-09-29 01:56:47,223 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=212333.33333333334, ans=0.125 2023-09-29 01:56:55,328 INFO [train.py:1039] (2/4) Epoch 6, batch 5300, loss[loss=0.2482, simple_loss=0.3205, pruned_loss=0.08795, over 24358.00 frames. ], tot_loss[loss=0.2329, simple_loss=0.2954, pruned_loss=0.0852, over 4708690.01 frames. ], batch size: 77, lr: 1.63e-02, grad_scale: 16.0 2023-09-29 01:57:11,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:57:11,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 01:57:11,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 01:57:11,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:57:11,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:57:11,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:57:11,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:57:11,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:57:11,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:57:12,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:57:12,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 01:57:12,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:57:12,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 01:57:12,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 01:57:12,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 01:57:13,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 01:57:13,177 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 01:57:13,304 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 01:57:13,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:57:14,397 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:57:14,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:57:14,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:57:14,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:57:15,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:57:15,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:57:15,324 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:57:15,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:57:15,505 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:57:15,512 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:57:15,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:57:15,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:57:16,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 01:57:16,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:57:17,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:57:17,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 01:57:17,117 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 01:57:17,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 01:57:17,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:57:17,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 01:57:17,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 01:57:18,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 01:57:18,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:57:19,112 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:57:19,272 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 01:57:19,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 01:57:19,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 01:57:19,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:57:19,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 01:57:19,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 01:57:19,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 01:57:20,123 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 01:57:30,919 INFO [train.py:1039] (2/4) Epoch 7, batch 0, loss[loss=0.2116, simple_loss=0.2818, pruned_loss=0.07068, over 24572.00 frames. ], tot_loss[loss=0.2116, simple_loss=0.2818, pruned_loss=0.07068, over 24572.00 frames. ], batch size: 60, lr: 1.53e-02, grad_scale: 32.0 2023-09-29 01:57:30,920 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-29 01:57:45,837 INFO [train.py:1071] (2/4) Epoch 7, validation: loss=0.2938, simple_loss=0.3001, pruned_loss=0.1437, over 1125622.00 frames. 2023-09-29 01:57:45,838 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21046MB 2023-09-29 01:57:47,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 01:57:48,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:57:51,802 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:57:56,555 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:57:56,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 01:57:58,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:57:58,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 01:58:00,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 01:58:03,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:58:03,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:58:05,172 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=212546.66666666666, ans=0.0 2023-09-29 01:58:07,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:58:07,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:58:09,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:58:09,366 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:58:10,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 01:58:12,438 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:58:22,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:58:22,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:58:24,420 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 01:58:27,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 01:58:29,058 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:58:29,521 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=212613.33333333334, ans=0.2 2023-09-29 01:58:30,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:58:35,967 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:58:40,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:58:46,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 01:58:51,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 01:58:51,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:58:51,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:58:51,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:58:53,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:58:54,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 01:58:56,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:58:57,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:59:01,581 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:59:04,546 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 01:59:05,956 INFO [train.py:1039] (2/4) Epoch 7, batch 50, loss[loss=0.2474, simple_loss=0.3226, pruned_loss=0.08615, over 24662.00 frames. ], tot_loss[loss=0.2315, simple_loss=0.2963, pruned_loss=0.08338, over 1073365.00 frames. ], batch size: 73, lr: 1.53e-02, grad_scale: 32.0 2023-09-29 01:59:06,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:59:09,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:59:10,125 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=212813.33333333334, ans=0.2 2023-09-29 01:59:11,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:59:11,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 01:59:11,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 01:59:12,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:59:15,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:59:16,108 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:59:17,784 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=212813.33333333334, ans=0.125 2023-09-29 01:59:20,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:59:22,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 01:59:22,796 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:59:31,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 01:59:31,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 01:59:32,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 01:59:36,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:59:36,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:59:38,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:59:38,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:59:39,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 01:59:39,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 01:59:39,884 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:59:48,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:59:49,891 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:59:49,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 01:59:51,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 01:59:54,355 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:59:54,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 01:59:54,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 01:59:55,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:59:58,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 02:00:00,151 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=213013.33333333334, ans=0.1 2023-09-29 02:00:01,124 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 2.205e+02 2.565e+02 2.922e+02 4.560e+02, threshold=5.129e+02, percent-clipped=0.0 2023-09-29 02:00:06,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:00:06,599 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:00:06,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:00:08,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:00:09,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:00:11,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 02:00:11,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 02:00:13,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:00:13,604 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:00:15,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:00:16,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:00:16,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 02:00:16,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 02:00:18,085 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 02:00:20,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:00:20,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 02:00:21,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 02:00:21,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 02:00:21,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:00:23,396 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 02:00:24,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 02:00:24,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:00:27,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:00:29,759 INFO [train.py:1039] (2/4) Epoch 7, batch 100, loss[loss=0.2499, simple_loss=0.3047, pruned_loss=0.09761, over 23424.00 frames. ], tot_loss[loss=0.2318, simple_loss=0.2962, pruned_loss=0.08376, over 1884010.25 frames. ], batch size: 285, lr: 1.53e-02, grad_scale: 32.0 2023-09-29 02:00:34,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:00:36,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:00:38,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 02:00:38,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:00:41,327 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:00:41,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:00:41,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 02:00:41,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:00:41,452 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:00:42,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 02:00:45,351 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=213213.33333333334, ans=0.1 2023-09-29 02:00:46,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 02:00:46,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:00:46,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:00:46,578 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:00:51,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 02:00:52,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:00:52,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:00:54,326 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 02:00:56,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 02:00:56,908 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=213213.33333333334, ans=0.125 2023-09-29 02:00:59,954 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=213213.33333333334, ans=0.125 2023-09-29 02:01:01,082 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 02:01:01,126 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 02:01:04,148 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:01:04,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:01:10,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:01:11,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:01:13,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:01:21,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:01:22,083 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 02:01:25,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 02:01:28,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:01:29,059 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=213346.66666666666, ans=0.125 2023-09-29 02:01:30,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:01:33,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:01:34,973 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:01:38,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:01:39,247 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.28 vs. limit=15.0 2023-09-29 02:01:40,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:01:43,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:01:45,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:01:47,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:01:47,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:01:47,453 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:01:47,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 02:01:47,571 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 02:01:47,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:01:49,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:01:49,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:01:49,240 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:01:50,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 02:01:50,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 02:01:50,829 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 02:01:50,842 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:01:50,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:01:52,438 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:01:53,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:01:54,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:01:57,176 INFO [train.py:1039] (2/4) Epoch 7, batch 150, loss[loss=0.2099, simple_loss=0.2758, pruned_loss=0.07202, over 24614.00 frames. ], tot_loss[loss=0.2345, simple_loss=0.2974, pruned_loss=0.08576, over 2496206.13 frames. ], batch size: 60, lr: 1.53e-02, grad_scale: 32.0 2023-09-29 02:01:57,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:01:58,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:01:58,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:02:00,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:02:02,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:02:04,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:02:07,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:02:07,238 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=213480.0, ans=0.125 2023-09-29 02:02:08,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:02:13,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 02:02:13,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 02:02:13,476 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 02:02:15,240 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:02:15,258 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 02:02:17,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:02:19,561 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:02:19,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:02:19,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:02:21,100 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:02:22,585 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-29 02:02:24,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:02:30,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:02:33,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 02:02:37,009 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 02:02:39,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:02:41,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:02:41,388 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:02:42,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 02:02:45,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:02:45,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:02:46,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:02:48,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 02:02:51,333 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.641e+02 2.019e+02 2.375e+02 2.708e+02 4.033e+02, threshold=4.751e+02, percent-clipped=0.0 2023-09-29 02:02:55,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:02:55,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:02:55,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:02:55,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 02:02:58,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:02:58,694 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=213680.0, ans=0.0 2023-09-29 02:02:59,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 02:03:01,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 02:03:02,313 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.05 vs. limit=12.0 2023-09-29 02:03:02,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:03:04,486 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:03:06,025 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:03:06,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 02:03:06,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:03:06,142 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 02:03:12,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:03:15,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:03:17,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:03:19,253 INFO [train.py:1039] (2/4) Epoch 7, batch 200, loss[loss=0.2237, simple_loss=0.2961, pruned_loss=0.07571, over 24452.00 frames. ], tot_loss[loss=0.2345, simple_loss=0.2976, pruned_loss=0.08566, over 2985326.24 frames. ], batch size: 63, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:03:20,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 02:03:22,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:03:23,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:03:24,105 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=213813.33333333334, ans=0.0 2023-09-29 02:03:24,448 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.57 vs. limit=22.5 2023-09-29 02:03:26,712 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 02:03:28,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 02:03:31,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:03:31,295 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=213813.33333333334, ans=0.125 2023-09-29 02:03:32,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:03:34,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:03:34,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:03:34,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:03:38,175 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.98 vs. limit=12.0 2023-09-29 02:03:53,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:03:53,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:03:55,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:03:55,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:03:57,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 02:03:57,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 02:03:58,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:04:00,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 02:04:02,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:04:02,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:04:03,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 02:04:03,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 02:04:05,921 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:04:07,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:04:11,084 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=214013.33333333334, ans=10.0 2023-09-29 02:04:15,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:04:17,765 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.66 vs. limit=10.0 2023-09-29 02:04:22,023 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:04:22,247 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=214013.33333333334, ans=0.1 2023-09-29 02:04:23,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:04:30,381 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:04:30,556 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=214080.0, ans=0.1 2023-09-29 02:04:33,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 02:04:34,018 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:04:34,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:04:35,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:04:35,731 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=214080.0, ans=0.125 2023-09-29 02:04:37,009 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:04:37,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 02:04:37,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:04:38,698 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 02:04:40,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:04:42,340 INFO [train.py:1039] (2/4) Epoch 7, batch 250, loss[loss=0.2578, simple_loss=0.3239, pruned_loss=0.0959, over 23730.00 frames. ], tot_loss[loss=0.2327, simple_loss=0.2961, pruned_loss=0.08465, over 3369840.32 frames. ], batch size: 85, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:04:42,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:04:43,903 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:04:43,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:04:45,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:04:45,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:04:48,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:04:51,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:04:57,827 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.05 vs. limit=22.5 2023-09-29 02:05:00,422 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=214213.33333333334, ans=0.1 2023-09-29 02:05:03,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:05:04,463 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=214213.33333333334, ans=0.125 2023-09-29 02:05:06,099 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:05:06,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:05:11,139 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.89 vs. limit=22.5 2023-09-29 02:05:15,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 02:05:16,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 02:05:18,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:05:18,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:05:19,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 02:05:19,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:05:20,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:05:20,267 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=214280.0, ans=0.0 2023-09-29 02:05:23,184 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:05:24,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 02:05:24,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:05:27,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:05:27,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:05:27,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:05:29,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:05:29,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 02:05:29,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:05:32,935 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:05:34,534 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:05:34,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:05:35,828 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.653e+02 2.070e+02 2.355e+02 2.714e+02 5.110e+02, threshold=4.709e+02, percent-clipped=2.0 2023-09-29 02:05:39,740 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:05:39,871 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=214346.66666666666, ans=0.125 2023-09-29 02:05:46,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:05:49,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:05:54,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:05:56,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:05:59,362 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 02:05:59,522 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:05:59,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:06:02,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 02:06:02,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 02:06:04,128 INFO [train.py:1039] (2/4) Epoch 7, batch 300, loss[loss=0.2149, simple_loss=0.2482, pruned_loss=0.09081, over 19298.00 frames. ], tot_loss[loss=0.2307, simple_loss=0.294, pruned_loss=0.08375, over 3661707.14 frames. ], batch size: 388, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:06:04,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:06:04,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 02:06:09,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:06:10,827 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:06:12,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:06:14,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 02:06:16,352 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:06:18,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 02:06:18,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 02:06:18,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:06:23,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 02:06:23,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=214546.66666666666, ans=0.1 2023-09-29 02:06:25,801 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=214546.66666666666, ans=0.0 2023-09-29 02:06:27,000 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 02:06:27,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 02:06:30,364 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 02:06:30,438 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:06:30,654 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=214546.66666666666, ans=0.125 2023-09-29 02:06:33,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:06:34,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:06:34,909 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 02:06:34,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 02:06:39,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:06:41,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:06:41,453 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:06:46,184 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 02:06:46,191 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 02:06:48,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:06:48,577 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=214613.33333333334, ans=0.125 2023-09-29 02:06:50,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:06:52,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 02:06:53,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:06:57,072 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:06:59,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:06:59,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 02:07:00,142 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=14.67 vs. limit=15.0 2023-09-29 02:07:04,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:07:04,162 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 02:07:07,101 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:07:10,132 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:07:10,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 02:07:10,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 02:07:11,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:07:11,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 02:07:15,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:07:17,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:07:18,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:07:18,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:07:18,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:07:25,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:07:25,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 02:07:27,659 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:07:29,107 INFO [train.py:1039] (2/4) Epoch 7, batch 350, loss[loss=0.2181, simple_loss=0.3014, pruned_loss=0.06742, over 24308.00 frames. ], tot_loss[loss=0.229, simple_loss=0.2927, pruned_loss=0.08267, over 3884910.21 frames. ], batch size: 74, lr: 1.52e-02, grad_scale: 16.0 2023-09-29 02:07:34,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:07:39,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:07:39,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:07:40,012 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.57 vs. limit=10.0 2023-09-29 02:07:40,746 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 02:07:43,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:07:43,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 02:07:45,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:07:46,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 02:07:47,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:07:48,731 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=214880.0, ans=0.0 2023-09-29 02:07:50,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 02:07:52,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:07:54,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:07:57,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:07:58,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:07:58,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:07:58,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:07:58,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:07:58,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 02:08:02,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:08:02,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:08:10,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:08:10,744 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:08:12,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:08:12,220 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:08:18,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 02:08:18,281 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:08:24,061 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.640e+02 2.070e+02 2.347e+02 2.700e+02 4.079e+02, threshold=4.694e+02, percent-clipped=0.0 2023-09-29 02:08:24,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:08:24,301 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:08:24,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:08:27,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 02:08:29,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:08:30,834 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 02:08:32,348 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 02:08:32,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:08:35,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:08:35,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 02:08:38,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:08:41,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:08:41,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:08:42,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:08:42,846 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:08:44,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:08:48,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:08:51,137 INFO [train.py:1039] (2/4) Epoch 7, batch 400, loss[loss=0.2352, simple_loss=0.2967, pruned_loss=0.08686, over 23840.00 frames. ], tot_loss[loss=0.2279, simple_loss=0.2917, pruned_loss=0.08207, over 4075068.65 frames. ], batch size: 195, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:08:51,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:08:51,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 02:08:51,394 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:08:52,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:08:54,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:08:54,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:08:57,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:08:57,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:08:59,978 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=215146.66666666666, ans=0.09899494936611666 2023-09-29 02:09:01,118 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 02:09:02,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 02:09:02,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:09:04,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 02:09:04,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:09:10,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:09:10,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:09:10,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 02:09:11,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:09:12,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:09:12,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:09:13,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:09:14,880 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:09:14,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=215213.33333333334, ans=0.125 2023-09-29 02:09:16,798 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 02:09:16,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 02:09:21,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:09:21,438 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=215213.33333333334, ans=0.125 2023-09-29 02:09:22,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:09:24,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 02:09:25,803 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 02:09:27,563 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=215280.0, ans=0.07 2023-09-29 02:09:28,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:09:30,563 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=215280.0, ans=0.0 2023-09-29 02:09:31,791 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:09:38,558 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 02:09:40,409 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=215346.66666666666, ans=0.125 2023-09-29 02:09:40,488 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:09:41,662 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 02:09:41,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 02:09:42,593 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.42 vs. limit=10.0 2023-09-29 02:09:45,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:09:47,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:09:47,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 02:09:50,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:09:53,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 02:09:54,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:09:56,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:09:56,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 02:09:59,471 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 02:09:59,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 02:10:02,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 02:10:02,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:10:04,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 02:10:06,572 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=215413.33333333334, ans=0.125 2023-09-29 02:10:07,156 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.83 vs. limit=15.0 2023-09-29 02:10:07,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 02:10:07,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:10:07,895 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 02:10:09,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 02:10:09,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:10:11,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:10:11,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:10:11,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 02:10:12,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:10:14,719 INFO [train.py:1039] (2/4) Epoch 7, batch 450, loss[loss=0.2012, simple_loss=0.2703, pruned_loss=0.06603, over 24343.00 frames. ], tot_loss[loss=0.2284, simple_loss=0.2924, pruned_loss=0.0822, over 4230403.89 frames. ], batch size: 56, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:10:14,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 02:10:16,671 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=215480.0, ans=0.1 2023-09-29 02:10:18,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 02:10:23,450 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.97 vs. limit=22.5 2023-09-29 02:10:29,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:10:29,515 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:10:29,905 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=215546.66666666666, ans=0.5 2023-09-29 02:10:32,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 02:10:32,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 02:10:37,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 02:10:38,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:10:40,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:10:44,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:10:44,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=215546.66666666666, ans=0.125 2023-09-29 02:10:45,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:10:48,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 02:10:48,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 02:10:50,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 02:10:52,184 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:10:52,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:10:54,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:10:57,301 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 02:10:57,325 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 02:10:57,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:10:58,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:11:00,338 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 02:11:04,129 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 02:11:04,189 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:11:04,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 02:11:04,535 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=215680.0, ans=0.05 2023-09-29 02:11:04,622 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=215680.0, ans=0.1 2023-09-29 02:11:05,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 02:11:07,235 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=215680.0, ans=0.125 2023-09-29 02:11:07,309 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=215680.0, ans=0.0 2023-09-29 02:11:08,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:11:10,022 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.614e+02 1.891e+02 2.155e+02 2.361e+02 4.169e+02, threshold=4.311e+02, percent-clipped=0.0 2023-09-29 02:11:10,194 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 02:11:10,239 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 02:11:11,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 02:11:16,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:11:18,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 02:11:18,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 02:11:20,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:11:24,856 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=215746.66666666666, ans=0.125 2023-09-29 02:11:26,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:11:26,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:11:30,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:11:30,311 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 02:11:33,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:11:35,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:11:35,720 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:11:35,735 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 02:11:36,813 INFO [train.py:1039] (2/4) Epoch 7, batch 500, loss[loss=0.2398, simple_loss=0.2949, pruned_loss=0.09234, over 23632.00 frames. ], tot_loss[loss=0.2292, simple_loss=0.2926, pruned_loss=0.08293, over 4336158.24 frames. ], batch size: 232, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:11:37,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 02:11:37,036 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:11:41,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 02:11:44,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 02:11:46,301 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 02:11:47,974 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:11:48,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:11:49,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:12:01,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:12:01,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 02:12:03,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 02:12:03,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:12:03,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 02:12:05,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 02:12:08,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:12:09,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:12:11,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:12:11,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:12:13,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 02:12:14,905 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 02:12:17,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:12:19,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:12:21,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:12:21,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:12:22,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 02:12:24,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 02:12:28,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:12:29,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:12:34,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:12:38,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:12:42,893 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten.whitening_limit, batch_count=216080.0, ans=22.5 2023-09-29 02:12:45,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:12:50,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 02:12:50,178 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:12:50,197 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:12:51,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 02:12:53,447 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 02:12:53,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:12:58,361 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=216146.66666666666, ans=0.1 2023-09-29 02:12:58,415 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=216146.66666666666, ans=0.125 2023-09-29 02:12:59,425 INFO [train.py:1039] (2/4) Epoch 7, batch 550, loss[loss=0.2291, simple_loss=0.2974, pruned_loss=0.08037, over 23515.00 frames. ], tot_loss[loss=0.2304, simple_loss=0.2934, pruned_loss=0.08372, over 4422839.64 frames. ], batch size: 93, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:12:59,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 02:13:02,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 02:13:02,746 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:13:02,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 02:13:02,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:13:02,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:13:04,329 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:13:04,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:13:04,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:13:05,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:13:08,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:13:09,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 02:13:09,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:13:15,145 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.78 vs. limit=15.0 2023-09-29 02:13:16,776 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=216213.33333333334, ans=0.0 2023-09-29 02:13:17,987 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:13:18,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:13:18,396 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=216213.33333333334, ans=0.125 2023-09-29 02:13:19,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:13:21,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:13:26,517 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=216213.33333333334, ans=0.0 2023-09-29 02:13:27,660 WARNING [train.py:1197] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 02:13:27,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=216213.33333333334, ans=0.0 2023-09-29 02:13:29,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 02:13:30,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:13:35,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:13:35,397 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:13:35,633 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=216280.0, ans=0.2 2023-09-29 02:13:36,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 02:13:40,109 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:13:40,118 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 02:13:40,189 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=216280.0, ans=0.015 2023-09-29 02:13:42,930 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:13:43,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 02:13:45,362 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:13:47,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 02:13:47,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 02:13:49,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:13:49,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 02:13:51,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 02:13:51,728 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=216346.66666666666, ans=0.0 2023-09-29 02:13:52,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:13:52,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:13:54,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:13:54,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:13:55,675 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.742e+02 2.132e+02 2.384e+02 2.779e+02 4.607e+02, threshold=4.767e+02, percent-clipped=1.0 2023-09-29 02:13:56,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:13:59,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:13:59,588 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=216346.66666666666, ans=0.125 2023-09-29 02:14:01,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:14:02,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:14:02,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 02:14:04,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 02:14:05,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:14:05,953 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 02:14:07,513 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:14:08,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 02:14:09,001 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 02:14:13,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 02:14:19,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 02:14:22,410 INFO [train.py:1039] (2/4) Epoch 7, batch 600, loss[loss=0.2058, simple_loss=0.2696, pruned_loss=0.07101, over 24430.00 frames. ], tot_loss[loss=0.2313, simple_loss=0.2942, pruned_loss=0.08421, over 4492358.03 frames. ], batch size: 58, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:14:22,483 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:14:22,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 02:14:22,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:14:28,352 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=216480.0, ans=0.125 2023-09-29 02:14:31,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:14:34,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 02:14:34,422 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 02:14:37,429 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:14:37,629 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=216546.66666666666, ans=0.5 2023-09-29 02:14:39,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:14:42,076 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:14:45,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 02:14:45,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:14:45,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=216546.66666666666, ans=0.0 2023-09-29 02:14:45,453 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=216546.66666666666, ans=0.0 2023-09-29 02:14:50,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 02:14:55,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:14:55,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:14:56,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:15:01,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:15:01,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:15:03,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:15:04,274 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=216613.33333333334, ans=0.0 2023-09-29 02:15:12,987 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 02:15:14,859 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:15:16,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:15:16,260 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:15:22,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 02:15:27,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 02:15:27,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:15:33,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 02:15:33,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:15:37,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 02:15:38,020 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:15:38,190 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=216746.66666666666, ans=0.125 2023-09-29 02:15:39,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:15:44,077 INFO [train.py:1039] (2/4) Epoch 7, batch 650, loss[loss=0.2068, simple_loss=0.2747, pruned_loss=0.06951, over 24269.00 frames. ], tot_loss[loss=0.2305, simple_loss=0.2932, pruned_loss=0.0839, over 4529785.01 frames. ], batch size: 61, lr: 1.51e-02, grad_scale: 32.0 2023-09-29 02:15:45,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 02:15:47,187 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 02:15:48,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:15:51,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:15:51,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:15:54,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 02:15:56,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:16:01,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:16:01,559 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:16:04,668 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:16:07,680 WARNING [train.py:1197] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 02:16:11,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:16:11,912 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:16:16,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:16:16,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 02:16:18,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:16:19,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:16:21,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 02:16:21,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:16:22,636 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 02:16:23,527 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.31 vs. limit=15.0 2023-09-29 02:16:24,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 02:16:25,701 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 02:16:25,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:16:25,739 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:16:27,986 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.53 vs. limit=6.0 2023-09-29 02:16:28,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:16:30,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:16:30,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:16:30,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:16:34,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 02:16:34,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:16:34,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:16:36,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 02:16:37,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:16:38,838 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 2.233e+02 2.449e+02 2.795e+02 3.907e+02, threshold=4.898e+02, percent-clipped=0.0 2023-09-29 02:16:38,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 02:16:40,524 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 02:16:40,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 02:16:42,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:16:42,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:16:42,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:16:42,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:16:44,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:16:51,253 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:16:51,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:16:52,798 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:16:54,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:16:54,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 02:16:56,057 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:17:02,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 02:17:02,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:17:03,694 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:17:03,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:17:05,227 INFO [train.py:1039] (2/4) Epoch 7, batch 700, loss[loss=0.2131, simple_loss=0.2763, pruned_loss=0.07495, over 24424.00 frames. ], tot_loss[loss=0.229, simple_loss=0.2915, pruned_loss=0.08323, over 4570360.52 frames. ], batch size: 58, lr: 1.51e-02, grad_scale: 32.0 2023-09-29 02:17:10,852 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 02:17:10,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 02:17:14,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 02:17:15,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:17:18,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:17:21,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 02:17:24,469 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:17:27,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:17:28,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:17:30,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 02:17:31,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:17:34,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:17:37,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 02:17:37,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:17:40,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 02:17:45,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 02:17:48,448 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 02:17:49,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:17:50,704 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=11.32 vs. limit=15.0 2023-09-29 02:17:52,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:17:56,195 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=217346.66666666666, ans=0.1 2023-09-29 02:17:56,384 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=18.07 vs. limit=15.0 2023-09-29 02:17:57,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:17:57,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 02:18:01,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:18:02,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 02:18:02,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 02:18:06,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:18:08,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:18:11,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:18:14,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:18:14,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 02:18:18,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 02:18:20,263 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 02:18:23,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:18:25,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:18:25,533 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:18:29,130 INFO [train.py:1039] (2/4) Epoch 7, batch 750, loss[loss=0.2083, simple_loss=0.2728, pruned_loss=0.07186, over 24432.00 frames. ], tot_loss[loss=0.2277, simple_loss=0.2907, pruned_loss=0.08233, over 4601950.48 frames. ], batch size: 58, lr: 1.51e-02, grad_scale: 32.0 2023-09-29 02:18:29,196 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:18:29,206 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 02:18:29,552 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=217480.0, ans=0.125 2023-09-29 02:18:32,633 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:18:33,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 02:18:33,641 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 02:18:33,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 02:18:35,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 02:18:35,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 02:18:35,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:18:38,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 02:18:38,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:18:39,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 02:18:41,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:18:42,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:18:44,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 02:18:44,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:18:46,289 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=217546.66666666666, ans=0.0 2023-09-29 02:18:48,128 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.52 vs. limit=15.0 2023-09-29 02:18:48,857 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:18:50,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 02:18:52,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:18:54,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:18:56,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:18:56,375 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 02:18:57,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 02:18:58,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:19:00,189 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:19:01,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 02:19:03,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 02:19:03,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:19:05,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 02:19:05,562 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 02:19:05,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 02:19:05,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:19:07,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 02:19:07,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:19:10,911 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.16 vs. limit=22.5 2023-09-29 02:19:14,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 02:19:14,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:19:14,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 02:19:17,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:19:19,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:19:19,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 02:19:21,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:19:23,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 02:19:23,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:19:24,699 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.977e+02 2.216e+02 2.498e+02 3.508e+02, threshold=4.433e+02, percent-clipped=0.0 2023-09-29 02:19:24,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:19:26,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 02:19:28,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:19:29,092 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.14 vs. limit=15.0 2023-09-29 02:19:32,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:19:33,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 02:19:34,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:19:35,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 02:19:40,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 02:19:41,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:19:43,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:19:45,182 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:19:46,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:19:48,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:19:48,252 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 02:19:51,113 INFO [train.py:1039] (2/4) Epoch 7, batch 800, loss[loss=0.2475, simple_loss=0.3021, pruned_loss=0.09638, over 22900.00 frames. ], tot_loss[loss=0.2294, simple_loss=0.2924, pruned_loss=0.08319, over 4613843.45 frames. ], batch size: 322, lr: 1.51e-02, grad_scale: 32.0 2023-09-29 02:19:51,555 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=217813.33333333334, ans=0.125 2023-09-29 02:19:56,069 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=217813.33333333334, ans=0.0 2023-09-29 02:19:57,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:19:57,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:19:59,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:19:59,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:20:00,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:20:00,923 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:20:04,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:20:06,753 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=217880.0, ans=0.0 2023-09-29 02:20:08,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:20:09,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 02:20:12,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 02:20:12,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:20:14,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:20:14,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:20:15,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:20:15,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 02:20:15,792 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:20:16,002 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:20:17,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 02:20:19,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:20:22,011 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:20:23,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:20:23,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:20:25,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:20:25,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:20:28,490 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=217946.66666666666, ans=0.2 2023-09-29 02:20:30,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:20:31,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:20:31,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 02:20:33,983 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 02:20:34,032 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 02:20:35,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 02:20:35,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:20:38,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:20:39,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:20:44,425 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 02:20:44,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 02:20:44,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 02:20:44,829 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=218013.33333333334, ans=0.0 2023-09-29 02:20:47,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:20:51,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:20:55,114 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:20:56,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 02:20:56,718 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:20:59,240 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.35 vs. limit=15.0 2023-09-29 02:21:01,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 02:21:07,263 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.60 vs. limit=12.0 2023-09-29 02:21:07,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 02:21:10,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:21:11,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 02:21:11,235 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=218146.66666666666, ans=0.0 2023-09-29 02:21:12,924 INFO [train.py:1039] (2/4) Epoch 7, batch 850, loss[loss=0.2245, simple_loss=0.2833, pruned_loss=0.08281, over 23422.00 frames. ], tot_loss[loss=0.2299, simple_loss=0.2929, pruned_loss=0.08344, over 4631553.95 frames. ], batch size: 119, lr: 1.51e-02, grad_scale: 32.0 2023-09-29 02:21:13,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:21:15,033 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:21:15,444 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=218146.66666666666, ans=0.0 2023-09-29 02:21:16,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 02:21:16,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:21:18,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:21:18,446 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=218146.66666666666, ans=0.125 2023-09-29 02:21:19,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:21:19,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 02:21:20,152 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=218146.66666666666, ans=0.2 2023-09-29 02:21:21,464 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:21:22,337 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.34 vs. limit=10.0 2023-09-29 02:21:23,003 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 02:21:23,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 02:21:23,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 02:21:24,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 02:21:26,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:21:29,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:21:29,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:21:29,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 02:21:32,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:21:33,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:21:33,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 02:21:38,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 02:21:40,046 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:21:42,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 02:21:45,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 02:21:47,346 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 02:21:49,910 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=218280.0, ans=0.0 2023-09-29 02:21:51,178 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 02:21:51,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:21:51,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:21:51,231 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 02:21:54,172 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:21:55,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:21:57,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 02:22:00,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:22:00,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:22:01,768 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:22:01,802 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 02:22:02,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:22:02,314 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=218346.66666666666, ans=0.125 2023-09-29 02:22:03,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 02:22:05,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 02:22:09,491 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 2.250e+02 2.603e+02 3.078e+02 4.971e+02, threshold=5.207e+02, percent-clipped=2.0 2023-09-29 02:22:09,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:22:09,679 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:22:11,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:22:11,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:22:11,386 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=218346.66666666666, ans=0.05 2023-09-29 02:22:12,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:22:16,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:22:17,566 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=218413.33333333334, ans=0.1 2023-09-29 02:22:18,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:22:20,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 02:22:20,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:22:20,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 02:22:28,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 02:22:28,372 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=218413.33333333334, ans=0.125 2023-09-29 02:22:28,789 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.29 vs. limit=15.0 2023-09-29 02:22:29,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:22:29,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 02:22:31,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:22:31,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:22:32,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 02:22:35,496 INFO [train.py:1039] (2/4) Epoch 7, batch 900, loss[loss=0.203, simple_loss=0.2738, pruned_loss=0.06611, over 24640.00 frames. ], tot_loss[loss=0.2321, simple_loss=0.2945, pruned_loss=0.08488, over 4642622.48 frames. ], batch size: 65, lr: 1.51e-02, grad_scale: 16.0 2023-09-29 02:22:37,265 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:22:39,585 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.63 vs. limit=22.5 2023-09-29 02:22:41,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:22:41,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 02:22:43,249 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:22:44,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:22:45,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 02:22:48,020 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 02:22:50,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:22:50,118 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:22:50,200 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 02:22:51,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:22:55,641 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=218546.66666666666, ans=0.125 2023-09-29 02:23:01,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:23:01,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:23:01,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 02:23:04,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:23:10,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 02:23:12,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:23:16,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 02:23:18,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 02:23:18,520 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 02:23:18,648 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 02:23:26,107 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 02:23:26,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:23:26,290 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=218680.0, ans=0.125 2023-09-29 02:23:28,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 02:23:35,536 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:23:35,554 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:23:37,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 02:23:37,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:23:40,961 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 02:23:43,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:23:43,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:23:45,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:23:45,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:23:46,360 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=6.26 vs. limit=10.0 2023-09-29 02:23:48,708 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 02:23:50,132 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 02:23:50,493 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=218746.66666666666, ans=0.125 2023-09-29 02:23:51,653 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 02:23:51,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 02:23:53,403 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:23:56,775 INFO [train.py:1039] (2/4) Epoch 7, batch 950, loss[loss=0.233, simple_loss=0.2843, pruned_loss=0.09089, over 23672.00 frames. ], tot_loss[loss=0.2321, simple_loss=0.2943, pruned_loss=0.08497, over 4648703.82 frames. ], batch size: 232, lr: 1.51e-02, grad_scale: 16.0 2023-09-29 02:23:58,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 02:24:03,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:24:05,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:24:05,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:24:06,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 02:24:08,431 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 02:24:12,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:24:13,552 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:24:14,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:24:15,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:24:15,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 02:24:16,507 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 02:24:18,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:24:19,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 02:24:19,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:24:24,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:24:24,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:24:24,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:24:25,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 02:24:27,557 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 02:24:31,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:24:32,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:24:39,206 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:24:39,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:24:39,965 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.67 vs. limit=15.0 2023-09-29 02:24:41,087 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 02:24:43,078 WARNING [train.py:1197] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 02:24:43,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 02:24:44,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:24:44,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:24:44,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:24:50,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 02:24:50,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:24:53,263 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 2.008e+02 2.208e+02 2.603e+02 6.954e+02, threshold=4.417e+02, percent-clipped=1.0 2023-09-29 02:24:54,832 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:24:54,922 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:24:54,948 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 02:24:54,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:24:54,974 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 02:24:56,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 02:24:59,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:25:01,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:25:08,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:25:08,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 02:25:08,954 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.42 vs. limit=15.0 2023-09-29 02:25:09,883 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 02:25:11,854 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:25:13,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:25:15,378 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=219080.0, ans=0.0 2023-09-29 02:25:17,791 INFO [train.py:1039] (2/4) Epoch 7, batch 1000, loss[loss=0.2014, simple_loss=0.2332, pruned_loss=0.08477, over 19291.00 frames. ], tot_loss[loss=0.2295, simple_loss=0.292, pruned_loss=0.08355, over 4672407.63 frames. ], batch size: 388, lr: 1.51e-02, grad_scale: 16.0 2023-09-29 02:25:19,415 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 02:25:19,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:25:22,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:25:25,770 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 02:25:25,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 02:25:26,543 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.59 vs. limit=15.0 2023-09-29 02:25:30,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:25:30,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:25:32,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:25:35,118 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 02:25:39,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 02:25:41,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 02:25:43,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:25:44,948 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 02:25:46,486 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 02:25:46,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 02:25:48,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:25:49,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:25:57,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:25:58,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:25:59,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:25:59,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:25:59,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 02:25:59,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:26:01,030 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:26:01,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:26:02,565 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 02:26:06,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 02:26:06,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 02:26:07,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 02:26:11,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:26:17,381 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=219346.66666666666, ans=0.0 2023-09-29 02:26:18,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:26:18,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:26:20,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:26:21,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:26:21,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 02:26:23,599 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=219413.33333333334, ans=0.05 2023-09-29 02:26:24,713 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:26:24,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 02:26:24,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 02:26:27,015 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:26:27,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:26:28,716 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=219413.33333333334, ans=0.125 2023-09-29 02:26:29,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:26:32,151 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=11.06 vs. limit=15.0 2023-09-29 02:26:34,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 02:26:37,235 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:26:40,868 INFO [train.py:1039] (2/4) Epoch 7, batch 1050, loss[loss=0.2284, simple_loss=0.2939, pruned_loss=0.08142, over 23308.00 frames. ], tot_loss[loss=0.2278, simple_loss=0.29, pruned_loss=0.08275, over 4665362.51 frames. ], batch size: 105, lr: 1.51e-02, grad_scale: 16.0 2023-09-29 02:26:40,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:26:41,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:26:42,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 02:26:44,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:26:46,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:26:49,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 02:26:50,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 02:26:53,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:26:54,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:26:54,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:26:56,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:26:56,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 02:26:57,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:26:57,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 02:26:58,311 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=219546.66666666666, ans=0.125 2023-09-29 02:27:00,926 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:27:00,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 02:27:02,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 02:27:07,843 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=219546.66666666666, ans=0.125 2023-09-29 02:27:09,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:27:10,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 02:27:10,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:27:14,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 02:27:14,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 02:27:15,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:27:19,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 02:27:21,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 02:27:22,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:27:25,185 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.59 vs. limit=15.0 2023-09-29 02:27:26,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 02:27:28,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 02:27:29,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:27:31,252 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:27:34,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:27:34,566 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=219680.0, ans=0.125 2023-09-29 02:27:37,458 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 02:27:39,290 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.682e+02 2.034e+02 2.317e+02 2.689e+02 3.658e+02, threshold=4.634e+02, percent-clipped=0.0 2023-09-29 02:27:39,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 02:27:39,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 02:27:39,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:27:40,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:27:42,544 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 02:27:47,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:27:49,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:27:49,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:27:49,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:27:49,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:27:53,927 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=219746.66666666666, ans=0.125 2023-09-29 02:27:55,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:27:55,206 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 02:27:56,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:27:56,847 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 02:27:56,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 02:27:58,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:27:59,320 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=219746.66666666666, ans=0.95 2023-09-29 02:28:00,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:28:03,714 INFO [train.py:1039] (2/4) Epoch 7, batch 1100, loss[loss=0.2263, simple_loss=0.3048, pruned_loss=0.07392, over 24344.00 frames. ], tot_loss[loss=0.2272, simple_loss=0.2899, pruned_loss=0.08225, over 4682683.49 frames. ], batch size: 74, lr: 1.50e-02, grad_scale: 16.0 2023-09-29 02:28:04,096 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=219813.33333333334, ans=0.0 2023-09-29 02:28:04,709 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.39 vs. limit=15.0 2023-09-29 02:28:05,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:28:07,218 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=219813.33333333334, ans=0.0 2023-09-29 02:28:10,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 02:28:10,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:28:11,694 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:28:11,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 02:28:15,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:28:17,721 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.38 vs. limit=15.0 2023-09-29 02:28:18,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 02:28:19,308 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=219880.0, ans=0.125 2023-09-29 02:28:20,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:28:25,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 02:28:25,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 02:28:25,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 02:28:27,336 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:28:27,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:28:32,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:28:33,894 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 02:28:40,020 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:28:43,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 02:28:44,514 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 02:28:44,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:28:47,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:28:49,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 02:28:49,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:28:50,290 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=6.14 vs. limit=10.0 2023-09-29 02:28:51,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 02:28:51,821 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.24 vs. limit=15.0 2023-09-29 02:28:52,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:28:52,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:28:52,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:28:53,092 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=220013.33333333334, ans=0.125 2023-09-29 02:28:54,345 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:28:54,585 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=220013.33333333334, ans=0.1 2023-09-29 02:28:55,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 02:28:56,148 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=220013.33333333334, ans=0.2 2023-09-29 02:29:00,386 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:29:01,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 02:29:03,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:29:06,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 02:29:10,239 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 02:29:10,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 02:29:11,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:29:12,043 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=220080.0, ans=0.0 2023-09-29 02:29:13,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:29:14,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:29:16,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 02:29:16,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:29:16,475 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:29:18,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 02:29:18,183 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:29:19,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 02:29:21,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:29:21,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:29:22,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 02:29:26,263 INFO [train.py:1039] (2/4) Epoch 7, batch 1150, loss[loss=0.3529, simple_loss=0.3706, pruned_loss=0.1676, over 19663.00 frames. ], tot_loss[loss=0.2281, simple_loss=0.2909, pruned_loss=0.08268, over 4695497.22 frames. ], batch size: 388, lr: 1.50e-02, grad_scale: 16.0 2023-09-29 02:29:28,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:29:30,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:29:33,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:29:34,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:29:35,417 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 02:29:35,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:29:38,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 02:29:38,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:29:38,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 02:29:45,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 02:29:48,278 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:29:51,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:29:52,932 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:29:53,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 02:29:53,026 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 02:29:53,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:29:59,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 02:29:59,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:29:59,884 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=14.50 vs. limit=15.0 2023-09-29 02:30:00,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:30:13,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:30:18,694 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=220346.66666666666, ans=0.1 2023-09-29 02:30:21,270 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:30:21,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 02:30:22,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:30:24,042 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.659e+02 2.131e+02 2.399e+02 2.911e+02 4.367e+02, threshold=4.797e+02, percent-clipped=0.0 2023-09-29 02:30:24,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:30:27,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=220346.66666666666, ans=0.2 2023-09-29 02:30:30,144 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 02:30:30,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:30:30,661 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=220413.33333333334, ans=0.2 2023-09-29 02:30:36,538 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 02:30:42,305 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:30:43,809 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:30:43,845 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 02:30:43,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:30:46,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:30:48,882 INFO [train.py:1039] (2/4) Epoch 7, batch 1200, loss[loss=0.2558, simple_loss=0.2991, pruned_loss=0.1062, over 23426.00 frames. ], tot_loss[loss=0.2291, simple_loss=0.2922, pruned_loss=0.08301, over 4700285.01 frames. ], batch size: 285, lr: 1.50e-02, grad_scale: 32.0 2023-09-29 02:30:50,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:30:50,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 02:30:54,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:30:54,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:30:55,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:30:57,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:30:58,957 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 02:31:00,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:31:00,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:31:03,478 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 02:31:06,417 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 02:31:10,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 02:31:13,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:31:13,687 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.94 vs. limit=15.0 2023-09-29 02:31:16,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:31:18,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:31:18,905 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 02:31:19,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:31:27,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 02:31:27,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:31:27,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 02:31:29,046 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:31:32,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 02:31:38,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 02:31:38,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:31:38,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:31:39,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:31:40,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 02:31:41,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:31:41,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:31:41,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:31:43,252 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 02:31:43,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 02:31:43,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 02:31:44,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 02:31:49,045 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:31:49,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:31:54,069 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 02:31:57,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:32:00,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 02:32:03,838 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 02:32:05,442 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:32:08,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 02:32:10,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:32:11,611 INFO [train.py:1039] (2/4) Epoch 7, batch 1250, loss[loss=0.2227, simple_loss=0.2893, pruned_loss=0.07807, over 21501.00 frames. ], tot_loss[loss=0.2295, simple_loss=0.2928, pruned_loss=0.08309, over 4713062.81 frames. ], batch size: 47, lr: 1.50e-02, grad_scale: 32.0 2023-09-29 02:32:11,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:32:14,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 02:32:17,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:32:18,043 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=220813.33333333334, ans=0.125 2023-09-29 02:32:19,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:32:19,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 02:32:22,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:32:24,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 02:32:27,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 02:32:28,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:32:29,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:32:29,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:32:31,502 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=220880.0, ans=0.125 2023-09-29 02:32:32,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 02:32:35,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 02:32:35,980 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 02:32:35,988 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:32:36,178 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:32:38,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:32:41,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:32:41,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 02:32:43,646 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.32 vs. limit=10.0 2023-09-29 02:32:46,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 02:32:46,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:32:49,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:32:50,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 02:32:52,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:32:52,891 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 02:32:52,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:32:54,306 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:32:55,567 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.08 vs. limit=15.0 2023-09-29 02:32:57,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:33:04,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:33:04,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:33:04,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 02:33:05,734 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 02:33:05,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 02:33:08,742 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.710e+02 2.040e+02 2.260e+02 2.662e+02 4.055e+02, threshold=4.521e+02, percent-clipped=0.0 2023-09-29 02:33:09,184 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=221013.33333333334, ans=0.0 2023-09-29 02:33:10,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:33:12,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 02:33:12,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:33:15,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 02:33:15,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:33:18,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 02:33:18,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 02:33:20,251 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:33:20,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 02:33:20,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:33:22,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 02:33:24,990 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:33:26,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:33:27,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:33:30,340 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 02:33:33,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:33:33,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 02:33:34,496 INFO [train.py:1039] (2/4) Epoch 7, batch 1300, loss[loss=0.2343, simple_loss=0.2974, pruned_loss=0.08562, over 23596.00 frames. ], tot_loss[loss=0.2291, simple_loss=0.2929, pruned_loss=0.08266, over 4719448.47 frames. ], batch size: 149, lr: 1.50e-02, grad_scale: 32.0 2023-09-29 02:33:35,011 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=221146.66666666666, ans=0.125 2023-09-29 02:33:39,036 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:33:40,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 02:33:42,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:33:43,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:33:45,315 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:33:47,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 02:33:50,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 02:33:52,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 02:33:53,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 02:33:58,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 02:34:03,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:34:05,161 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:34:07,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:34:07,696 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=221280.0, ans=0.2 2023-09-29 02:34:07,873 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=221280.0, ans=0.025 2023-09-29 02:34:08,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:34:10,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 02:34:10,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 02:34:10,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 02:34:18,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:34:18,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 02:34:18,885 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 02:34:20,751 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 02:34:22,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:34:24,585 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=221346.66666666666, ans=0.125 2023-09-29 02:34:25,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:34:26,188 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=221346.66666666666, ans=0.125 2023-09-29 02:34:27,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 02:34:27,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:34:27,606 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 02:34:29,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:34:33,810 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:34:33,814 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:34:37,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 02:34:39,145 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 02:34:40,737 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 02:34:45,862 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:34:47,919 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=221413.33333333334, ans=0.05 2023-09-29 02:34:49,448 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 02:34:50,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:34:54,701 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.31 vs. limit=15.0 2023-09-29 02:34:57,755 INFO [train.py:1039] (2/4) Epoch 7, batch 1350, loss[loss=0.2149, simple_loss=0.2902, pruned_loss=0.06977, over 24645.00 frames. ], tot_loss[loss=0.228, simple_loss=0.2918, pruned_loss=0.08208, over 4718112.39 frames. ], batch size: 73, lr: 1.50e-02, grad_scale: 32.0 2023-09-29 02:34:58,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 02:35:02,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:35:04,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:35:07,245 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:35:08,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:35:10,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:35:10,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:35:15,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:35:17,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 02:35:18,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 02:35:20,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:35:22,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 02:35:24,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:35:25,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:35:25,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 02:35:27,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 02:35:29,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 02:35:31,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:35:31,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 02:35:42,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:35:42,759 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=221613.33333333334, ans=0.125 2023-09-29 02:35:47,916 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=221680.0, ans=0.0 2023-09-29 02:35:51,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:35:51,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:35:52,725 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 02:35:55,566 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 2.214e+02 2.683e+02 3.104e+02 4.290e+02, threshold=5.366e+02, percent-clipped=0.0 2023-09-29 02:35:55,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:35:57,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 02:35:57,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 02:35:58,181 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=221680.0, ans=0.0 2023-09-29 02:35:59,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:36:01,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:36:03,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 02:36:04,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:36:07,600 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.68 vs. limit=15.0 2023-09-29 02:36:08,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 02:36:09,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 02:36:17,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 02:36:19,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:36:20,062 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.30 vs. limit=6.0 2023-09-29 02:36:20,678 INFO [train.py:1039] (2/4) Epoch 7, batch 1400, loss[loss=0.2052, simple_loss=0.2702, pruned_loss=0.07005, over 20652.00 frames. ], tot_loss[loss=0.2275, simple_loss=0.2909, pruned_loss=0.08207, over 4709245.17 frames. ], batch size: 45, lr: 1.50e-02, grad_scale: 32.0 2023-09-29 02:36:21,868 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=221813.33333333334, ans=0.0 2023-09-29 02:36:23,015 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:36:23,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:36:24,951 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:36:27,834 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 02:36:29,462 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=221813.33333333334, ans=0.125 2023-09-29 02:36:31,366 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 02:36:37,130 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=221880.0, ans=0.95 2023-09-29 02:36:39,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 02:36:41,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:36:44,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:36:44,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 02:36:46,013 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=221880.0, ans=0.05 2023-09-29 02:36:49,395 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:36:50,863 WARNING [train.py:1197] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 02:37:00,610 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:37:02,084 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:37:03,910 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=221946.66666666666, ans=0.125 2023-09-29 02:37:03,985 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=221946.66666666666, ans=0.2 2023-09-29 02:37:07,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 02:37:09,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 02:37:09,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:37:09,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:37:11,068 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:37:12,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:37:12,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:37:14,112 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:37:15,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 02:37:17,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:37:20,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:37:23,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:37:25,574 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.32 vs. limit=15.0 2023-09-29 02:37:27,348 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=222080.0, ans=0.0 2023-09-29 02:37:32,967 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 02:37:33,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 02:37:35,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:37:36,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 02:37:38,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:37:38,629 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:37:43,600 INFO [train.py:1039] (2/4) Epoch 7, batch 1450, loss[loss=0.2659, simple_loss=0.2934, pruned_loss=0.1192, over 19227.00 frames. ], tot_loss[loss=0.2264, simple_loss=0.2904, pruned_loss=0.08124, over 4701491.31 frames. ], batch size: 388, lr: 1.50e-02, grad_scale: 32.0 2023-09-29 02:37:43,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 02:37:46,827 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:37:46,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:37:46,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 02:37:51,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:37:53,070 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 02:37:54,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:37:54,687 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 02:37:56,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 02:37:56,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 02:37:58,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:37:59,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:37:59,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 02:37:59,935 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:37:59,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:38:00,213 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:38:01,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 02:38:01,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:38:01,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:38:04,625 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:38:04,914 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=222213.33333333334, ans=0.1 2023-09-29 02:38:05,003 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=222213.33333333334, ans=0.2 2023-09-29 02:38:07,369 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.23 vs. limit=22.5 2023-09-29 02:38:08,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:38:11,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:38:11,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:38:14,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:38:14,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:38:18,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:38:18,610 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:38:18,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:38:20,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:38:23,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 02:38:25,680 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.01 vs. limit=15.0 2023-09-29 02:38:26,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:38:29,605 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 02:38:29,786 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=222280.0, ans=0.1 2023-09-29 02:38:32,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:38:32,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 02:38:34,888 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:38:36,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 02:38:38,257 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=222346.66666666666, ans=0.0 2023-09-29 02:38:39,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:38:41,367 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 2.076e+02 2.226e+02 2.557e+02 3.542e+02, threshold=4.452e+02, percent-clipped=0.0 2023-09-29 02:38:41,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 02:38:43,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 02:38:44,613 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:38:47,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:38:49,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:38:51,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 02:38:51,531 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=222413.33333333334, ans=0.0 2023-09-29 02:38:53,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 02:38:53,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 02:38:55,231 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:38:56,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 02:39:02,081 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.96 vs. limit=22.5 2023-09-29 02:39:06,109 INFO [train.py:1039] (2/4) Epoch 7, batch 1500, loss[loss=0.1928, simple_loss=0.2588, pruned_loss=0.06334, over 20680.00 frames. ], tot_loss[loss=0.2277, simple_loss=0.291, pruned_loss=0.08226, over 4691880.85 frames. ], batch size: 45, lr: 1.50e-02, grad_scale: 32.0 2023-09-29 02:39:09,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 02:39:09,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 02:39:09,988 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:39:11,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:39:11,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:39:13,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:39:14,047 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 02:39:16,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:39:16,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 02:39:16,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:39:18,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:39:19,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:39:21,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:39:25,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:39:27,534 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 02:39:27,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 02:39:27,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:39:29,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:39:29,999 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten.whitening_limit, batch_count=222546.66666666666, ans=15.0 2023-09-29 02:39:32,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 02:39:33,137 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.07 vs. limit=15.0 2023-09-29 02:39:35,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 02:39:38,570 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:39:38,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 02:39:41,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 02:39:43,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 02:39:45,135 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:39:45,171 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:39:45,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 02:39:46,721 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:39:46,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:39:48,794 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 02:39:50,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:39:50,486 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=222613.33333333334, ans=0.125 2023-09-29 02:39:55,415 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=222680.0, ans=0.0 2023-09-29 02:39:56,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:39:56,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 02:40:03,757 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:40:05,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 02:40:09,835 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 02:40:09,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:40:09,923 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 02:40:11,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:40:12,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:40:13,108 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 02:40:14,711 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 02:40:19,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 02:40:21,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:40:24,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:40:24,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:40:24,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:40:25,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:40:26,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 02:40:27,969 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 02:40:28,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 02:40:29,316 INFO [train.py:1039] (2/4) Epoch 7, batch 1550, loss[loss=0.2377, simple_loss=0.302, pruned_loss=0.08669, over 23893.00 frames. ], tot_loss[loss=0.2278, simple_loss=0.2917, pruned_loss=0.08195, over 4702089.25 frames. ], batch size: 86, lr: 1.49e-02, grad_scale: 16.0 2023-09-29 02:40:29,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:40:30,921 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 02:40:30,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 02:40:32,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:40:36,288 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:40:36,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:40:36,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:40:37,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:40:39,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:40:40,935 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 02:40:42,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:40:42,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 02:40:43,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 02:40:45,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 02:40:45,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 02:40:47,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:40:48,361 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 02:40:48,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 02:40:48,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 02:40:50,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:40:51,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:40:56,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:40:56,765 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=222880.0, ans=0.125 2023-09-29 02:40:59,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 02:40:59,555 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 02:41:07,330 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=15.49 vs. limit=15.0 2023-09-29 02:41:07,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:41:13,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:41:13,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:41:13,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:41:13,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 02:41:19,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 02:41:22,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:41:23,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:41:24,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:41:25,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:41:25,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 02:41:25,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 02:41:27,790 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 2.115e+02 2.414e+02 2.837e+02 4.599e+02, threshold=4.828e+02, percent-clipped=1.0 2023-09-29 02:41:29,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:41:29,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:41:30,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 02:41:30,871 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 02:41:31,150 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=223013.33333333334, ans=0.1 2023-09-29 02:41:33,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:41:40,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 02:41:45,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:41:45,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:41:46,473 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.28 vs. limit=10.0 2023-09-29 02:41:47,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 02:41:48,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 02:41:50,272 INFO [train.py:1039] (2/4) Epoch 7, batch 1600, loss[loss=0.2305, simple_loss=0.3063, pruned_loss=0.07738, over 24452.00 frames. ], tot_loss[loss=0.2284, simple_loss=0.2927, pruned_loss=0.082, over 4714669.00 frames. ], batch size: 69, lr: 1.49e-02, grad_scale: 32.0 2023-09-29 02:41:50,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:41:50,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:41:50,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:41:51,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:41:53,634 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=223146.66666666666, ans=0.125 2023-09-29 02:41:54,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:41:55,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 02:41:56,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 02:41:58,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 02:41:59,888 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:42:03,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 02:42:03,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:42:05,813 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.66 vs. limit=10.0 2023-09-29 02:42:06,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:42:11,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:42:15,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 02:42:17,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:42:19,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 02:42:19,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:42:19,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 02:42:26,260 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=223280.0, ans=0.125 2023-09-29 02:42:27,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 02:42:32,911 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.09 vs. limit=22.5 2023-09-29 02:42:33,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:42:35,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 02:42:35,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:42:37,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:42:37,274 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:42:40,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 02:42:43,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 02:42:45,588 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.50 vs. limit=15.0 2023-09-29 02:42:46,471 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:42:46,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:42:46,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:42:48,758 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:42:50,339 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:42:51,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:42:53,906 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:43:00,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:43:02,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:43:02,943 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.55 vs. limit=15.0 2023-09-29 02:43:05,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 02:43:05,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:43:06,523 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 02:43:09,844 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=223413.33333333334, ans=0.125 2023-09-29 02:43:11,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:43:13,072 INFO [train.py:1039] (2/4) Epoch 7, batch 1650, loss[loss=0.2178, simple_loss=0.2848, pruned_loss=0.07536, over 22552.00 frames. ], tot_loss[loss=0.2283, simple_loss=0.293, pruned_loss=0.08184, over 4717057.23 frames. ], batch size: 49, lr: 1.49e-02, grad_scale: 32.0 2023-09-29 02:43:14,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:43:14,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:43:14,810 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 02:43:14,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 02:43:14,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 02:43:14,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 02:43:15,169 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=223480.0, ans=0.125 2023-09-29 02:43:19,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:43:19,805 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=223480.0, ans=0.125 2023-09-29 02:43:21,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:43:22,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:43:22,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 02:43:26,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:43:27,785 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 02:43:30,159 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:43:30,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:43:30,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:43:30,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:43:32,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 02:43:33,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 02:43:39,901 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 02:43:41,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 02:43:48,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 02:43:50,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:43:51,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 02:43:56,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:43:59,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:43:59,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:43:59,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:44:00,043 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=223613.33333333334, ans=0.125 2023-09-29 02:44:01,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:44:01,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:44:05,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:44:05,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:44:07,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:44:07,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:44:08,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:44:08,991 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=223680.0, ans=0.125 2023-09-29 02:44:10,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 02:44:13,134 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.715e+02 2.058e+02 2.405e+02 2.744e+02 4.179e+02, threshold=4.810e+02, percent-clipped=0.0 2023-09-29 02:44:13,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:44:13,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 02:44:16,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:44:18,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 02:44:18,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 02:44:18,507 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 02:44:19,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:44:21,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:44:21,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:44:22,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:44:22,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 02:44:26,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:44:27,793 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:44:27,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:44:29,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 02:44:34,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:44:34,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:44:35,910 INFO [train.py:1039] (2/4) Epoch 7, batch 1700, loss[loss=0.2518, simple_loss=0.2974, pruned_loss=0.103, over 23721.00 frames. ], tot_loss[loss=0.2273, simple_loss=0.2918, pruned_loss=0.08135, over 4711763.63 frames. ], batch size: 179, lr: 1.49e-02, grad_scale: 16.0 2023-09-29 02:44:36,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 02:44:37,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:44:37,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:44:37,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:44:39,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:44:39,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:44:39,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 02:44:40,439 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.33 vs. limit=10.0 2023-09-29 02:44:42,869 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 02:44:51,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:44:52,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:44:52,906 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=223880.0, ans=0.125 2023-09-29 02:44:58,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 02:44:58,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:44:59,049 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:45:00,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:45:03,479 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.01 vs. limit=15.0 2023-09-29 02:45:05,479 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 02:45:07,127 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:45:07,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:45:08,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 02:45:10,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 02:45:12,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 02:45:14,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 02:45:15,949 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:45:16,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 02:45:17,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:45:24,619 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=224013.33333333334, ans=0.025 2023-09-29 02:45:27,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:45:28,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:45:30,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:45:33,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 02:45:33,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 02:45:33,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:45:35,196 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:45:35,197 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 02:45:36,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:45:36,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:45:36,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:45:36,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:45:40,501 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=224080.0, ans=0.0 2023-09-29 02:45:41,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:45:41,668 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:45:41,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:45:41,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:45:43,299 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:45:45,067 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=224080.0, ans=0.015 2023-09-29 02:45:48,548 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:45:50,536 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 02:45:52,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:45:52,284 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:45:55,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 02:45:58,956 INFO [train.py:1039] (2/4) Epoch 7, batch 1750, loss[loss=0.2492, simple_loss=0.303, pruned_loss=0.09771, over 23832.00 frames. ], tot_loss[loss=0.2265, simple_loss=0.2905, pruned_loss=0.08125, over 4707949.75 frames. ], batch size: 212, lr: 1.49e-02, grad_scale: 16.0 2023-09-29 02:46:01,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:46:03,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:46:05,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 02:46:05,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 02:46:06,494 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:46:09,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:46:09,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:46:13,562 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=224213.33333333334, ans=0.125 2023-09-29 02:46:14,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 02:46:17,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:46:19,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 02:46:21,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:46:21,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:46:25,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 02:46:26,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 02:46:27,010 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:46:28,467 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 02:46:36,996 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:46:41,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:46:41,376 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:46:44,539 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:46:44,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:46:46,736 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:46:48,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:46:49,987 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.max_abs, batch_count=224346.66666666666, ans=10.0 2023-09-29 02:46:51,283 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:46:52,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:46:52,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 02:46:55,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:46:58,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 02:46:58,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:46:59,483 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.38 vs. limit=15.0 2023-09-29 02:47:00,211 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.434e+02 2.137e+02 2.415e+02 2.786e+02 3.944e+02, threshold=4.830e+02, percent-clipped=0.0 2023-09-29 02:47:00,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:47:01,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:47:02,970 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2.whitening_limit, batch_count=224346.66666666666, ans=15.0 2023-09-29 02:47:05,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 02:47:06,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 02:47:08,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:47:09,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:47:13,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:47:17,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:47:17,736 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:47:19,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 02:47:19,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:47:21,253 INFO [train.py:1039] (2/4) Epoch 7, batch 1800, loss[loss=0.2415, simple_loss=0.32, pruned_loss=0.08151, over 24653.00 frames. ], tot_loss[loss=0.2258, simple_loss=0.2901, pruned_loss=0.08075, over 4710955.51 frames. ], batch size: 68, lr: 1.49e-02, grad_scale: 16.0 2023-09-29 02:47:21,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 02:47:21,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:47:21,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 02:47:21,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:47:22,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:47:24,621 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:47:26,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:47:27,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 02:47:29,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:47:33,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 02:47:35,103 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:47:38,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:47:38,531 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=224546.66666666666, ans=0.125 2023-09-29 02:47:40,515 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=224546.66666666666, ans=0.0 2023-09-29 02:47:41,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:47:41,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:47:43,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:47:46,209 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:47:46,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 02:47:46,342 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:47:48,160 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:47:49,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:47:53,505 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 02:47:55,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 02:47:55,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 02:47:56,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:47:57,174 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=24.59 vs. limit=22.5 2023-09-29 02:47:57,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:47:57,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:47:59,502 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:48:06,744 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 02:48:08,243 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:48:11,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:48:12,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 02:48:14,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 02:48:15,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 02:48:16,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:48:16,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 02:48:22,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 02:48:26,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:48:28,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 02:48:28,559 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:48:28,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:48:28,863 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=224746.66666666666, ans=0.2 2023-09-29 02:48:29,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:48:29,869 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 02:48:30,698 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.82 vs. limit=15.0 2023-09-29 02:48:32,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 02:48:32,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:48:33,181 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=224746.66666666666, ans=0.125 2023-09-29 02:48:34,737 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=224746.66666666666, ans=0.125 2023-09-29 02:48:35,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 02:48:35,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:48:38,351 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:48:38,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:48:38,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:48:40,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:48:40,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:48:45,580 INFO [train.py:1039] (2/4) Epoch 7, batch 1850, loss[loss=0.2353, simple_loss=0.3103, pruned_loss=0.08019, over 24306.00 frames. ], tot_loss[loss=0.2265, simple_loss=0.2906, pruned_loss=0.08118, over 4706474.76 frames. ], batch size: 74, lr: 1.49e-02, grad_scale: 16.0 2023-09-29 02:48:45,672 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:48:45,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:48:47,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:48:48,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:48:55,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:48:55,342 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=224813.33333333334, ans=0.1 2023-09-29 02:48:56,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 02:48:59,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 02:49:00,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=224880.0, ans=0.0 2023-09-29 02:49:02,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 02:49:06,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:49:07,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 02:49:07,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 02:49:19,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:49:21,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 02:49:25,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:49:25,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:49:28,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 02:49:28,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:49:29,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 02:49:31,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:49:33,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:49:36,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:49:40,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 02:49:41,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:49:41,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 02:49:41,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:49:44,704 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:49:46,007 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.718e+02 2.062e+02 2.233e+02 2.588e+02 4.432e+02, threshold=4.466e+02, percent-clipped=0.0 2023-09-29 02:49:46,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:49:50,601 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.54 vs. limit=12.0 2023-09-29 02:49:51,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 02:49:51,646 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:49:55,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:49:57,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 02:49:57,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 02:49:57,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 02:49:58,690 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 02:50:02,042 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 02:50:03,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 02:50:03,613 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:50:03,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:50:03,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:50:05,121 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 02:50:05,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 02:50:05,207 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:50:05,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 02:50:08,232 INFO [train.py:1039] (2/4) Epoch 7, batch 1900, loss[loss=0.195, simple_loss=0.2688, pruned_loss=0.06064, over 24266.00 frames. ], tot_loss[loss=0.2279, simple_loss=0.2922, pruned_loss=0.08183, over 4704863.16 frames. ], batch size: 61, lr: 1.49e-02, grad_scale: 16.0 2023-09-29 02:50:08,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 02:50:09,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:50:09,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 02:50:13,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:50:13,461 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 02:50:13,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:50:14,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:50:21,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:50:24,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:50:24,810 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 02:50:25,109 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=225213.33333333334, ans=0.0 2023-09-29 02:50:25,608 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.45 vs. limit=15.0 2023-09-29 02:50:26,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 02:50:27,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:50:27,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:50:27,941 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 02:50:29,503 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 02:50:29,881 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=225213.33333333334, ans=0.05 2023-09-29 02:50:33,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 02:50:36,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:50:39,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 02:50:42,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 02:50:54,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 02:50:55,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 02:50:55,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:50:57,384 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 02:50:57,390 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 02:50:57,457 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 02:50:57,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 02:50:57,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:51:02,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 02:51:05,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:51:08,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:51:08,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 02:51:12,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 02:51:15,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 02:51:17,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:51:22,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 02:51:22,462 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:51:22,482 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:51:24,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:51:26,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 02:51:27,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 02:51:27,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:51:30,595 INFO [train.py:1039] (2/4) Epoch 7, batch 1950, loss[loss=0.2303, simple_loss=0.2948, pruned_loss=0.08291, over 23501.00 frames. ], tot_loss[loss=0.2287, simple_loss=0.2932, pruned_loss=0.08212, over 4711768.56 frames. ], batch size: 134, lr: 1.49e-02, grad_scale: 16.0 2023-09-29 02:51:30,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:51:30,833 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 02:51:31,132 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=225480.0, ans=0.2 2023-09-29 02:51:34,438 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:51:34,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:51:34,523 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:51:35,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:51:40,536 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:51:42,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:51:42,285 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=225480.0, ans=0.125 2023-09-29 02:51:44,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:51:44,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 02:51:45,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 02:51:45,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 02:51:47,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:51:47,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:51:50,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:51:50,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:51:50,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:51:52,481 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=225546.66666666666, ans=0.0 2023-09-29 02:51:53,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:51:55,929 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:51:56,746 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.51 vs. limit=22.5 2023-09-29 02:51:57,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 02:51:57,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:51:57,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:52:01,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:52:05,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 02:52:05,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:52:05,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 02:52:05,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 02:52:07,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 02:52:07,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:52:07,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:52:12,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:52:13,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:52:19,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 02:52:20,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:52:20,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 02:52:22,502 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 02:52:23,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:52:27,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:52:27,403 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=225680.0, ans=0.125 2023-09-29 02:52:28,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:52:30,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:52:32,065 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.833e+02 2.356e+02 2.885e+02 3.538e+02 5.665e+02, threshold=5.770e+02, percent-clipped=6.0 2023-09-29 02:52:37,376 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:52:38,894 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:52:42,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:52:44,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:52:47,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:52:47,944 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:52:49,410 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 02:52:49,419 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 02:52:49,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:52:51,028 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 02:52:51,319 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=225746.66666666666, ans=0.0 2023-09-29 02:52:53,917 INFO [train.py:1039] (2/4) Epoch 7, batch 2000, loss[loss=0.2028, simple_loss=0.2603, pruned_loss=0.07269, over 23628.00 frames. ], tot_loss[loss=0.2284, simple_loss=0.2928, pruned_loss=0.08201, over 4720459.26 frames. ], batch size: 135, lr: 1.49e-02, grad_scale: 32.0 2023-09-29 02:52:53,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:52:57,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:52:58,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:52:58,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:53:01,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:53:03,273 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:53:06,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 02:53:09,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 02:53:10,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:53:13,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 02:53:15,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 02:53:15,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:53:17,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:53:19,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 02:53:22,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:53:24,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:53:24,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:53:25,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 02:53:25,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 02:53:28,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 02:53:28,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:53:31,776 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:53:31,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 02:53:31,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:53:33,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:53:34,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:53:36,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 02:53:37,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 02:53:37,934 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:53:37,957 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:53:40,937 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.40 vs. limit=10.0 2023-09-29 02:53:43,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:53:45,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:53:45,589 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:53:47,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:53:49,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:53:49,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:53:49,462 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=226013.33333333334, ans=0.2 2023-09-29 02:53:50,723 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:53:50,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:53:52,265 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:53:52,612 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=226013.33333333334, ans=0.1 2023-09-29 02:53:52,667 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=226013.33333333334, ans=0.0 2023-09-29 02:53:56,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:53:56,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 02:54:02,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 02:54:03,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:54:07,014 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=226080.0, ans=0.05 2023-09-29 02:54:08,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:54:08,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:54:10,270 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=226080.0, ans=0.2 2023-09-29 02:54:11,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:54:14,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:54:14,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:54:15,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 02:54:16,507 INFO [train.py:1039] (2/4) Epoch 7, batch 2050, loss[loss=0.2473, simple_loss=0.3164, pruned_loss=0.08907, over 24392.00 frames. ], tot_loss[loss=0.2287, simple_loss=0.293, pruned_loss=0.08224, over 4714852.66 frames. ], batch size: 77, lr: 1.48e-02, grad_scale: 32.0 2023-09-29 02:54:16,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 02:54:19,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:54:20,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:54:23,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:54:23,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:54:30,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:54:30,615 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=226146.66666666666, ans=0.2 2023-09-29 02:54:31,786 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:54:31,898 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:54:33,429 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:54:35,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 02:54:35,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:54:36,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:54:38,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 02:54:47,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:54:47,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:54:51,331 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 02:54:53,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:54:53,866 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=226280.0, ans=0.2 2023-09-29 02:54:54,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 02:54:54,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:54:56,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:55:00,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:55:00,536 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 02:55:00,597 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:55:02,805 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:55:02,941 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:55:04,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 02:55:07,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:55:10,418 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:55:12,038 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 02:55:14,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:55:19,285 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.639e+02 2.164e+02 2.389e+02 2.987e+02 5.025e+02, threshold=4.777e+02, percent-clipped=0.0 2023-09-29 02:55:19,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:55:26,725 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:55:28,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 02:55:33,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:55:33,604 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:55:35,411 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=226413.33333333334, ans=0.125 2023-09-29 02:55:36,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:55:37,550 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=14.59 vs. limit=15.0 2023-09-29 02:55:38,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 02:55:40,153 INFO [train.py:1039] (2/4) Epoch 7, batch 2100, loss[loss=0.2338, simple_loss=0.3038, pruned_loss=0.08195, over 24365.00 frames. ], tot_loss[loss=0.2275, simple_loss=0.2914, pruned_loss=0.08179, over 4714788.75 frames. ], batch size: 77, lr: 1.48e-02, grad_scale: 16.0 2023-09-29 02:55:41,906 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 02:55:41,907 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:55:41,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:55:43,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 02:55:43,541 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:55:43,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 02:55:43,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 02:55:46,626 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:55:49,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:55:49,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:55:50,010 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:55:50,127 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=226480.0, ans=0.125 2023-09-29 02:55:52,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:55:54,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:55:54,351 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 02:55:56,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:55:56,446 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 02:55:56,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 02:55:58,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:55:59,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:55:59,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 02:56:00,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 02:56:01,982 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=226546.66666666666, ans=0.125 2023-09-29 02:56:03,600 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=226546.66666666666, ans=0.125 2023-09-29 02:56:05,265 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 02:56:05,267 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 02:56:08,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:56:08,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:56:13,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:56:13,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 02:56:14,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:56:14,812 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 02:56:16,418 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=226613.33333333334, ans=0.0 2023-09-29 02:56:17,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 02:56:17,837 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:56:17,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 02:56:17,893 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 02:56:19,280 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 02:56:22,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:56:22,435 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=226613.33333333334, ans=0.125 2023-09-29 02:56:23,897 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:56:27,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 02:56:28,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 02:56:30,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:56:32,524 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:56:32,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 02:56:32,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:56:32,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:56:34,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:56:34,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 02:56:36,093 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 02:56:36,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 02:56:42,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:56:42,811 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=226680.0, ans=0.2 2023-09-29 02:56:44,312 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=226746.66666666666, ans=0.125 2023-09-29 02:56:44,387 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=226746.66666666666, ans=0.0 2023-09-29 02:56:45,531 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:56:47,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 02:56:52,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:56:55,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:56:56,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:56:56,720 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:56:56,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 02:56:56,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 02:56:58,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:56:58,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 02:56:59,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:56:59,953 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:57:01,355 INFO [train.py:1039] (2/4) Epoch 7, batch 2150, loss[loss=0.2006, simple_loss=0.2773, pruned_loss=0.06191, over 24476.00 frames. ], tot_loss[loss=0.2257, simple_loss=0.2897, pruned_loss=0.08092, over 4711448.07 frames. ], batch size: 63, lr: 1.48e-02, grad_scale: 16.0 2023-09-29 02:57:01,722 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=226813.33333333334, ans=0.2 2023-09-29 02:57:02,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 02:57:04,462 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 02:57:04,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:57:06,462 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=226813.33333333334, ans=0.125 2023-09-29 02:57:07,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:57:07,481 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:57:07,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:57:07,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:57:12,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 02:57:15,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:57:15,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:57:18,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:57:18,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:57:20,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:57:23,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:57:23,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:57:23,265 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:57:23,519 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=226880.0, ans=0.2 2023-09-29 02:57:23,598 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=226880.0, ans=0.125 2023-09-29 02:57:26,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:57:27,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 02:57:32,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:57:34,192 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 02:57:35,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:57:35,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:57:35,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:57:36,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 02:57:37,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:57:37,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:57:37,779 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=226946.66666666666, ans=0.125 2023-09-29 02:57:38,960 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:57:41,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 02:57:42,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:57:44,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:57:44,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:57:46,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 02:57:46,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:57:48,214 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=226946.66666666666, ans=0.125 2023-09-29 02:57:49,573 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:57:49,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 02:57:51,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:57:51,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 02:57:53,870 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:57:56,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:57:56,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:57:58,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:58:01,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 02:58:01,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:58:01,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:58:01,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 02:58:04,438 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.690e+02 2.076e+02 2.402e+02 2.714e+02 3.938e+02, threshold=4.805e+02, percent-clipped=0.0 2023-09-29 02:58:04,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 02:58:04,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:58:05,363 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.78 vs. limit=15.0 2023-09-29 02:58:05,996 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 02:58:06,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:58:06,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:58:06,322 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=227080.0, ans=0.0 2023-09-29 02:58:07,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 02:58:07,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:58:07,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 02:58:07,734 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 02:58:07,735 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 02:58:09,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 02:58:09,492 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=227080.0, ans=0.0 2023-09-29 02:58:10,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:58:12,213 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:58:12,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:58:12,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:58:13,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 02:58:15,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:58:15,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:58:18,137 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=227080.0, ans=0.125 2023-09-29 02:58:22,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:58:24,649 INFO [train.py:1039] (2/4) Epoch 7, batch 2200, loss[loss=0.2473, simple_loss=0.28, pruned_loss=0.1073, over 19241.00 frames. ], tot_loss[loss=0.2251, simple_loss=0.2892, pruned_loss=0.08054, over 4707898.07 frames. ], batch size: 388, lr: 1.48e-02, grad_scale: 16.0 2023-09-29 02:58:24,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 02:58:29,973 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:58:34,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:58:36,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:58:36,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:58:37,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 02:58:39,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:58:39,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:58:39,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 02:58:41,443 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=227213.33333333334, ans=0.1 2023-09-29 02:58:44,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 02:58:45,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 02:58:53,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 02:58:58,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:58:58,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:58:58,443 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:59:03,805 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:59:03,837 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 02:59:05,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 02:59:08,396 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:59:08,490 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 02:59:11,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 02:59:11,806 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=227280.0, ans=0.0 2023-09-29 02:59:14,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:59:15,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:59:17,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:59:19,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 02:59:20,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:59:21,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 02:59:24,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:59:24,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 02:59:24,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:59:27,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:59:28,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:59:28,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:59:28,063 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:59:30,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 02:59:31,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:59:33,929 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 02:59:38,427 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 02:59:38,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:59:40,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 02:59:40,432 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:59:40,497 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=227413.33333333334, ans=0.0 2023-09-29 02:59:41,758 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 02:59:43,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 02:59:43,626 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 02:59:45,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 02:59:45,172 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 02:59:46,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:59:48,170 INFO [train.py:1039] (2/4) Epoch 7, batch 2250, loss[loss=0.2475, simple_loss=0.3031, pruned_loss=0.09595, over 23599.00 frames. ], tot_loss[loss=0.2259, simple_loss=0.2899, pruned_loss=0.0809, over 4705762.39 frames. ], batch size: 232, lr: 1.48e-02, grad_scale: 16.0 2023-09-29 02:59:48,315 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 02:59:48,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:59:51,371 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 02:59:51,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:59:54,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:00:02,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:00:04,194 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:00:08,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:00:10,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:00:11,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:00:13,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 03:00:13,336 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:00:14,208 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.16 vs. limit=15.0 2023-09-29 03:00:14,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:00:16,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 03:00:17,971 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:00:17,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:00:19,590 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:00:20,103 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=3.87 vs. limit=12.0 2023-09-29 03:00:25,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:00:27,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 03:00:27,196 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 03:00:28,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 03:00:30,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:00:31,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:00:35,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:00:37,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:00:39,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:00:39,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:00:39,943 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 03:00:41,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:00:41,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:00:42,265 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=227680.0, ans=0.0 2023-09-29 03:00:47,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:00:50,692 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.687e+02 2.075e+02 2.318e+02 2.667e+02 5.695e+02, threshold=4.636e+02, percent-clipped=1.0 2023-09-29 03:00:50,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 03:00:55,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 03:00:56,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 03:00:56,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:00:58,690 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=227746.66666666666, ans=0.125 2023-09-29 03:01:01,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 03:01:02,486 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=15.23 vs. limit=22.5 2023-09-29 03:01:03,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 03:01:03,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 03:01:04,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:01:05,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:01:08,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 03:01:09,710 INFO [train.py:1039] (2/4) Epoch 7, batch 2300, loss[loss=0.208, simple_loss=0.2772, pruned_loss=0.06939, over 24600.00 frames. ], tot_loss[loss=0.2268, simple_loss=0.291, pruned_loss=0.08134, over 4702185.62 frames. ], batch size: 60, lr: 1.48e-02, grad_scale: 16.0 2023-09-29 03:01:13,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:01:13,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:01:19,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:01:20,473 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:01:23,517 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 03:01:25,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:01:31,433 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:01:31,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 03:01:31,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:01:32,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:01:32,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 03:01:34,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:01:36,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:01:37,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:01:41,378 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 03:01:44,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:01:49,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:01:56,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:01:56,838 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:01:59,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:02:01,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:02:04,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:02:06,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 03:02:06,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:02:06,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 03:02:09,573 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 03:02:09,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:02:11,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:02:11,104 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:02:11,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:02:12,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 03:02:12,617 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 03:02:12,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 03:02:12,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:02:12,732 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:02:14,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 03:02:21,885 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:02:25,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:02:30,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:02:30,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:02:30,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 03:02:32,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 03:02:32,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:02:32,773 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=228146.66666666666, ans=0.125 2023-09-29 03:02:33,866 INFO [train.py:1039] (2/4) Epoch 7, batch 2350, loss[loss=0.2463, simple_loss=0.2932, pruned_loss=0.09966, over 23588.00 frames. ], tot_loss[loss=0.2301, simple_loss=0.2936, pruned_loss=0.08331, over 4692335.12 frames. ], batch size: 256, lr: 1.48e-02, grad_scale: 16.0 2023-09-29 03:02:33,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 03:02:35,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 03:02:41,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:02:41,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 03:02:43,373 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=228146.66666666666, ans=0.0 2023-09-29 03:02:48,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 03:02:51,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:02:54,298 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:02:54,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:02:54,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:02:54,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:02:57,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 03:03:01,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:03:08,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 03:03:09,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:03:12,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:03:12,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:03:12,955 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=228280.0, ans=0.125 2023-09-29 03:03:14,221 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:03:15,764 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 03:03:17,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:03:18,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:03:18,783 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:03:18,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:03:22,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:03:25,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 03:03:26,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:03:29,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:03:29,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:03:31,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 03:03:31,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:03:34,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 03:03:34,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:03:36,032 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.700e+02 2.092e+02 2.385e+02 2.952e+02 4.935e+02, threshold=4.770e+02, percent-clipped=1.0 2023-09-29 03:03:37,112 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.87 vs. limit=15.0 2023-09-29 03:03:39,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 03:03:41,278 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=228413.33333333334, ans=0.125 2023-09-29 03:03:44,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 03:03:46,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:03:46,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 03:03:46,118 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 03:03:46,146 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 03:03:49,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 03:03:50,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:03:51,386 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.42 vs. limit=22.5 2023-09-29 03:03:55,179 INFO [train.py:1039] (2/4) Epoch 7, batch 2400, loss[loss=0.2202, simple_loss=0.2814, pruned_loss=0.07954, over 23372.00 frames. ], tot_loss[loss=0.2296, simple_loss=0.2932, pruned_loss=0.08298, over 4701596.43 frames. ], batch size: 119, lr: 1.48e-02, grad_scale: 32.0 2023-09-29 03:03:57,439 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:03:59,272 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:04:02,982 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:04:03,074 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 03:04:05,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 03:04:11,691 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=228546.66666666666, ans=0.0 2023-09-29 03:04:12,910 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 03:04:12,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:04:16,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 03:04:16,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:04:17,544 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:04:17,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 03:04:22,152 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=20.82 vs. limit=22.5 2023-09-29 03:04:24,312 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:04:27,308 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 03:04:31,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 03:04:34,288 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 03:04:38,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:04:40,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:04:44,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:04:46,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 03:04:46,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:04:50,389 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.60 vs. limit=6.0 2023-09-29 03:04:52,550 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:04:54,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:04:56,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:04:58,565 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:04:58,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 03:04:59,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:04:59,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:05:00,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:05:00,049 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 03:05:03,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:05:04,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 03:05:04,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 03:05:04,878 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=228746.66666666666, ans=0.125 2023-09-29 03:05:06,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 03:05:06,407 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=228746.66666666666, ans=0.125 2023-09-29 03:05:07,924 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=228746.66666666666, ans=0.125 2023-09-29 03:05:09,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:05:09,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:05:11,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 03:05:11,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 03:05:11,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 03:05:12,003 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 03:05:12,151 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 03:05:13,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:05:16,692 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:05:16,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:05:16,864 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 03:05:18,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:05:18,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 03:05:19,807 INFO [train.py:1039] (2/4) Epoch 7, batch 2450, loss[loss=0.2155, simple_loss=0.2652, pruned_loss=0.08294, over 23521.00 frames. ], tot_loss[loss=0.2275, simple_loss=0.2899, pruned_loss=0.08255, over 4678356.35 frames. ], batch size: 285, lr: 1.48e-02, grad_scale: 16.0 2023-09-29 03:05:23,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:05:24,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:05:29,083 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:05:29,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:05:29,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 03:05:36,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:05:36,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:05:39,578 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=228880.0, ans=0.015 2023-09-29 03:05:40,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:05:40,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:05:40,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:05:41,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 03:05:46,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:05:47,842 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 03:05:49,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:05:51,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 03:05:52,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:05:52,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:05:54,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:05:55,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 03:05:57,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:06:00,879 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=14.28 vs. limit=15.0 2023-09-29 03:06:03,426 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=228946.66666666666, ans=0.0 2023-09-29 03:06:06,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:06:08,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:06:08,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:06:09,052 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:06:10,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:06:11,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:06:12,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 03:06:15,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:06:17,014 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:06:20,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:06:20,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:06:23,657 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.673e+02 2.159e+02 2.588e+02 3.094e+02 4.619e+02, threshold=5.175e+02, percent-clipped=0.0 2023-09-29 03:06:23,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:06:24,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 03:06:26,808 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:06:26,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:06:26,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 03:06:28,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:06:29,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:06:32,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:06:36,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:06:36,149 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:06:39,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 03:06:41,349 INFO [train.py:1039] (2/4) Epoch 7, batch 2500, loss[loss=0.1971, simple_loss=0.266, pruned_loss=0.0641, over 22437.00 frames. ], tot_loss[loss=0.2274, simple_loss=0.2904, pruned_loss=0.08226, over 4696236.48 frames. ], batch size: 49, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:06:41,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:06:49,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:06:57,298 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 03:06:59,077 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.45 vs. limit=10.0 2023-09-29 03:06:59,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 03:07:01,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:07:01,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:07:01,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 03:07:03,906 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.80 vs. limit=15.0 2023-09-29 03:07:09,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 03:07:09,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:07:10,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 03:07:10,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 03:07:11,016 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=229213.33333333334, ans=0.125 2023-09-29 03:07:11,097 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=229213.33333333334, ans=0.125 2023-09-29 03:07:12,222 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 03:07:13,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:07:15,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:07:15,842 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 03:07:15,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:07:15,990 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 03:07:16,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:07:22,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:07:24,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:07:24,621 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 03:07:27,595 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 03:07:29,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 03:07:29,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:07:31,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:07:34,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:07:37,768 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:07:40,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:07:44,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 03:07:47,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 03:07:48,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:07:48,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 03:07:48,537 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.17 vs. limit=10.0 2023-09-29 03:07:49,693 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=229413.33333333334, ans=0.2 2023-09-29 03:07:50,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:07:50,742 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 03:07:52,924 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 03:07:52,925 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 03:07:52,934 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 03:07:56,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:07:58,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 03:07:58,909 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 03:07:59,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:08:00,468 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 03:08:04,195 INFO [train.py:1039] (2/4) Epoch 7, batch 2550, loss[loss=0.2315, simple_loss=0.2944, pruned_loss=0.08434, over 23294.00 frames. ], tot_loss[loss=0.2278, simple_loss=0.2913, pruned_loss=0.08219, over 4701506.64 frames. ], batch size: 105, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:08:04,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 03:08:07,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:08:10,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:08:10,355 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:08:11,903 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:08:13,380 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 03:08:13,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 03:08:15,114 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=229480.0, ans=0.2 2023-09-29 03:08:17,998 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 03:08:19,530 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:08:21,761 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:08:24,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:08:24,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 03:08:24,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 03:08:26,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:08:26,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:08:29,760 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 03:08:31,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 03:08:31,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 03:08:31,159 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:08:31,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 03:08:45,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:08:49,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:08:51,148 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:08:51,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:08:51,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 03:08:55,396 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=229680.0, ans=0.09899494936611666 2023-09-29 03:08:58,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:08:59,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 03:09:01,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:09:01,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:09:02,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 03:09:02,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 03:09:05,253 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=229680.0, ans=0.125 2023-09-29 03:09:06,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:09:06,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:09:07,767 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.644e+02 2.133e+02 2.441e+02 2.869e+02 4.948e+02, threshold=4.883e+02, percent-clipped=0.0 2023-09-29 03:09:11,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:09:11,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 03:09:11,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:09:13,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:09:13,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 03:09:15,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 03:09:17,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:09:17,490 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=229746.66666666666, ans=0.1 2023-09-29 03:09:18,982 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=229746.66666666666, ans=0.0 2023-09-29 03:09:23,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:09:26,268 INFO [train.py:1039] (2/4) Epoch 7, batch 2600, loss[loss=0.2459, simple_loss=0.3002, pruned_loss=0.09579, over 23567.00 frames. ], tot_loss[loss=0.2279, simple_loss=0.2914, pruned_loss=0.08225, over 4710531.31 frames. ], batch size: 149, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:09:26,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:09:28,500 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 03:09:31,581 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 03:09:31,614 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:09:31,659 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 03:09:33,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 03:09:34,612 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 03:09:37,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:09:37,689 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 03:09:37,902 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=229813.33333333334, ans=0.0 2023-09-29 03:09:39,967 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 03:09:42,773 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 03:09:42,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:09:45,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 03:09:45,604 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=229880.0, ans=0.125 2023-09-29 03:09:46,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 03:09:49,699 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 03:09:49,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 03:09:53,319 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 03:09:53,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 03:09:55,164 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=229880.0, ans=0.07 2023-09-29 03:09:59,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:09:59,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:10:01,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:10:01,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 03:10:03,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:10:07,996 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 03:10:14,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:10:16,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:10:16,356 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=230013.33333333334, ans=0.1 2023-09-29 03:10:16,376 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=230013.33333333334, ans=0.0 2023-09-29 03:10:17,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 03:10:17,822 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=230013.33333333334, ans=0.125 2023-09-29 03:10:19,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:10:19,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:10:19,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 03:10:22,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 03:10:22,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:10:24,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:10:24,892 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=230013.33333333334, ans=0.125 2023-09-29 03:10:29,534 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 03:10:29,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:10:30,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:10:35,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:10:37,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:10:37,497 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 03:10:37,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:10:40,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:10:42,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:10:46,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 03:10:48,336 INFO [train.py:1039] (2/4) Epoch 7, batch 2650, loss[loss=0.2224, simple_loss=0.2954, pruned_loss=0.07471, over 24669.00 frames. ], tot_loss[loss=0.2291, simple_loss=0.2924, pruned_loss=0.08288, over 4710265.27 frames. ], batch size: 65, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:10:48,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:10:50,605 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 03:10:52,509 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=230146.66666666666, ans=0.125 2023-09-29 03:10:54,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 03:10:54,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:10:55,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 03:10:57,400 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 03:10:57,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:10:59,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:11:03,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 03:11:05,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:11:07,014 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=230213.33333333334, ans=0.125 2023-09-29 03:11:08,193 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:11:09,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 03:11:09,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 03:11:10,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:11:13,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 03:11:15,125 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 03:11:18,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:11:18,471 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 03:11:19,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:11:21,236 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 03:11:26,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:11:26,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:11:26,512 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:11:26,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:11:31,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 03:11:31,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 03:11:35,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 03:11:38,540 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 03:11:39,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:11:41,319 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:11:41,379 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 03:11:41,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:11:41,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:11:41,699 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=230346.66666666666, ans=0.2 2023-09-29 03:11:44,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:11:46,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:11:48,233 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:11:48,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 03:11:49,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:11:51,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:11:51,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:11:52,731 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.662e+02 2.070e+02 2.281e+02 2.771e+02 4.083e+02, threshold=4.562e+02, percent-clipped=0.0 2023-09-29 03:11:52,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:11:54,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:11:54,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 03:11:54,777 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=230413.33333333334, ans=0.125 2023-09-29 03:11:58,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:12:00,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:12:00,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:12:00,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 03:12:05,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:12:07,743 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:12:07,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:12:07,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:12:08,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 03:12:10,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:12:10,444 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=230480.0, ans=0.125 2023-09-29 03:12:11,471 INFO [train.py:1039] (2/4) Epoch 7, batch 2700, loss[loss=0.2146, simple_loss=0.2943, pruned_loss=0.06744, over 24100.00 frames. ], tot_loss[loss=0.2294, simple_loss=0.2927, pruned_loss=0.08299, over 4711524.95 frames. ], batch size: 80, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:12:13,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:12:13,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 03:12:16,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:12:16,265 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=230480.0, ans=0.0 2023-09-29 03:12:18,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 03:12:21,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:12:21,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:12:21,308 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:12:22,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:12:22,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:12:22,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:12:22,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 03:12:24,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 03:12:24,418 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:12:27,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 03:12:28,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 03:12:29,003 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:12:32,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:12:32,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 03:12:33,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:12:42,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:12:42,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:12:47,941 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:12:47,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:12:48,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:12:48,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:12:51,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:12:54,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:12:54,897 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 03:12:54,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:12:55,241 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=230613.33333333334, ans=0.1 2023-09-29 03:12:57,011 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=230613.33333333334, ans=0.07 2023-09-29 03:12:58,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:12:58,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 03:13:08,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:13:08,925 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:13:13,919 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 03:13:13,923 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:13:16,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:13:17,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:13:19,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:13:20,882 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:13:22,396 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:13:22,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:13:25,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:13:26,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:13:26,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:13:30,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 03:13:30,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:13:32,637 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:13:32,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 03:13:34,010 INFO [train.py:1039] (2/4) Epoch 7, batch 2750, loss[loss=0.2152, simple_loss=0.2944, pruned_loss=0.06799, over 24638.00 frames. ], tot_loss[loss=0.2281, simple_loss=0.2922, pruned_loss=0.08199, over 4715895.22 frames. ], batch size: 65, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:13:34,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 03:13:34,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:13:38,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:13:38,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:13:41,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:13:41,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 03:13:42,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:13:45,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:13:47,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 03:13:47,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:13:47,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:13:47,766 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 03:13:47,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:13:47,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:13:51,475 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=230880.0, ans=0.125 2023-09-29 03:13:54,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 03:13:57,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:13:57,116 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:13:57,208 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:13:57,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 03:13:58,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:14:00,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:14:01,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:14:01,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:14:01,983 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=230880.0, ans=0.1 2023-09-29 03:14:08,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 03:14:08,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 03:14:08,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:14:09,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:14:09,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 03:14:16,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:14:18,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 03:14:18,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:14:23,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:14:23,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 03:14:25,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 03:14:31,514 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:14:31,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:14:31,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 03:14:36,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:14:37,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 03:14:38,314 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 2.012e+02 2.382e+02 2.660e+02 4.649e+02, threshold=4.763e+02, percent-clipped=1.0 2023-09-29 03:14:41,666 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 03:14:44,770 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:14:44,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 03:14:46,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:14:50,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:14:50,384 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 03:14:50,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:14:54,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 03:14:55,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:14:55,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:14:55,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 03:14:55,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:14:57,120 INFO [train.py:1039] (2/4) Epoch 7, batch 2800, loss[loss=0.239, simple_loss=0.3101, pruned_loss=0.08397, over 23630.00 frames. ], tot_loss[loss=0.2266, simple_loss=0.2912, pruned_loss=0.08099, over 4729619.77 frames. ], batch size: 85, lr: 1.47e-02, grad_scale: 32.0 2023-09-29 03:14:57,246 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:15:00,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:15:00,167 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 03:15:00,168 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 03:15:03,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:15:04,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 03:15:06,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:15:09,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:15:12,715 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 03:15:15,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 03:15:17,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 03:15:17,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:15:19,031 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:15:19,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:15:23,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:15:23,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:15:23,631 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 03:15:25,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:15:34,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:15:37,914 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:15:40,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:15:40,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:15:42,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:15:44,778 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.29 vs. limit=15.0 2023-09-29 03:15:48,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:15:48,581 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 03:15:48,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:15:50,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:15:50,625 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:15:53,741 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:15:55,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:15:59,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:16:01,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:16:04,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:16:04,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 03:16:04,083 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 03:16:04,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 03:16:05,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:16:05,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 03:16:05,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:16:08,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:16:08,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:16:08,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 03:16:10,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:16:10,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:16:10,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:16:10,500 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 03:16:11,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 03:16:17,961 INFO [train.py:1039] (2/4) Epoch 7, batch 2850, loss[loss=0.2475, simple_loss=0.3055, pruned_loss=0.09472, over 23830.00 frames. ], tot_loss[loss=0.2259, simple_loss=0.2911, pruned_loss=0.08036, over 4742026.62 frames. ], batch size: 179, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:16:18,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:16:18,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 03:16:19,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:16:21,250 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:16:24,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:16:25,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:16:25,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:16:26,898 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=231480.0, ans=0.125 2023-09-29 03:16:29,597 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:16:29,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:16:31,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 03:16:31,376 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 03:16:38,688 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 03:16:38,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:16:40,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 03:16:41,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:16:46,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 03:16:46,251 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 03:16:46,535 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=231546.66666666666, ans=0.125 2023-09-29 03:16:47,899 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:16:51,231 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=231613.33333333334, ans=0.0 2023-09-29 03:17:00,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:17:00,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:17:00,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:17:02,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 03:17:02,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 03:17:02,271 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:17:04,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:17:04,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 03:17:06,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 03:17:07,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:17:07,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:17:09,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:17:12,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:17:13,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:17:13,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:17:16,624 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:17:16,826 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=231680.0, ans=0.125 2023-09-29 03:17:17,057 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.56 vs. limit=15.0 2023-09-29 03:17:19,434 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:17:19,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:17:21,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:17:24,052 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.676e+02 2.041e+02 2.225e+02 2.602e+02 4.724e+02, threshold=4.450e+02, percent-clipped=0.0 2023-09-29 03:17:24,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:17:28,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:17:30,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 03:17:30,307 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 03:17:31,858 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 03:17:33,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:17:33,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 03:17:35,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:17:35,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:17:35,430 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:17:35,473 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:17:35,474 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 03:17:36,064 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.90 vs. limit=15.0 2023-09-29 03:17:37,476 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 03:17:37,481 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:17:37,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:17:40,488 INFO [train.py:1039] (2/4) Epoch 7, batch 2900, loss[loss=0.2397, simple_loss=0.2982, pruned_loss=0.09057, over 23700.00 frames. ], tot_loss[loss=0.2247, simple_loss=0.2902, pruned_loss=0.07958, over 4735065.82 frames. ], batch size: 232, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:17:42,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 03:17:42,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:17:42,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:17:44,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 03:17:49,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:17:49,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 03:17:49,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 03:17:51,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:17:51,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 03:17:54,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:17:55,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:18:00,196 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:18:00,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:18:02,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 03:18:03,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 03:18:03,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 03:18:05,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:18:07,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 03:18:09,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 03:18:11,209 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:18:11,213 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 03:18:12,557 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:18:14,242 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:18:14,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 03:18:17,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:18:17,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:18:24,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:18:26,086 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:18:27,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 03:18:27,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 03:18:27,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:18:32,031 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.34 vs. limit=15.0 2023-09-29 03:18:32,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:18:34,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 03:18:35,822 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:18:42,516 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:18:51,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:18:51,384 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 03:18:52,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 03:18:56,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:18:56,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 03:18:56,628 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:18:58,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 03:19:02,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:19:04,277 INFO [train.py:1039] (2/4) Epoch 7, batch 2950, loss[loss=0.2267, simple_loss=0.2837, pruned_loss=0.08486, over 23675.00 frames. ], tot_loss[loss=0.2263, simple_loss=0.2915, pruned_loss=0.08056, over 4727474.31 frames. ], batch size: 149, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:19:04,458 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 03:19:05,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:19:06,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:19:06,991 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.11 vs. limit=15.0 2023-09-29 03:19:07,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:19:07,794 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=232146.66666666666, ans=0.125 2023-09-29 03:19:09,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:19:10,505 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 03:19:11,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 03:19:12,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 03:19:12,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:19:19,432 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:19:22,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:19:23,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:19:25,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:19:29,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:19:29,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:19:30,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:19:32,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:19:32,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:19:35,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 03:19:39,323 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 03:19:39,528 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=232280.0, ans=0.125 2023-09-29 03:19:40,692 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 03:19:40,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 03:19:42,344 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 03:19:44,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 03:19:44,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:19:45,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:19:45,769 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 03:19:45,776 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 03:19:46,188 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=232280.0, ans=0.125 2023-09-29 03:19:50,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 03:19:50,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:19:51,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:19:54,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:19:56,112 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.99 vs. limit=15.0 2023-09-29 03:19:57,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:19:58,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:19:58,554 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 03:19:58,604 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:19:58,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 03:20:06,946 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:20:08,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:20:09,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 03:20:09,966 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:20:11,822 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.694e+02 2.080e+02 2.429e+02 2.872e+02 4.397e+02, threshold=4.858e+02, percent-clipped=0.0 2023-09-29 03:20:12,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 03:20:15,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:20:16,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:20:16,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:20:18,239 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:20:19,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 03:20:21,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:20:21,483 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=232413.33333333334, ans=0.125 2023-09-29 03:20:22,571 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.62 vs. limit=22.5 2023-09-29 03:20:23,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:20:23,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 03:20:23,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:20:23,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:20:23,659 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=232413.33333333334, ans=0.0 2023-09-29 03:20:24,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:20:26,222 INFO [train.py:1039] (2/4) Epoch 7, batch 3000, loss[loss=0.2423, simple_loss=0.2949, pruned_loss=0.09487, over 23823.00 frames. ], tot_loss[loss=0.2278, simple_loss=0.2923, pruned_loss=0.0816, over 4730169.83 frames. ], batch size: 195, lr: 1.46e-02, grad_scale: 8.0 2023-09-29 03:20:26,222 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-29 03:20:40,693 INFO [train.py:1071] (2/4) Epoch 7, validation: loss=0.3621, simple_loss=0.3045, pruned_loss=0.2099, over 1125622.00 frames. 2023-09-29 03:20:40,694 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21046MB 2023-09-29 03:20:40,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:20:40,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 03:20:42,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:20:44,272 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=232480.0, ans=0.025 2023-09-29 03:20:45,974 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:20:47,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 03:20:50,579 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 03:20:50,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 03:20:52,237 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:20:54,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:20:54,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 03:20:54,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:21:01,739 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 03:21:02,100 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=232546.66666666666, ans=0.2 2023-09-29 03:21:11,641 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:21:17,278 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.33 vs. limit=15.0 2023-09-29 03:21:18,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 03:21:18,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:21:21,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:21:23,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:21:23,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:21:24,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:21:24,879 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 03:21:25,200 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=232613.33333333334, ans=0.125 2023-09-29 03:21:28,349 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 03:21:30,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:21:30,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 03:21:31,785 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 03:21:31,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:21:33,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:21:33,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:21:36,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 03:21:38,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:21:38,373 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:21:39,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:21:42,376 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.86 vs. limit=10.0 2023-09-29 03:21:42,999 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 03:21:43,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:21:43,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:21:43,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:21:46,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:21:46,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:21:48,447 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 03:21:48,493 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 03:21:48,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:21:49,993 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 03:21:50,087 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:21:51,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 03:21:55,119 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:21:57,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 03:21:57,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 03:21:57,428 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 03:21:57,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 03:21:58,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:22:00,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:22:01,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 03:22:01,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:22:01,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:22:03,347 INFO [train.py:1039] (2/4) Epoch 7, batch 3050, loss[loss=0.2404, simple_loss=0.3026, pruned_loss=0.08907, over 23168.00 frames. ], tot_loss[loss=0.2288, simple_loss=0.2932, pruned_loss=0.0822, over 4724616.18 frames. ], batch size: 105, lr: 1.46e-02, grad_scale: 8.0 2023-09-29 03:22:05,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 03:22:08,494 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:22:11,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:22:11,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:22:14,511 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:22:17,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 03:22:21,683 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=232880.0, ans=0.125 2023-09-29 03:22:24,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 03:22:25,914 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 03:22:25,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:22:31,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:22:35,056 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:22:35,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:22:35,357 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=232946.66666666666, ans=0.2 2023-09-29 03:22:36,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:22:39,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:22:39,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:22:39,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:22:41,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:22:41,190 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:22:42,782 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:22:44,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:22:46,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:22:46,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 03:22:48,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:22:48,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:22:51,398 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=233013.33333333334, ans=0.125 2023-09-29 03:22:52,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:22:53,907 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 03:22:54,001 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:22:54,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:22:54,412 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 03:23:00,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:23:00,994 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:23:03,395 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=233013.33333333334, ans=0.0 2023-09-29 03:23:09,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:23:09,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:23:09,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:23:11,240 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 2.044e+02 2.286e+02 2.664e+02 4.744e+02, threshold=4.572e+02, percent-clipped=0.0 2023-09-29 03:23:11,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:23:11,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 03:23:12,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:23:14,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 03:23:14,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:23:15,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:23:17,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 03:23:17,670 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=233080.0, ans=0.0 2023-09-29 03:23:20,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:23:25,320 INFO [train.py:1039] (2/4) Epoch 7, batch 3100, loss[loss=0.2312, simple_loss=0.2777, pruned_loss=0.09235, over 23475.00 frames. ], tot_loss[loss=0.2296, simple_loss=0.2929, pruned_loss=0.08309, over 4714549.55 frames. ], batch size: 285, lr: 1.46e-02, grad_scale: 8.0 2023-09-29 03:23:26,988 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:23:27,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:23:30,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 03:23:31,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 03:23:35,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 03:23:37,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 03:23:38,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:23:43,258 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:23:43,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:23:45,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 03:23:48,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:23:53,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 03:23:58,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 03:23:59,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:23:59,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:23:59,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:24:01,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 03:24:04,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:24:04,080 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 03:24:04,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:24:05,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:24:07,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 03:24:09,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:24:13,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 03:24:14,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 03:24:16,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 03:24:17,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:24:18,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:24:18,325 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=233346.66666666666, ans=0.125 2023-09-29 03:24:21,584 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:24:21,601 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:24:21,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:24:23,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 03:24:23,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:24:24,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:24:26,493 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:24:26,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:24:26,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 03:24:29,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:24:29,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 03:24:32,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:24:33,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 03:24:33,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:24:33,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:24:35,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 03:24:37,049 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=233413.33333333334, ans=0.125 2023-09-29 03:24:47,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 03:24:48,343 INFO [train.py:1039] (2/4) Epoch 7, batch 3150, loss[loss=0.2131, simple_loss=0.2862, pruned_loss=0.07002, over 24644.00 frames. ], tot_loss[loss=0.2275, simple_loss=0.2905, pruned_loss=0.0823, over 4691192.15 frames. ], batch size: 68, lr: 1.46e-02, grad_scale: 8.0 2023-09-29 03:24:50,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:24:50,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:24:51,924 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:24:51,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:24:53,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 03:24:55,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:24:55,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 03:24:57,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 03:24:58,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:25:00,775 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 03:25:01,702 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.00 vs. limit=15.0 2023-09-29 03:25:03,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 03:25:04,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:25:05,472 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 03:25:06,121 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=11.36 vs. limit=12.0 2023-09-29 03:25:06,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 03:25:08,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 03:25:08,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 03:25:08,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 03:25:08,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:25:08,598 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:25:08,753 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:25:10,323 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 03:25:13,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:25:14,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:25:15,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:25:17,576 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 03:25:22,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 03:25:23,532 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:25:23,783 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=233613.33333333334, ans=0.125 2023-09-29 03:25:26,356 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 03:25:27,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:25:27,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 03:25:31,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 03:25:32,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:25:32,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 03:25:32,981 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 03:25:33,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:25:33,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:25:35,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 03:25:35,556 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=233613.33333333334, ans=0.125 2023-09-29 03:25:36,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 03:25:38,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 03:25:39,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 03:25:39,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:25:41,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:25:41,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:25:42,782 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 03:25:42,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:25:44,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 03:25:44,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:25:44,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 03:25:46,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 03:25:46,590 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=233680.0, ans=0.09899494936611666 2023-09-29 03:25:48,390 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:25:50,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:25:51,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 03:25:53,454 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 03:25:53,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:25:56,377 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.504e+02 2.159e+02 2.421e+02 2.808e+02 3.931e+02, threshold=4.841e+02, percent-clipped=0.0 2023-09-29 03:25:56,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:25:58,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:25:58,151 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:26:04,133 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 03:26:04,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:26:07,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 03:26:11,313 INFO [train.py:1039] (2/4) Epoch 7, batch 3200, loss[loss=0.2629, simple_loss=0.3278, pruned_loss=0.09903, over 23740.00 frames. ], tot_loss[loss=0.2268, simple_loss=0.2897, pruned_loss=0.08194, over 4683274.38 frames. ], batch size: 85, lr: 1.46e-02, grad_scale: 16.0 2023-09-29 03:26:14,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:26:14,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 03:26:14,577 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=233813.33333333334, ans=0.125 2023-09-29 03:26:18,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:26:20,403 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:26:20,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 03:26:23,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:26:27,231 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 03:26:30,295 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:26:39,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:26:43,178 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=233946.66666666666, ans=0.125 2023-09-29 03:26:51,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 03:26:51,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:26:54,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 03:26:56,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 03:27:01,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:27:01,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 03:27:01,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:27:04,002 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.68 vs. limit=15.0 2023-09-29 03:27:04,741 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 03:27:06,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 03:27:07,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 03:27:09,795 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=234013.33333333334, ans=0.1 2023-09-29 03:27:10,924 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 03:27:12,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:27:16,622 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=234080.0, ans=0.125 2023-09-29 03:27:19,397 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:27:19,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:27:19,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:27:20,902 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 03:27:20,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 03:27:25,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:27:27,397 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 03:27:28,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 03:27:30,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 03:27:32,181 INFO [train.py:1039] (2/4) Epoch 7, batch 3250, loss[loss=0.2202, simple_loss=0.2958, pruned_loss=0.07234, over 24145.00 frames. ], tot_loss[loss=0.2258, simple_loss=0.2894, pruned_loss=0.08112, over 4700163.92 frames. ], batch size: 80, lr: 1.46e-02, grad_scale: 16.0 2023-09-29 03:27:32,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 03:27:33,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:27:37,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 03:27:37,026 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 03:27:38,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:27:38,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:27:40,000 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 03:27:43,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:27:46,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:27:53,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:27:53,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 03:27:53,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:27:54,734 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:27:54,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:27:55,109 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=234213.33333333334, ans=0.1 2023-09-29 03:27:56,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:27:56,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 03:28:00,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:28:00,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 03:28:02,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:28:02,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:28:02,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:28:02,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:28:05,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:28:07,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:28:08,210 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=234280.0, ans=0.035 2023-09-29 03:28:09,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:28:10,954 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:28:12,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:28:12,547 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:28:12,563 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:28:17,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 03:28:17,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:28:17,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:28:18,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:28:18,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 03:28:25,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:28:35,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:28:35,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:28:35,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 03:28:35,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:28:35,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 03:28:37,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:28:38,671 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 2.002e+02 2.212e+02 2.720e+02 4.684e+02, threshold=4.424e+02, percent-clipped=0.0 2023-09-29 03:28:38,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 03:28:38,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 03:28:38,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:28:41,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:28:42,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:28:44,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 03:28:44,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:28:48,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:28:48,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:28:50,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 03:28:50,333 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:28:52,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:28:52,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 03:28:53,377 INFO [train.py:1039] (2/4) Epoch 7, batch 3300, loss[loss=0.2253, simple_loss=0.2952, pruned_loss=0.07774, over 24638.00 frames. ], tot_loss[loss=0.2267, simple_loss=0.2906, pruned_loss=0.0814, over 4709662.65 frames. ], batch size: 65, lr: 1.46e-02, grad_scale: 16.0 2023-09-29 03:28:55,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:28:55,192 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 03:28:55,379 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=234480.0, ans=0.2 2023-09-29 03:28:57,543 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=234480.0, ans=0.0 2023-09-29 03:28:58,641 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 03:29:00,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 03:29:00,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:29:03,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:29:05,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:29:06,616 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:29:06,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 03:29:08,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 03:29:10,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:29:12,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:29:16,108 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 03:29:18,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:29:19,018 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:29:20,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:29:20,503 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 03:29:23,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:29:23,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 03:29:23,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 03:29:23,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:29:23,600 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 03:29:28,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:29:28,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 03:29:30,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:29:30,265 WARNING [train.py:1197] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 03:29:30,647 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=234613.33333333334, ans=0.04949747468305833 2023-09-29 03:29:31,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 03:29:31,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:29:34,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:29:35,009 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 03:29:38,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 03:29:38,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:29:38,990 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=234613.33333333334, ans=0.125 2023-09-29 03:29:42,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 03:29:45,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:29:48,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 03:29:48,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:29:53,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:29:53,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:29:53,985 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:29:54,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 03:29:56,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:29:57,109 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=234680.0, ans=0.125 2023-09-29 03:29:58,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:29:58,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:29:59,979 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 03:30:00,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 03:30:03,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 03:30:04,698 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:30:04,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:30:06,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:30:06,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:30:07,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 03:30:07,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:30:07,828 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 03:30:09,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:30:11,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 03:30:15,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 03:30:15,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:30:16,656 INFO [train.py:1039] (2/4) Epoch 7, batch 3350, loss[loss=0.2104, simple_loss=0.2877, pruned_loss=0.06658, over 24488.00 frames. ], tot_loss[loss=0.2265, simple_loss=0.2911, pruned_loss=0.08093, over 4710641.53 frames. ], batch size: 63, lr: 1.46e-02, grad_scale: 16.0 2023-09-29 03:30:16,748 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:30:18,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:30:18,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:30:20,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:30:23,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:30:23,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:30:26,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:30:28,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:30:29,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:30:31,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:30:34,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:30:34,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:30:36,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:30:36,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 03:30:38,069 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 03:30:39,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:30:43,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 03:30:43,186 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 03:30:46,003 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 03:30:46,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:30:46,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:30:47,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 03:30:47,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:30:47,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:30:49,770 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:30:51,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:30:52,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:30:52,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:30:56,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:30:59,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:30:59,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:31:02,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:31:04,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:31:07,686 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:31:07,700 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:31:10,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:31:12,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 03:31:12,728 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 03:31:12,768 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 03:31:14,146 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:31:15,748 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 03:31:17,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:31:17,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:31:23,272 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 1.937e+02 2.206e+02 2.498e+02 3.654e+02, threshold=4.412e+02, percent-clipped=0.0 2023-09-29 03:31:25,984 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=235080.0, ans=0.125 2023-09-29 03:31:27,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:31:27,196 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 03:31:28,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 03:31:28,807 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:31:30,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:31:35,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:31:38,485 INFO [train.py:1039] (2/4) Epoch 7, batch 3400, loss[loss=0.2289, simple_loss=0.3116, pruned_loss=0.0731, over 24337.00 frames. ], tot_loss[loss=0.2279, simple_loss=0.2923, pruned_loss=0.08169, over 4711134.93 frames. ], batch size: 74, lr: 1.46e-02, grad_scale: 16.0 2023-09-29 03:31:39,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 03:31:39,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 03:31:39,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:31:40,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:31:42,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 03:31:42,386 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=235146.66666666666, ans=0.125 2023-09-29 03:31:43,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:31:43,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 03:31:45,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:31:45,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:31:45,300 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 03:31:48,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:31:48,799 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 03:31:51,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 03:31:52,017 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 03:31:52,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:31:56,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:31:56,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 03:31:56,881 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:31:58,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 03:32:03,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:32:05,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 03:32:12,019 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:32:14,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:32:15,574 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:32:17,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 03:32:22,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:32:25,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 03:32:30,189 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:32:31,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:32:31,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 03:32:31,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:32:33,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:32:33,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:32:35,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:32:38,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:32:39,192 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn2.whiten.whitening_limit, batch_count=235346.66666666666, ans=22.5 2023-09-29 03:32:43,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 03:32:43,565 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:32:51,089 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:32:52,698 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 03:32:52,941 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=235413.33333333334, ans=0.0 2023-09-29 03:32:57,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 03:33:00,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 03:33:01,929 INFO [train.py:1039] (2/4) Epoch 7, batch 3450, loss[loss=0.2258, simple_loss=0.2861, pruned_loss=0.08273, over 23531.00 frames. ], tot_loss[loss=0.2275, simple_loss=0.2922, pruned_loss=0.08143, over 4700832.42 frames. ], batch size: 120, lr: 1.46e-02, grad_scale: 16.0 2023-09-29 03:33:06,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 03:33:07,252 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=235480.0, ans=0.0 2023-09-29 03:33:08,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:33:09,983 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:33:09,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 03:33:10,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:33:13,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:33:18,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:33:18,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:33:20,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:33:20,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:33:23,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:33:30,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 03:33:35,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 03:33:35,145 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 03:33:35,209 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:33:38,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:33:42,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 03:33:43,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:33:48,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:33:50,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:33:50,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 03:33:51,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:33:54,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 03:33:54,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:33:57,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:34:00,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:34:03,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 03:34:06,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:34:09,684 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.654e+02 1.982e+02 2.278e+02 2.656e+02 4.314e+02, threshold=4.555e+02, percent-clipped=0.0 2023-09-29 03:34:11,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:34:13,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:34:16,609 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:34:21,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:34:21,269 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:34:21,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:34:23,378 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:34:24,905 INFO [train.py:1039] (2/4) Epoch 7, batch 3500, loss[loss=0.2297, simple_loss=0.3018, pruned_loss=0.07881, over 24304.00 frames. ], tot_loss[loss=0.2258, simple_loss=0.2901, pruned_loss=0.08075, over 4715143.02 frames. ], batch size: 74, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:34:26,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:34:30,376 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:34:30,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 03:34:32,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 03:34:35,734 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 03:34:37,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:34:37,396 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 03:34:42,187 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:34:44,387 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:34:44,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:34:44,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:34:45,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 03:34:47,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:34:47,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:34:47,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 03:34:49,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:34:50,824 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 03:34:52,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:34:52,732 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=235880.0, ans=0.1 2023-09-29 03:34:56,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:34:57,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 03:34:57,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:35:01,330 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:35:04,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:35:04,995 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:35:07,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:35:08,202 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=235946.66666666666, ans=0.125 2023-09-29 03:35:09,416 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:35:12,677 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 03:35:12,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 03:35:14,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 03:35:14,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:35:15,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:35:16,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:35:17,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:35:21,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 03:35:21,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:35:21,454 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=236013.33333333334, ans=0.07 2023-09-29 03:35:27,183 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:35:28,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 03:35:28,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 03:35:28,759 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:35:32,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:35:32,504 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:35:32,765 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=236080.0, ans=0.0 2023-09-29 03:35:35,518 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:35:39,265 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 03:35:39,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:35:42,615 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:35:42,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 03:35:44,330 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 03:35:47,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:35:47,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:35:47,493 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:35:47,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:35:49,006 INFO [train.py:1039] (2/4) Epoch 7, batch 3550, loss[loss=0.2465, simple_loss=0.2955, pruned_loss=0.09873, over 23572.00 frames. ], tot_loss[loss=0.2235, simple_loss=0.2879, pruned_loss=0.07953, over 4710203.77 frames. ], batch size: 256, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:35:50,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:36:00,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:36:00,641 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=236146.66666666666, ans=0.125 2023-09-29 03:36:03,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 03:36:06,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:36:08,481 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:36:08,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:36:10,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:36:10,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 03:36:14,372 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:36:15,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:36:15,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:36:15,861 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 03:36:15,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 03:36:18,213 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.31 vs. limit=10.0 2023-09-29 03:36:20,761 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=236280.0, ans=0.035 2023-09-29 03:36:22,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:36:22,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:36:23,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:36:23,783 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:36:23,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:36:23,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 03:36:25,245 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:36:26,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:36:28,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 03:36:34,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:36:34,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:36:36,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:36:38,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 03:36:40,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:36:40,814 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten.whitening_limit, batch_count=236346.66666666666, ans=22.5 2023-09-29 03:36:41,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 03:36:41,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:36:43,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:36:43,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:36:46,874 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 03:36:47,551 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.94 vs. limit=15.0 2023-09-29 03:36:50,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:36:54,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:36:56,472 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 03:36:56,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:36:57,955 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.671e+02 2.050e+02 2.243e+02 2.592e+02 3.943e+02, threshold=4.485e+02, percent-clipped=0.0 2023-09-29 03:36:59,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:37:01,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 03:37:07,087 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.00 vs. limit=15.0 2023-09-29 03:37:07,832 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 03:37:07,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:37:07,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:37:10,910 INFO [train.py:1039] (2/4) Epoch 7, batch 3600, loss[loss=0.2121, simple_loss=0.2808, pruned_loss=0.07172, over 24334.00 frames. ], tot_loss[loss=0.2228, simple_loss=0.2874, pruned_loss=0.07912, over 4704936.26 frames. ], batch size: 61, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:37:11,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:37:11,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:37:12,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:37:16,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:37:18,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:37:18,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:37:20,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:37:21,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:37:21,964 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 03:37:23,966 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=236480.0, ans=0.125 2023-09-29 03:37:25,136 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 03:37:26,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:37:27,371 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.96 vs. limit=15.0 2023-09-29 03:37:29,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:37:32,985 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:37:33,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:37:34,620 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:37:34,649 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 03:37:34,760 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:37:38,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:37:38,362 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:37:41,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:37:41,781 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=236546.66666666666, ans=0.125 2023-09-29 03:37:42,908 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:37:43,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:37:45,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 03:37:46,902 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=236613.33333333334, ans=0.125 2023-09-29 03:37:51,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:37:54,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 03:37:54,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 03:37:57,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:38:00,950 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=236680.0, ans=0.0 2023-09-29 03:38:02,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:38:05,350 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:38:11,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 03:38:11,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:38:12,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 03:38:12,285 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer_ff2.min_abs, batch_count=236680.0, ans=0.1 2023-09-29 03:38:13,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 03:38:13,673 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 03:38:14,398 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.61 vs. limit=22.5 2023-09-29 03:38:15,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:38:17,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:38:18,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 03:38:20,745 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:38:20,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:38:20,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:38:22,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 03:38:23,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 03:38:27,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:38:28,898 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 03:38:33,799 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=236813.33333333334, ans=0.125 2023-09-29 03:38:34,832 INFO [train.py:1039] (2/4) Epoch 7, batch 3650, loss[loss=0.2336, simple_loss=0.3011, pruned_loss=0.08303, over 23359.00 frames. ], tot_loss[loss=0.2247, simple_loss=0.2887, pruned_loss=0.08038, over 4694533.72 frames. ], batch size: 93, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:38:34,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 03:38:36,509 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:38:39,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 03:38:42,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 03:38:46,373 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:38:46,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 03:38:47,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 03:38:51,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 03:38:51,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:38:51,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 03:38:51,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 03:38:53,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:38:53,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 03:38:55,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 03:38:55,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:38:57,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:38:58,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:39:00,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 03:39:02,631 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 03:39:04,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:39:05,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 03:39:05,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:39:05,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:39:13,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:39:16,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:39:16,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:39:17,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:39:19,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:39:21,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:39:24,728 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:39:26,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:39:26,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:39:29,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 03:39:29,152 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:39:29,418 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=237013.33333333334, ans=0.0 2023-09-29 03:39:30,421 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:39:36,514 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 03:39:41,626 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:39:41,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:39:43,128 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 03:39:43,206 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:39:44,550 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 2.001e+02 2.357e+02 2.750e+02 4.366e+02, threshold=4.713e+02, percent-clipped=0.0 2023-09-29 03:39:44,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:39:46,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:39:47,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 03:39:47,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:39:51,030 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 03:39:52,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:39:52,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:39:56,980 INFO [train.py:1039] (2/4) Epoch 7, batch 3700, loss[loss=0.2288, simple_loss=0.3033, pruned_loss=0.07717, over 24411.00 frames. ], tot_loss[loss=0.2272, simple_loss=0.2908, pruned_loss=0.08175, over 4691603.48 frames. ], batch size: 77, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:39:57,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:39:57,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 03:39:57,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:39:57,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 03:39:58,632 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 03:40:02,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 03:40:04,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:40:06,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:40:07,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:40:07,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:40:09,167 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 03:40:10,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:40:12,742 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 03:40:19,053 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=237213.33333333334, ans=0.0 2023-09-29 03:40:19,128 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=237213.33333333334, ans=0.0 2023-09-29 03:40:20,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:40:21,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 03:40:23,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 03:40:23,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 03:40:23,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:40:29,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:40:30,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 03:40:30,672 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:40:32,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:40:35,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:40:35,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 03:40:37,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 03:40:37,957 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=15.58 vs. limit=15.0 2023-09-29 03:40:42,232 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:40:42,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 03:40:42,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:40:43,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 03:40:44,016 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=237346.66666666666, ans=0.2 2023-09-29 03:40:50,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:40:50,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:40:50,580 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=237346.66666666666, ans=0.125 2023-09-29 03:40:53,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:40:55,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 03:40:58,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:40:58,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 03:40:58,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:40:58,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:40:59,982 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=237413.33333333334, ans=0.1 2023-09-29 03:41:01,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:41:02,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 03:41:02,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 03:41:04,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:41:04,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:41:07,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:41:07,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:41:11,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:41:11,474 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=237413.33333333334, ans=0.0 2023-09-29 03:41:14,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:41:15,800 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.64 vs. limit=22.5 2023-09-29 03:41:17,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:41:18,614 INFO [train.py:1039] (2/4) Epoch 7, batch 3750, loss[loss=0.2407, simple_loss=0.2882, pruned_loss=0.09656, over 22710.00 frames. ], tot_loss[loss=0.2273, simple_loss=0.2913, pruned_loss=0.08168, over 4701485.76 frames. ], batch size: 322, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:41:18,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 03:41:20,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 03:41:23,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 03:41:23,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 03:41:25,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:41:27,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:41:27,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:41:30,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:41:33,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:41:37,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:41:39,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 03:41:40,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:41:42,742 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=237546.66666666666, ans=0.0 2023-09-29 03:41:44,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:41:45,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 03:41:47,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:41:48,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:41:49,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:41:54,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 03:41:55,079 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=237613.33333333334, ans=0.125 2023-09-29 03:41:57,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 03:41:59,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:41:59,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:42:01,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:42:05,138 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=237613.33333333334, ans=0.125 2023-09-29 03:42:06,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:42:07,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 03:42:10,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 03:42:11,188 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=237680.0, ans=0.0 2023-09-29 03:42:13,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:42:17,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:42:18,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:42:20,852 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 03:42:27,706 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 2.086e+02 2.277e+02 2.555e+02 3.671e+02, threshold=4.554e+02, percent-clipped=0.0 2023-09-29 03:42:27,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 03:42:29,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 03:42:32,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 03:42:32,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:42:35,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 03:42:40,456 INFO [train.py:1039] (2/4) Epoch 7, batch 3800, loss[loss=0.2203, simple_loss=0.2715, pruned_loss=0.08454, over 23783.00 frames. ], tot_loss[loss=0.2264, simple_loss=0.2907, pruned_loss=0.08103, over 4695554.84 frames. ], batch size: 212, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:42:45,318 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:42:49,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:42:49,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 03:42:51,407 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 03:42:53,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:42:55,198 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:42:55,312 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 03:42:56,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 03:42:56,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:42:58,444 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:43:01,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:43:01,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:43:01,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:43:03,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 03:43:06,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 03:43:08,086 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:43:09,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:43:13,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:43:14,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 03:43:16,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 03:43:17,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:43:19,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:43:20,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:43:26,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 03:43:26,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 03:43:27,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:43:32,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:43:41,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:43:44,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 03:43:46,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 03:43:48,246 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:43:49,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:43:51,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:43:52,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 03:43:56,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 03:43:56,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 03:43:56,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:43:57,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:44:03,072 INFO [train.py:1039] (2/4) Epoch 7, batch 3850, loss[loss=0.2106, simple_loss=0.2603, pruned_loss=0.08049, over 23592.00 frames. ], tot_loss[loss=0.2255, simple_loss=0.2901, pruned_loss=0.08048, over 4698952.87 frames. ], batch size: 256, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:44:03,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:44:03,578 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=238146.66666666666, ans=0.125 2023-09-29 03:44:04,671 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 03:44:11,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:44:11,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 03:44:12,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 03:44:15,161 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:44:18,229 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 03:44:20,510 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:44:23,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 03:44:24,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 03:44:29,739 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:44:31,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:44:34,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:44:34,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 03:44:36,706 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=238280.0, ans=0.0 2023-09-29 03:44:38,283 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=238280.0, ans=0.0 2023-09-29 03:44:39,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:44:39,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:44:40,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:44:40,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:44:41,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:44:44,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:44:44,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:44:45,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:44:46,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 03:44:46,470 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 03:44:48,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:44:48,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:44:52,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:44:52,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:44:52,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 03:44:56,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 03:44:58,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:44:59,896 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 03:45:01,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 03:45:01,814 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=238346.66666666666, ans=0.125 2023-09-29 03:45:01,969 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=238346.66666666666, ans=0.125 2023-09-29 03:45:06,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:45:08,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:45:13,053 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.749e+02 2.089e+02 2.371e+02 2.859e+02 5.421e+02, threshold=4.742e+02, percent-clipped=3.0 2023-09-29 03:45:13,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:45:13,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 03:45:15,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 03:45:18,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:45:19,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:45:19,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 03:45:19,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:45:21,658 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:45:22,287 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.91 vs. limit=15.0 2023-09-29 03:45:23,153 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:45:23,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:45:23,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 03:45:24,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:45:26,776 INFO [train.py:1039] (2/4) Epoch 7, batch 3900, loss[loss=0.215, simple_loss=0.2934, pruned_loss=0.06833, over 24576.00 frames. ], tot_loss[loss=0.225, simple_loss=0.2899, pruned_loss=0.08002, over 4718853.60 frames. ], batch size: 71, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:45:28,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 03:45:28,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:45:28,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:45:29,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:45:29,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:45:30,199 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=238480.0, ans=0.0 2023-09-29 03:45:31,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:45:32,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:45:32,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:45:33,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:45:33,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 03:45:34,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:45:39,096 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:45:39,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 03:45:41,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:45:41,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:45:43,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 03:45:44,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:45:46,180 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 03:45:47,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 03:45:47,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:45:50,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 03:45:50,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:45:50,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 03:45:52,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 03:45:54,699 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.40 vs. limit=6.0 2023-09-29 03:45:57,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:45:59,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:45:59,917 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 03:46:01,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:46:01,709 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=238613.33333333334, ans=0.04949747468305833 2023-09-29 03:46:03,139 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=238613.33333333334, ans=0.125 2023-09-29 03:46:05,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:46:07,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:46:07,650 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=238613.33333333334, ans=0.0 2023-09-29 03:46:12,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:46:12,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:46:12,281 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=238613.33333333334, ans=0.125 2023-09-29 03:46:13,607 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:46:19,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:46:20,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:46:25,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 03:46:26,851 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:46:32,393 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=238746.66666666666, ans=0.0 2023-09-29 03:46:37,938 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:46:39,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:46:41,147 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 03:46:41,210 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 03:46:41,242 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:46:41,955 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=8.82 vs. limit=15.0 2023-09-29 03:46:42,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 03:46:44,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:46:44,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 03:46:48,675 INFO [train.py:1039] (2/4) Epoch 7, batch 3950, loss[loss=0.218, simple_loss=0.2702, pruned_loss=0.08291, over 23337.00 frames. ], tot_loss[loss=0.2237, simple_loss=0.2888, pruned_loss=0.07935, over 4719054.68 frames. ], batch size: 105, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:46:52,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:46:54,224 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 03:46:55,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:46:57,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:47:00,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:47:00,718 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=238813.33333333334, ans=0.0 2023-09-29 03:47:03,999 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 03:47:05,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 03:47:06,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 03:47:07,454 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 03:47:07,498 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:47:11,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:47:11,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 03:47:11,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:47:14,109 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 03:47:17,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:47:17,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 03:47:17,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 03:47:18,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 03:47:18,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:47:30,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:47:31,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:47:31,995 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=238946.66666666666, ans=0.2 2023-09-29 03:47:33,318 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=238946.66666666666, ans=0.0 2023-09-29 03:47:35,653 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=238946.66666666666, ans=0.0 2023-09-29 03:47:35,697 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=238946.66666666666, ans=0.1 2023-09-29 03:47:36,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 03:47:43,867 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=239013.33333333334, ans=0.2 2023-09-29 03:47:45,000 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 03:47:45,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 03:47:45,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:47:45,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:47:50,392 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=239013.33333333334, ans=0.2 2023-09-29 03:47:51,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:47:52,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 03:47:52,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:47:53,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:47:53,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 03:47:55,145 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=239080.0, ans=0.0 2023-09-29 03:47:57,667 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 2.025e+02 2.218e+02 2.611e+02 3.934e+02, threshold=4.435e+02, percent-clipped=0.0 2023-09-29 03:47:57,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:47:58,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:48:01,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 03:48:03,458 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=239080.0, ans=0.1 2023-09-29 03:48:11,823 INFO [train.py:1039] (2/4) Epoch 7, batch 4000, loss[loss=0.3248, simple_loss=0.3509, pruned_loss=0.1494, over 19417.00 frames. ], tot_loss[loss=0.2259, simple_loss=0.2905, pruned_loss=0.08069, over 4700079.58 frames. ], batch size: 388, lr: 1.45e-02, grad_scale: 32.0 2023-09-29 03:48:12,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:48:14,939 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=19.56 vs. limit=22.5 2023-09-29 03:48:18,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:48:20,911 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.55 vs. limit=15.0 2023-09-29 03:48:24,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:48:24,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:48:26,183 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:48:26,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 03:48:27,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 03:48:27,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 03:48:27,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:48:27,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 03:48:29,694 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=239213.33333333334, ans=0.2 2023-09-29 03:48:30,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:48:33,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:48:34,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:48:34,029 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:48:36,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:48:36,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 03:48:39,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:48:41,264 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 03:48:41,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:48:43,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:48:46,673 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 03:48:46,860 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=239280.0, ans=0.2 2023-09-29 03:48:47,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 03:48:49,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:48:57,489 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 03:48:57,580 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:49:00,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:49:02,044 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 03:49:03,564 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:49:03,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 03:49:03,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:49:05,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:49:06,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:49:08,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:49:09,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 03:49:09,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:49:11,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 03:49:11,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:49:13,300 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 03:49:18,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 03:49:18,996 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=239413.33333333334, ans=0.125 2023-09-29 03:49:20,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 03:49:21,761 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=6.01 vs. limit=15.0 2023-09-29 03:49:22,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:49:22,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:49:23,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:49:25,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:49:29,168 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:49:30,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 03:49:30,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 03:49:33,608 INFO [train.py:1039] (2/4) Epoch 7, batch 4050, loss[loss=0.315, simple_loss=0.3496, pruned_loss=0.1402, over 19089.00 frames. ], tot_loss[loss=0.2265, simple_loss=0.2912, pruned_loss=0.08093, over 4692812.15 frames. ], batch size: 388, lr: 1.44e-02, grad_scale: 32.0 2023-09-29 03:49:33,699 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 03:49:33,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:49:33,890 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:49:36,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:49:38,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:49:42,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:49:43,678 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.75 vs. limit=22.5 2023-09-29 03:49:46,671 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:49:47,413 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 03:49:50,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:49:51,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:49:56,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:49:58,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:50:00,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 03:50:03,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 03:50:03,791 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 03:50:05,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:50:07,689 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.70 vs. limit=10.0 2023-09-29 03:50:12,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 03:50:13,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:50:15,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:50:18,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:50:19,660 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:50:19,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:50:23,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:50:26,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 03:50:26,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 03:50:28,642 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:50:30,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 03:50:34,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:50:41,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 03:50:43,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:50:43,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 03:50:43,809 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=239746.66666666666, ans=0.0 2023-09-29 03:50:44,866 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.660e+02 1.977e+02 2.189e+02 2.469e+02 3.390e+02, threshold=4.378e+02, percent-clipped=0.0 2023-09-29 03:50:47,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 03:50:47,886 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 03:50:47,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:50:51,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:50:52,234 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.38 vs. limit=5.0 2023-09-29 03:50:52,621 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:50:52,658 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:50:56,013 INFO [train.py:1039] (2/4) Epoch 7, batch 4100, loss[loss=0.2259, simple_loss=0.282, pruned_loss=0.08493, over 23392.00 frames. ], tot_loss[loss=0.2285, simple_loss=0.2924, pruned_loss=0.08232, over 4693688.99 frames. ], batch size: 134, lr: 1.44e-02, grad_scale: 16.0 2023-09-29 03:51:01,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 03:51:02,761 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 03:51:03,326 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten.whitening_limit, batch_count=239813.33333333334, ans=15.0 2023-09-29 03:51:04,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 03:51:07,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 03:51:07,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:51:09,028 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:51:09,068 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:51:09,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:51:10,590 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 03:51:14,208 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:51:15,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:51:15,794 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:51:17,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 03:51:19,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 03:51:20,486 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:51:21,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:51:21,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 03:51:23,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:51:23,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:51:23,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:51:23,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:51:23,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 03:51:26,581 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:51:28,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 03:51:30,105 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:51:31,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:51:31,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 03:51:33,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:51:33,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:51:33,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:51:37,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 03:51:38,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 03:51:40,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:51:41,080 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.12 vs. limit=15.0 2023-09-29 03:51:47,998 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 03:51:48,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:51:50,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:51:53,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:51:56,812 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:52:01,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:52:02,551 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:52:09,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:52:09,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:52:13,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:52:13,498 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=240080.0, ans=0.95 2023-09-29 03:52:16,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:52:17,825 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=240080.0, ans=0.2 2023-09-29 03:52:21,191 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:52:21,372 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:52:22,673 INFO [train.py:1039] (2/4) Epoch 7, batch 4150, loss[loss=0.2383, simple_loss=0.3114, pruned_loss=0.08257, over 24426.00 frames. ], tot_loss[loss=0.2292, simple_loss=0.2929, pruned_loss=0.08271, over 4704348.48 frames. ], batch size: 77, lr: 1.44e-02, grad_scale: 16.0 2023-09-29 03:52:22,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:52:22,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:52:25,293 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=240146.66666666666, ans=0.2 2023-09-29 03:52:26,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 03:52:26,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:52:27,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 03:52:29,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 03:52:29,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 03:52:32,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:52:34,969 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.30 vs. limit=15.0 2023-09-29 03:52:38,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:52:38,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:52:43,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:52:45,141 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:52:45,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 03:52:46,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 03:52:48,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:52:48,430 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 03:52:49,705 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 03:52:54,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:52:56,003 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:52:58,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 03:52:58,285 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=240280.0, ans=0.2 2023-09-29 03:53:01,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 03:53:01,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:53:03,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 03:53:03,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:53:03,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:53:06,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:53:06,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:53:10,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 03:53:13,966 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 03:53:16,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:53:18,293 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 03:53:18,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:53:19,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 03:53:21,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:53:22,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:53:24,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:53:24,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 03:53:24,646 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:53:24,649 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 03:53:24,901 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=240346.66666666666, ans=0.2 2023-09-29 03:53:26,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 03:53:29,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 03:53:29,783 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:53:29,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 03:53:29,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 03:53:31,291 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 03:53:32,547 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.682e+02 2.142e+02 2.391e+02 2.660e+02 4.088e+02, threshold=4.782e+02, percent-clipped=0.0 2023-09-29 03:53:32,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:53:32,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 03:53:32,794 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:53:34,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:53:35,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 03:53:36,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 03:53:42,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:53:44,023 INFO [train.py:1039] (2/4) Epoch 7, batch 4200, loss[loss=0.2152, simple_loss=0.2874, pruned_loss=0.07154, over 24643.00 frames. ], tot_loss[loss=0.2273, simple_loss=0.2913, pruned_loss=0.08158, over 4708373.27 frames. ], batch size: 65, lr: 1.44e-02, grad_scale: 16.0 2023-09-29 03:53:44,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 03:53:45,729 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:53:48,048 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:53:51,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 03:53:51,521 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:53:51,524 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:53:54,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 03:53:57,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 03:53:57,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:53:59,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:54:01,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:54:05,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 03:54:09,394 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:54:09,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:54:10,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 03:54:10,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:54:11,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:54:12,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:54:12,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 03:54:15,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:54:18,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 03:54:18,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:54:22,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 03:54:22,634 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=240613.33333333334, ans=0.125 2023-09-29 03:54:23,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:54:24,190 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=240613.33333333334, ans=0.125 2023-09-29 03:54:26,304 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=240613.33333333334, ans=0.0 2023-09-29 03:54:27,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:54:30,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:54:31,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:54:32,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 03:54:33,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:54:33,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:54:38,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 03:54:39,965 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:54:45,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:54:49,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 03:54:52,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:54:56,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 03:54:56,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:54:58,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 03:55:05,075 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:55:06,373 INFO [train.py:1039] (2/4) Epoch 7, batch 4250, loss[loss=0.1992, simple_loss=0.2659, pruned_loss=0.06628, over 24608.00 frames. ], tot_loss[loss=0.2262, simple_loss=0.2899, pruned_loss=0.08122, over 4712090.39 frames. ], batch size: 60, lr: 1.44e-02, grad_scale: 16.0 2023-09-29 03:55:08,377 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=240813.33333333334, ans=0.125 2023-09-29 03:55:09,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:55:09,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:55:09,841 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=240813.33333333334, ans=0.1 2023-09-29 03:55:12,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:55:15,731 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.85 vs. limit=15.0 2023-09-29 03:55:16,906 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.81 vs. limit=15.0 2023-09-29 03:55:17,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 03:55:19,551 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 03:55:19,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:55:19,946 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=240813.33333333334, ans=0.125 2023-09-29 03:55:21,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:55:24,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:55:29,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:55:29,613 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=240880.0, ans=0.125 2023-09-29 03:55:31,424 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:55:33,115 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:55:33,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:55:34,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:55:36,268 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:55:37,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:55:40,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:55:42,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:55:44,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 03:55:47,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 03:55:47,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:55:49,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:55:49,444 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:55:51,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 03:55:51,525 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:55:52,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:55:56,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 03:55:57,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:56:00,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:56:01,149 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=241013.33333333334, ans=0.125 2023-09-29 03:56:02,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:56:03,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 03:56:03,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 03:56:05,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 03:56:07,474 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:56:09,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 03:56:12,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:56:12,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:56:13,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 03:56:15,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 03:56:16,640 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.803e+02 2.211e+02 2.441e+02 2.743e+02 4.963e+02, threshold=4.882e+02, percent-clipped=1.0 2023-09-29 03:56:16,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 03:56:19,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:56:21,808 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=241080.0, ans=0.0 2023-09-29 03:56:23,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:56:25,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:56:26,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:56:28,513 INFO [train.py:1039] (2/4) Epoch 7, batch 4300, loss[loss=0.2431, simple_loss=0.314, pruned_loss=0.08611, over 24050.00 frames. ], tot_loss[loss=0.2247, simple_loss=0.2887, pruned_loss=0.0804, over 4718366.93 frames. ], batch size: 80, lr: 1.44e-02, grad_scale: 8.0 2023-09-29 03:56:28,709 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:56:30,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:56:30,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:56:30,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 03:56:31,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:56:33,592 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=241146.66666666666, ans=0.0 2023-09-29 03:56:33,635 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=241146.66666666666, ans=0.05 2023-09-29 03:56:37,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:56:38,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:56:41,802 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=241146.66666666666, ans=0.125 2023-09-29 03:56:43,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:56:44,174 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=241213.33333333334, ans=0.0 2023-09-29 03:56:50,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:56:50,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 03:56:51,502 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:56:54,608 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 03:56:54,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:56:54,674 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 03:57:01,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 03:57:01,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:57:04,957 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 03:57:04,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:57:05,020 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 03:57:05,267 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=241280.0, ans=0.125 2023-09-29 03:57:05,343 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=241280.0, ans=0.2 2023-09-29 03:57:07,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 03:57:08,070 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=241280.0, ans=0.125 2023-09-29 03:57:09,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:57:11,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:57:11,187 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:57:13,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 03:57:14,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:57:16,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:57:16,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 03:57:16,415 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 03:57:18,752 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=241346.66666666666, ans=0.125 2023-09-29 03:57:19,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:57:20,636 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.27 vs. limit=10.0 2023-09-29 03:57:23,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:57:23,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 03:57:23,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:57:24,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:57:24,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 03:57:24,489 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 03:57:25,916 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 03:57:26,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:57:27,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 03:57:27,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 03:57:27,767 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=241346.66666666666, ans=0.125 2023-09-29 03:57:29,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:57:31,012 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=241346.66666666666, ans=0.5 2023-09-29 03:57:31,404 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.32 vs. limit=22.5 2023-09-29 03:57:32,880 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 03:57:32,987 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:57:36,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:57:36,601 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:57:39,549 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 03:57:41,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:57:41,010 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:57:41,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:57:41,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:57:42,627 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:57:42,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:57:45,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:57:46,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:57:46,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:57:51,827 INFO [train.py:1039] (2/4) Epoch 7, batch 4350, loss[loss=0.2534, simple_loss=0.3075, pruned_loss=0.0996, over 22895.00 frames. ], tot_loss[loss=0.2254, simple_loss=0.2897, pruned_loss=0.08058, over 4715720.20 frames. ], batch size: 322, lr: 1.44e-02, grad_scale: 8.0 2023-09-29 03:57:53,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 03:57:53,603 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 03:57:58,213 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:58:01,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:58:01,574 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=241480.0, ans=0.0 2023-09-29 03:58:03,139 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=241480.0, ans=0.125 2023-09-29 03:58:04,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:58:04,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:58:10,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 03:58:11,503 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.33 vs. limit=15.0 2023-09-29 03:58:13,453 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:58:15,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:58:16,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:58:20,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 03:58:23,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:58:24,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 03:58:29,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 03:58:31,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:58:32,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:58:36,187 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=241613.33333333334, ans=0.07 2023-09-29 03:58:37,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:58:39,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 03:58:44,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:58:46,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 03:58:50,910 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 03:58:52,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:58:52,446 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:58:53,868 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 03:58:56,015 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 03:58:56,024 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:58:56,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:58:57,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:58:57,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:58:59,721 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:58:59,788 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:59:02,835 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 03:59:02,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:59:02,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:59:02,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:59:03,088 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=241746.66666666666, ans=0.07 2023-09-29 03:59:04,151 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.804e+02 2.126e+02 2.357e+02 2.632e+02 4.633e+02, threshold=4.715e+02, percent-clipped=0.0 2023-09-29 03:59:04,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 03:59:05,848 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 03:59:05,855 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 03:59:05,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 03:59:08,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:59:09,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 03:59:09,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:59:10,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:59:12,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 03:59:13,480 INFO [train.py:1039] (2/4) Epoch 7, batch 4400, loss[loss=0.2592, simple_loss=0.31, pruned_loss=0.1042, over 23455.00 frames. ], tot_loss[loss=0.2267, simple_loss=0.2906, pruned_loss=0.08141, over 4713610.83 frames. ], batch size: 285, lr: 1.44e-02, grad_scale: 16.0 2023-09-29 03:59:15,077 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 03:59:15,088 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:59:19,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:59:20,989 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:59:22,697 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:59:24,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 03:59:24,328 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 03:59:25,795 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 03:59:25,826 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 03:59:25,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 03:59:26,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:59:29,550 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 03:59:31,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:59:32,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:59:32,588 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 03:59:34,368 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=241880.0, ans=0.125 2023-09-29 03:59:37,741 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:59:37,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 03:59:39,172 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 03:59:42,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 03:59:43,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 03:59:43,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 03:59:43,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:59:45,226 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:59:45,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:59:46,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:59:48,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 03:59:48,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 03:59:50,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:59:51,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:59:51,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:59:53,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:59:53,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:59:53,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 03:59:55,800 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 03:59:58,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:00:05,628 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:00:06,360 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.90 vs. limit=6.0 2023-09-29 04:00:08,497 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 04:00:10,274 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=242013.33333333334, ans=0.5 2023-09-29 04:00:15,054 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:00:16,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:00:18,299 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:00:19,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 04:00:19,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:00:19,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:00:19,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 04:00:21,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 04:00:25,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 04:00:29,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 04:00:31,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 04:00:31,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:00:31,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 04:00:34,722 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:00:36,374 INFO [train.py:1039] (2/4) Epoch 7, batch 4450, loss[loss=0.2113, simple_loss=0.2736, pruned_loss=0.07449, over 21916.00 frames. ], tot_loss[loss=0.227, simple_loss=0.2913, pruned_loss=0.08137, over 4726671.56 frames. ], batch size: 48, lr: 1.44e-02, grad_scale: 16.0 2023-09-29 04:00:36,464 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:00:38,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 04:00:40,783 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.04 vs. limit=15.0 2023-09-29 04:00:43,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:00:44,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:00:44,825 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:00:45,573 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.03 vs. limit=12.0 2023-09-29 04:00:52,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:00:52,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:00:56,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:00:56,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:01:01,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:01:01,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:01:02,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 04:01:02,727 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:01:02,847 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:01:02,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:01:02,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:01:07,765 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 04:01:12,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:01:12,536 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:01:14,674 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:01:14,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:01:16,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:01:19,887 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=15.55 vs. limit=15.0 2023-09-29 04:01:21,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 04:01:22,873 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 04:01:22,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 04:01:22,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 04:01:25,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:01:27,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 04:01:30,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 04:01:33,741 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:01:35,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 04:01:35,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:01:35,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:01:35,875 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:01:35,896 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:01:39,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:01:41,321 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 04:01:42,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 04:01:44,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 04:01:46,514 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:01:47,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:01:49,216 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.705e+02 2.081e+02 2.382e+02 2.836e+02 4.315e+02, threshold=4.764e+02, percent-clipped=0.0 2023-09-29 04:01:49,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:01:50,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 04:01:52,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 04:01:55,026 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.47 vs. limit=10.0 2023-09-29 04:01:55,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 04:01:57,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:01:59,211 INFO [train.py:1039] (2/4) Epoch 7, batch 4500, loss[loss=0.2817, simple_loss=0.3172, pruned_loss=0.1232, over 19556.00 frames. ], tot_loss[loss=0.2281, simple_loss=0.2927, pruned_loss=0.08172, over 4726335.62 frames. ], batch size: 388, lr: 1.44e-02, grad_scale: 16.0 2023-09-29 04:02:04,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:02:05,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 04:02:05,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 04:02:08,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:02:12,673 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:02:12,777 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:02:14,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 04:02:15,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:02:15,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:02:15,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:02:16,199 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=242546.66666666666, ans=0.0 2023-09-29 04:02:27,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:02:29,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:02:31,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:02:32,511 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:02:34,038 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:02:40,147 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 04:02:44,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:02:49,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 04:02:49,612 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=242680.0, ans=0.125 2023-09-29 04:02:50,954 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:02:53,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 04:02:53,199 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:02:54,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:02:56,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:02:57,484 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:02:59,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:02:59,170 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 04:02:59,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 04:02:59,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:03:04,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:03:04,476 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:03:07,484 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:03:10,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:03:10,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:03:13,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 04:03:13,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 04:03:13,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 04:03:19,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 04:03:22,554 INFO [train.py:1039] (2/4) Epoch 7, batch 4550, loss[loss=0.2422, simple_loss=0.2881, pruned_loss=0.09817, over 23873.00 frames. ], tot_loss[loss=0.2264, simple_loss=0.2908, pruned_loss=0.08106, over 4727437.31 frames. ], batch size: 212, lr: 1.43e-02, grad_scale: 16.0 2023-09-29 04:03:22,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 04:03:22,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:03:27,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:03:29,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:03:31,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:03:33,018 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=242813.33333333334, ans=0.125 2023-09-29 04:03:36,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:03:37,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:03:39,506 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:03:40,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 04:03:40,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:03:42,682 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:03:44,108 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:03:45,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:03:47,792 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=242880.0, ans=0.125 2023-09-29 04:03:49,002 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 04:03:49,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 04:03:49,862 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.10 vs. limit=15.0 2023-09-29 04:03:51,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:03:52,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 04:03:56,485 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 04:03:57,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 04:03:57,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:04:02,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 04:04:04,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 04:04:07,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:04:07,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:04:07,304 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 04:04:09,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 04:04:12,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:04:15,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:04:15,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:04:16,143 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=243013.33333333334, ans=0.2 2023-09-29 04:04:17,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:04:17,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 04:04:17,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 04:04:18,971 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:04:19,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 04:04:20,732 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 04:04:20,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:04:23,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:04:23,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:04:25,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:04:25,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:04:27,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 04:04:29,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 04:04:31,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:04:31,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 04:04:31,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 04:04:31,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:04:31,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 04:04:32,094 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 04:04:34,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:04:34,940 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:04:36,258 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 2.221e+02 2.599e+02 3.023e+02 5.403e+02, threshold=5.198e+02, percent-clipped=1.0 2023-09-29 04:04:37,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:04:40,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:04:40,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 04:04:41,633 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:04:43,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 04:04:46,064 INFO [train.py:1039] (2/4) Epoch 7, batch 4600, loss[loss=0.2448, simple_loss=0.3074, pruned_loss=0.09111, over 23990.00 frames. ], tot_loss[loss=0.225, simple_loss=0.2886, pruned_loss=0.08074, over 4718466.05 frames. ], batch size: 86, lr: 1.43e-02, grad_scale: 16.0 2023-09-29 04:04:46,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:04:47,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:04:49,667 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=243146.66666666666, ans=0.2 2023-09-29 04:04:50,739 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:04:50,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:04:52,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:04:53,849 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 04:04:54,099 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=243146.66666666666, ans=0.125 2023-09-29 04:04:55,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 04:04:59,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:05:01,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:05:03,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:05:09,314 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=243213.33333333334, ans=0.2 2023-09-29 04:05:10,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 04:05:10,845 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=243213.33333333334, ans=0.2 2023-09-29 04:05:12,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:05:15,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:05:17,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:05:17,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:05:22,518 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=243280.0, ans=0.0 2023-09-29 04:05:23,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 04:05:23,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 04:05:25,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:05:32,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:05:32,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:05:35,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:05:35,856 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=243346.66666666666, ans=0.1 2023-09-29 04:05:39,122 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 04:05:41,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 04:05:41,401 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=243346.66666666666, ans=0.015 2023-09-29 04:05:45,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:05:46,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:05:48,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:05:48,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 04:05:50,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:05:50,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 04:05:50,585 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:05:50,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:05:52,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:05:53,670 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:05:55,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:05:56,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 04:05:56,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 04:05:56,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 04:05:56,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:05:58,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:05:59,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:06:00,359 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.20 vs. limit=15.0 2023-09-29 04:06:01,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:06:07,200 INFO [train.py:1039] (2/4) Epoch 7, batch 4650, loss[loss=0.2144, simple_loss=0.2932, pruned_loss=0.06782, over 24648.00 frames. ], tot_loss[loss=0.2235, simple_loss=0.2877, pruned_loss=0.07972, over 4718130.88 frames. ], batch size: 68, lr: 1.43e-02, grad_scale: 16.0 2023-09-29 04:06:12,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:06:16,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:06:16,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:06:18,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:06:18,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:06:18,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:06:19,805 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:06:22,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 04:06:26,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:06:29,594 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 04:06:29,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:06:29,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 04:06:31,072 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:06:31,175 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 04:06:32,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 04:06:32,557 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:06:32,650 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 04:06:34,403 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 04:06:37,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:06:37,486 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 04:06:41,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:06:43,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 04:06:46,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:06:46,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:06:46,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 04:06:46,827 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=243613.33333333334, ans=0.1 2023-09-29 04:06:48,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:06:51,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:06:53,653 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=243613.33333333334, ans=0.0 2023-09-29 04:06:56,272 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:07:00,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:07:01,941 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=243680.0, ans=0.2 2023-09-29 04:07:04,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:07:04,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:07:04,962 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=243680.0, ans=0.0 2023-09-29 04:07:06,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:07:06,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 04:07:07,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 04:07:07,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 04:07:07,958 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 04:07:10,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:07:18,477 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 1.995e+02 2.331e+02 2.666e+02 3.727e+02, threshold=4.663e+02, percent-clipped=0.0 2023-09-29 04:07:18,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:07:18,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:07:18,649 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 04:07:18,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:07:20,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:07:20,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:07:22,267 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:07:24,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:07:24,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:07:26,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:07:27,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:07:29,920 INFO [train.py:1039] (2/4) Epoch 7, batch 4700, loss[loss=0.1905, simple_loss=0.257, pruned_loss=0.06199, over 20816.00 frames. ], tot_loss[loss=0.2243, simple_loss=0.2886, pruned_loss=0.07998, over 4718672.85 frames. ], batch size: 45, lr: 1.43e-02, grad_scale: 16.0 2023-09-29 04:07:30,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:07:30,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 04:07:31,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 04:07:32,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 04:07:33,065 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 04:07:40,035 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.13 vs. limit=15.0 2023-09-29 04:07:40,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:07:41,750 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=243813.33333333334, ans=15.0 2023-09-29 04:07:42,407 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:07:43,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:07:45,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:07:47,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 04:07:50,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 04:07:51,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 04:07:53,633 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:07:56,367 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:07:57,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:07:59,694 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=243880.0, ans=0.0 2023-09-29 04:08:01,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:08:06,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 04:08:09,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 04:08:11,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:08:16,348 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=243946.66666666666, ans=0.0 2023-09-29 04:08:17,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 04:08:19,105 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:08:22,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:08:25,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 04:08:26,950 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:08:28,773 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=244013.33333333334, ans=10.0 2023-09-29 04:08:32,167 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:08:32,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 04:08:33,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:08:33,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:08:36,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:08:38,358 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:08:38,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 04:08:38,515 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 04:08:41,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:08:43,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:08:43,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:08:43,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 04:08:46,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:08:49,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 04:08:49,717 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=244146.66666666666, ans=0.125 2023-09-29 04:08:50,859 INFO [train.py:1039] (2/4) Epoch 7, batch 4750, loss[loss=0.2039, simple_loss=0.2716, pruned_loss=0.06807, over 23514.00 frames. ], tot_loss[loss=0.2248, simple_loss=0.2892, pruned_loss=0.08018, over 4721814.09 frames. ], batch size: 120, lr: 1.43e-02, grad_scale: 16.0 2023-09-29 04:08:52,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:08:52,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:08:55,946 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=244146.66666666666, ans=0.125 2023-09-29 04:08:58,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:08:58,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:09:01,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 04:09:01,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:09:06,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 04:09:09,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:09:09,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:09:09,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:09:14,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 04:09:19,401 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:09:20,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 04:09:21,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:09:25,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:09:25,504 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:09:25,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:09:26,997 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 04:09:27,002 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 04:09:31,772 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=244280.0, ans=0.1 2023-09-29 04:09:34,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 04:09:36,372 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=244280.0, ans=0.125 2023-09-29 04:09:37,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:09:40,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:09:40,967 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=244346.66666666666, ans=0.0 2023-09-29 04:09:42,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 04:09:42,850 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 04:09:42,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:09:43,907 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=244346.66666666666, ans=0.125 2023-09-29 04:09:46,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:09:48,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:09:51,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 04:09:51,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 04:09:51,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:09:53,158 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:09:53,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:09:54,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 04:09:54,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 04:09:56,554 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=244413.33333333334, ans=0.125 2023-09-29 04:09:57,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 04:10:00,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:10:02,302 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.530e+02 1.924e+02 2.114e+02 2.511e+02 3.995e+02, threshold=4.229e+02, percent-clipped=0.0 2023-09-29 04:10:02,458 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:10:02,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 04:10:02,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:10:04,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:10:06,914 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:10:07,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:10:08,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 04:10:10,086 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:10:11,390 INFO [train.py:1039] (2/4) Epoch 7, batch 4800, loss[loss=0.2137, simple_loss=0.2901, pruned_loss=0.06867, over 24664.00 frames. ], tot_loss[loss=0.2249, simple_loss=0.2896, pruned_loss=0.0801, over 4726169.74 frames. ], batch size: 65, lr: 1.43e-02, grad_scale: 32.0 2023-09-29 04:10:11,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 04:10:11,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 04:10:13,777 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 04:10:17,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 04:10:19,196 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:10:20,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 04:10:26,014 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:10:27,552 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:10:32,230 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 04:10:32,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:10:32,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:10:33,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 04:10:34,198 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=244546.66666666666, ans=0.125 2023-09-29 04:10:35,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:10:35,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:10:35,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 04:10:41,370 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:10:42,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:10:42,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:10:44,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:10:44,525 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 04:10:44,547 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:10:44,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:10:48,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:10:51,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:10:53,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:10:53,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 04:10:55,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 04:10:57,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:10:57,971 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.38 vs. limit=6.0 2023-09-29 04:10:58,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 04:10:58,767 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 04:11:00,215 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:11:00,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:11:00,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:11:00,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:11:00,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 04:11:02,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:11:02,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:11:07,877 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:11:09,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:11:11,194 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:11:15,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 04:11:17,106 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:11:17,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:11:17,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 04:11:17,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:11:22,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:11:23,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:11:23,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:11:24,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:11:24,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:11:26,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 04:11:30,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:11:30,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:11:30,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:11:31,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 04:11:34,689 INFO [train.py:1039] (2/4) Epoch 7, batch 4850, loss[loss=0.2126, simple_loss=0.2854, pruned_loss=0.06993, over 24457.00 frames. ], tot_loss[loss=0.2257, simple_loss=0.2905, pruned_loss=0.08046, over 4727105.72 frames. ], batch size: 63, lr: 1.43e-02, grad_scale: 32.0 2023-09-29 04:11:36,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 04:11:36,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:11:36,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:11:37,830 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:11:37,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:11:40,847 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:11:47,250 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=244813.33333333334, ans=0.125 2023-09-29 04:11:48,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 04:11:49,951 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:11:54,711 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:11:54,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 04:11:56,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:12:00,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:12:00,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:12:03,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:12:03,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 04:12:07,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:12:10,293 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:12:10,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 04:12:10,415 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 04:12:10,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 04:12:14,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:12:14,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:12:17,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:12:17,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 04:12:19,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 04:12:20,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 04:12:25,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:12:27,177 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 04:12:29,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:12:29,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:12:31,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 04:12:33,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 04:12:33,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:12:35,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 04:12:35,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:12:38,724 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:12:38,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 04:12:46,411 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.770e+02 2.140e+02 2.410e+02 2.869e+02 4.952e+02, threshold=4.821e+02, percent-clipped=3.0 2023-09-29 04:12:46,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:12:52,851 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:12:52,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:12:55,927 INFO [train.py:1039] (2/4) Epoch 7, batch 4900, loss[loss=0.2318, simple_loss=0.2654, pruned_loss=0.09907, over 19263.00 frames. ], tot_loss[loss=0.225, simple_loss=0.2893, pruned_loss=0.08037, over 4704844.91 frames. ], batch size: 388, lr: 1.43e-02, grad_scale: 32.0 2023-09-29 04:12:57,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 04:12:57,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:13:04,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:13:04,405 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=245146.66666666666, ans=0.2 2023-09-29 04:13:05,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:13:05,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:13:09,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 04:13:15,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 04:13:19,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 04:13:20,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 04:13:20,555 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 04:13:22,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:13:22,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:13:22,071 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:13:22,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:13:22,197 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 04:13:25,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 04:13:25,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 04:13:28,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:13:28,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 04:13:29,427 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.62 vs. limit=15.0 2023-09-29 04:13:31,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:13:31,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:13:33,193 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:13:33,207 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 04:13:33,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:13:35,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:13:35,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 04:13:35,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 04:13:35,866 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=245280.0, ans=0.0 2023-09-29 04:13:40,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 04:13:42,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 04:13:44,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:13:44,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:13:45,434 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=13.08 vs. limit=15.0 2023-09-29 04:13:46,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:13:46,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 04:13:46,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:13:47,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 04:13:50,737 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:13:52,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 04:13:53,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:13:57,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 04:13:58,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:13:58,609 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 04:13:58,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 04:14:02,251 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=245413.33333333334, ans=0.125 2023-09-29 04:14:03,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:14:05,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:14:06,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 04:14:06,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 04:14:06,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 04:14:10,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:14:15,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:14:15,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:14:15,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:14:15,086 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 04:14:16,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 04:14:18,186 INFO [train.py:1039] (2/4) Epoch 7, batch 4950, loss[loss=0.217, simple_loss=0.2898, pruned_loss=0.07213, over 24675.00 frames. ], tot_loss[loss=0.2236, simple_loss=0.2881, pruned_loss=0.07954, over 4709048.80 frames. ], batch size: 65, lr: 1.43e-02, grad_scale: 32.0 2023-09-29 04:14:18,472 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:14:20,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 04:14:23,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 04:14:24,896 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 04:14:24,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 04:14:25,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 04:14:25,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:14:26,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:14:26,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:14:26,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:14:28,390 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:14:29,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:14:31,473 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:14:33,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:14:34,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:14:34,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:14:38,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 04:14:38,511 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=245546.66666666666, ans=0.125 2023-09-29 04:14:47,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:14:48,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:14:48,918 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=245546.66666666666, ans=0.0 2023-09-29 04:14:50,284 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:14:50,370 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:14:50,758 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=245613.33333333334, ans=0.2 2023-09-29 04:14:52,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:14:53,642 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 04:14:53,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 04:14:57,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:14:58,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:15:00,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:15:01,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:15:01,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:15:03,346 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 04:15:04,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:15:07,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 04:15:08,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:15:09,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:15:09,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:15:11,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 04:15:11,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:15:15,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 04:15:18,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:15:21,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:15:21,470 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:15:21,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:15:22,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 04:15:23,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:15:25,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:15:26,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:15:26,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:15:28,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 04:15:31,886 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.749e+02 2.077e+02 2.324e+02 2.627e+02 6.143e+02, threshold=4.647e+02, percent-clipped=3.0 2023-09-29 04:15:32,163 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:15:39,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 04:15:39,473 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 04:15:40,887 INFO [train.py:1039] (2/4) Epoch 7, batch 5000, loss[loss=0.2457, simple_loss=0.3015, pruned_loss=0.09492, over 23400.00 frames. ], tot_loss[loss=0.2225, simple_loss=0.288, pruned_loss=0.07846, over 4728163.13 frames. ], batch size: 119, lr: 1.43e-02, grad_scale: 32.0 2023-09-29 04:15:47,826 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:15:47,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:15:49,514 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 04:15:49,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 04:15:53,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:15:56,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 04:15:56,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:15:56,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 04:15:56,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 04:15:56,585 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:15:57,997 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:15:59,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 04:15:59,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:16:00,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:16:01,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 04:16:01,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 04:16:03,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:16:03,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 04:16:03,410 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 04:16:03,692 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=245880.0, ans=0.125 2023-09-29 04:16:04,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:16:04,983 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 04:16:04,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 04:16:04,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 04:16:05,344 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=245880.0, ans=0.2 2023-09-29 04:16:07,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 04:16:07,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:16:08,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:16:10,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 04:16:11,168 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.31 vs. limit=22.5 2023-09-29 04:16:11,495 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:16:11,703 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:16:13,217 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:16:16,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 04:16:19,620 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 04:16:19,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:16:21,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:16:24,568 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 04:16:28,305 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:16:28,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:16:28,488 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:16:33,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 04:16:34,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:16:34,738 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:16:34,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:16:36,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 04:16:37,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:16:40,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:16:42,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:16:50,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 04:16:53,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:17:02,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:17:03,838 INFO [train.py:1039] (2/4) Epoch 7, batch 5050, loss[loss=0.231, simple_loss=0.2901, pruned_loss=0.08596, over 23289.00 frames. ], tot_loss[loss=0.2231, simple_loss=0.2883, pruned_loss=0.07898, over 4726906.76 frames. ], batch size: 105, lr: 1.43e-02, grad_scale: 16.0 2023-09-29 04:17:04,053 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:17:05,408 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:17:05,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:17:05,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 04:17:06,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:17:08,382 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:17:13,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:17:13,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 04:17:14,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:17:16,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:17:18,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:17:18,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 04:17:20,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:17:20,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:17:23,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 04:17:23,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 04:17:24,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 04:17:25,296 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=246213.33333333334, ans=0.0 2023-09-29 04:17:31,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 04:17:31,872 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 04:17:33,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:17:33,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 04:17:33,444 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:17:35,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:17:37,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:17:37,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:17:37,217 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 04:17:38,053 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.61 vs. limit=15.0 2023-09-29 04:17:38,825 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 04:17:40,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:17:42,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:17:42,328 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=246280.0, ans=0.125 2023-09-29 04:17:45,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:17:47,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 04:17:48,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:17:50,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 04:17:52,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:17:52,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:17:52,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:17:53,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:17:56,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:17:57,792 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.97 vs. limit=6.0 2023-09-29 04:17:58,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:17:59,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:17:59,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:17:59,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:17:59,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 04:17:59,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:18:03,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:18:06,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:18:06,590 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 04:18:06,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 04:18:08,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:18:10,273 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:18:10,329 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 04:18:13,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:18:13,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 04:18:13,459 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:18:18,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:18:18,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:18:18,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 04:18:19,881 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.599e+02 2.259e+02 2.586e+02 3.154e+02 5.284e+02, threshold=5.172e+02, percent-clipped=3.0 2023-09-29 04:18:20,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 04:18:24,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:18:24,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:18:24,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:18:26,483 INFO [train.py:1039] (2/4) Epoch 7, batch 5100, loss[loss=0.2505, simple_loss=0.3043, pruned_loss=0.0984, over 23826.00 frames. ], tot_loss[loss=0.224, simple_loss=0.2892, pruned_loss=0.07945, over 4719424.36 frames. ], batch size: 164, lr: 1.42e-02, grad_scale: 8.0 2023-09-29 04:18:26,781 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=246480.0, ans=0.0 2023-09-29 04:18:28,062 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 04:18:31,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:18:35,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 04:18:35,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 04:18:35,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:18:35,969 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 04:18:37,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:18:42,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:18:42,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 04:18:42,565 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 04:18:47,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:18:47,959 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:18:53,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:18:57,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 04:18:57,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:19:00,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:19:00,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 04:19:02,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:19:03,676 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:19:03,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 04:19:07,195 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 04:19:07,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:19:07,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 04:19:07,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 04:19:07,480 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=246613.33333333334, ans=0.125 2023-09-29 04:19:08,070 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.52 vs. limit=15.0 2023-09-29 04:19:12,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:19:17,460 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=246680.0, ans=0.0 2023-09-29 04:19:22,496 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:19:25,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 04:19:27,563 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 04:19:27,575 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 04:19:29,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 04:19:29,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:19:32,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 04:19:35,377 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 04:19:37,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 04:19:38,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:19:40,190 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 04:19:42,545 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=246746.66666666666, ans=0.0 2023-09-29 04:19:43,773 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 04:19:45,141 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 04:19:48,707 INFO [train.py:1039] (2/4) Epoch 7, batch 5150, loss[loss=0.2481, simple_loss=0.2994, pruned_loss=0.09845, over 23752.00 frames. ], tot_loss[loss=0.2264, simple_loss=0.2907, pruned_loss=0.08107, over 4709205.23 frames. ], batch size: 164, lr: 1.42e-02, grad_scale: 8.0 2023-09-29 04:19:50,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:19:50,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:19:50,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:19:51,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:19:51,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 04:19:53,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:19:55,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 04:19:55,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 04:19:55,087 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 04:19:55,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:19:55,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 04:19:58,057 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:19:58,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 04:19:58,334 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:20:01,942 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:20:07,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 04:20:07,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 04:20:08,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:20:09,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:20:11,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 04:20:11,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:20:11,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:20:12,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:20:12,594 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:20:12,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 04:20:14,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:20:14,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:20:15,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 04:20:18,036 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 04:20:18,287 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=246880.0, ans=0.1 2023-09-29 04:20:19,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:20:26,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 04:20:29,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 04:20:32,973 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:20:33,276 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=246946.66666666666, ans=0.125 2023-09-29 04:20:39,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:20:40,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:20:43,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:20:44,046 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:20:47,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 04:20:49,340 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:20:50,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:20:50,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:20:55,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:20:55,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:20:55,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 04:21:00,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:21:03,033 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 04:21:05,020 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.669e+02 2.008e+02 2.205e+02 2.538e+02 3.618e+02, threshold=4.410e+02, percent-clipped=0.0 2023-09-29 04:21:05,297 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:21:05,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:21:06,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 04:21:06,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 04:21:06,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:21:06,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:21:10,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:21:11,424 INFO [train.py:1039] (2/4) Epoch 7, batch 5200, loss[loss=0.218, simple_loss=0.2991, pruned_loss=0.06845, over 24416.00 frames. ], tot_loss[loss=0.2271, simple_loss=0.2911, pruned_loss=0.08155, over 4703015.26 frames. ], batch size: 69, lr: 1.42e-02, grad_scale: 16.0 2023-09-29 04:21:11,753 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 04:21:12,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:21:14,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:21:19,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 04:21:21,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:21:22,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:21:23,017 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=247146.66666666666, ans=0.09899494936611666 2023-09-29 04:21:25,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:21:27,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:21:27,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:21:27,582 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=247213.33333333334, ans=0.05 2023-09-29 04:21:30,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 04:21:33,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 04:21:33,590 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=247213.33333333334, ans=0.125 2023-09-29 04:21:35,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:21:37,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 04:21:38,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 04:21:41,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 04:21:42,590 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 04:21:42,670 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 04:21:45,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 04:21:47,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:21:47,075 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 04:21:47,086 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:21:48,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:21:49,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:21:50,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 04:21:50,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:21:53,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:21:56,481 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=247280.0, ans=0.125 2023-09-29 04:21:57,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 04:21:57,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 04:21:57,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 04:22:01,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 04:22:03,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 04:22:09,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:22:09,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:22:11,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 04:22:12,102 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:22:13,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 04:22:13,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:22:13,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 04:22:18,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:22:18,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:22:21,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:22:22,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:22:22,777 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:22:27,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:22:28,826 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 04:22:30,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:22:30,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:22:30,655 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=247480.0, ans=0.025 2023-09-29 04:22:31,762 INFO [train.py:1039] (2/4) Epoch 7, batch 5250, loss[loss=0.2303, simple_loss=0.292, pruned_loss=0.08434, over 23591.00 frames. ], tot_loss[loss=0.2263, simple_loss=0.2898, pruned_loss=0.08143, over 4697404.52 frames. ], batch size: 106, lr: 1.42e-02, grad_scale: 16.0 2023-09-29 04:22:31,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:22:31,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 04:22:34,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:22:35,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:22:40,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:22:40,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:22:41,604 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:22:44,672 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=247480.0, ans=0.125 2023-09-29 04:22:46,157 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=247480.0, ans=0.125 2023-09-29 04:22:48,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:22:50,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 04:22:50,958 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=247546.66666666666, ans=0.0 2023-09-29 04:22:51,338 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=17.21 vs. limit=22.5 2023-09-29 04:22:52,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:22:53,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 04:22:55,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 04:22:55,528 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:22:57,083 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:23:21,147 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=247680.0, ans=0.125 2023-09-29 04:23:34,519 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.34 vs. limit=6.0 2023-09-29 04:23:34,888 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.49 vs. limit=15.0 2023-09-29 04:23:41,160 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.654e+02 2.140e+02 2.318e+02 2.697e+02 3.802e+02, threshold=4.635e+02, percent-clipped=0.0 2023-09-29 04:23:44,638 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=247746.66666666666, ans=0.0 2023-09-29 04:23:47,120 INFO [train.py:1039] (2/4) Epoch 7, batch 5300, loss[loss=0.2277, simple_loss=0.2773, pruned_loss=0.08904, over 23815.00 frames. ], tot_loss[loss=0.2245, simple_loss=0.2877, pruned_loss=0.08065, over 4682541.96 frames. ], batch size: 212, lr: 1.42e-02, grad_scale: 16.0 2023-09-29 04:24:01,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:24:01,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 04:24:01,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 04:24:02,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:24:02,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:24:02,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:24:02,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:24:02,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:24:02,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:24:02,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:24:02,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 04:24:03,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:24:03,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 04:24:03,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 04:24:03,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 04:24:03,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 04:24:03,881 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 04:24:04,025 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 04:24:04,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:24:05,169 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:24:05,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:24:05,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:24:05,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:24:06,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:24:06,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:24:06,155 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:24:06,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:24:06,346 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:24:06,353 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:24:06,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:24:06,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:24:07,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 04:24:07,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:24:07,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:24:07,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 04:24:07,957 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 04:24:08,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:24:08,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:24:08,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 04:24:09,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 04:24:09,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 04:24:09,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:24:10,002 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:24:10,152 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 04:24:10,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 04:24:10,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 04:24:10,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:24:10,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 04:24:10,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 04:24:10,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 04:24:11,015 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 04:24:19,237 INFO [train.py:1039] (2/4) Epoch 8, batch 0, loss[loss=0.2361, simple_loss=0.2919, pruned_loss=0.09019, over 23325.00 frames. ], tot_loss[loss=0.2361, simple_loss=0.2919, pruned_loss=0.09019, over 23325.00 frames. ], batch size: 105, lr: 1.34e-02, grad_scale: 32.0 2023-09-29 04:24:19,237 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-29 04:24:33,511 INFO [train.py:1071] (2/4) Epoch 8, validation: loss=0.2869, simple_loss=0.2985, pruned_loss=0.1377, over 1125622.00 frames. 2023-09-29 04:24:33,512 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21046MB 2023-09-29 04:24:33,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 04:24:34,407 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=13.25 vs. limit=15.0 2023-09-29 04:24:35,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:24:36,732 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:24:41,395 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:24:41,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:24:41,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:24:42,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 04:24:44,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 04:24:47,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:24:49,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:24:53,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:24:53,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:24:55,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:24:55,447 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:24:57,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 04:25:01,033 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:25:10,282 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:25:10,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:25:11,996 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 04:25:13,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 04:25:15,692 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=9.42 vs. limit=15.0 2023-09-29 04:25:17,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:25:17,837 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 04:25:19,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:25:24,190 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:25:30,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:25:35,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 04:25:39,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 04:25:39,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:25:39,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:25:40,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:25:42,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:25:46,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 04:25:46,678 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.08 vs. limit=15.0 2023-09-29 04:25:47,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:25:49,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:25:52,374 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:25:55,329 INFO [train.py:1039] (2/4) Epoch 8, batch 50, loss[loss=0.2197, simple_loss=0.2739, pruned_loss=0.08278, over 23824.00 frames. ], tot_loss[loss=0.2222, simple_loss=0.2888, pruned_loss=0.07783, over 1072371.85 frames. ], batch size: 164, lr: 1.34e-02, grad_scale: 32.0 2023-09-29 04:25:55,420 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 04:25:55,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:25:58,655 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=248226.66666666666, ans=0.0 2023-09-29 04:26:00,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:26:01,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:26:01,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 04:26:03,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 04:26:03,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:26:06,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:26:07,946 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:26:09,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:26:12,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 04:26:14,387 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:26:23,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 04:26:24,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 04:26:26,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 04:26:27,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:26:29,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:26:29,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:26:31,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:26:31,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 04:26:32,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 04:26:32,535 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:26:40,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:26:42,300 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:26:42,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 04:26:42,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 04:26:45,451 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 04:26:46,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 04:26:46,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 04:26:48,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:26:48,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 04:26:50,454 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.737e+02 2.177e+02 2.443e+02 2.821e+02 4.431e+02, threshold=4.886e+02, percent-clipped=0.0 2023-09-29 04:26:56,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:26:57,946 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:26:58,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:27:00,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:27:00,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 04:27:04,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 04:27:04,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 04:27:05,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:27:05,666 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 04:27:07,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:27:07,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:27:07,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 04:27:08,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 04:27:10,254 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 04:27:12,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:27:12,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:27:13,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 04:27:13,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 04:27:13,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:27:15,413 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:27:16,877 INFO [train.py:1039] (2/4) Epoch 8, batch 100, loss[loss=0.2391, simple_loss=0.2913, pruned_loss=0.09342, over 23579.00 frames. ], tot_loss[loss=0.2232, simple_loss=0.2896, pruned_loss=0.07844, over 1882360.51 frames. ], batch size: 256, lr: 1.34e-02, grad_scale: 32.0 2023-09-29 04:27:16,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 04:27:16,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:27:18,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:27:23,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:27:25,417 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=248560.0, ans=0.1 2023-09-29 04:27:26,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:27:30,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 04:27:30,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:27:34,029 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:27:34,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:27:34,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:27:34,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:27:34,145 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:27:36,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 04:27:38,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 04:27:40,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:27:40,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:27:40,079 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:27:40,341 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=248626.66666666666, ans=0.0 2023-09-29 04:27:44,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 04:27:44,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:27:46,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:27:48,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 04:27:49,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 04:27:52,880 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 04:27:52,906 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 04:27:54,574 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:27:54,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 04:27:59,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 04:28:01,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:28:01,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:28:02,487 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=248693.33333333334, ans=0.0 2023-09-29 04:28:06,172 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=248760.0, ans=0.0 2023-09-29 04:28:07,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:28:08,960 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 04:28:10,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 04:28:13,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:28:15,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:28:18,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:28:20,429 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:28:23,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:28:24,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:28:29,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:28:29,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:28:31,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:28:31,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:28:31,522 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:28:32,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 04:28:33,013 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 04:28:33,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:28:33,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:28:34,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:28:34,697 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:28:34,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 04:28:34,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 04:28:35,524 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 04:28:35,534 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:28:36,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:28:37,138 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=248826.66666666666, ans=0.2 2023-09-29 04:28:38,406 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:28:40,408 INFO [train.py:1039] (2/4) Epoch 8, batch 150, loss[loss=0.2426, simple_loss=0.2917, pruned_loss=0.09672, over 23701.00 frames. ], tot_loss[loss=0.2231, simple_loss=0.2889, pruned_loss=0.0787, over 2525251.17 frames. ], batch size: 232, lr: 1.34e-02, grad_scale: 32.0 2023-09-29 04:28:40,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:28:40,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:28:43,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:28:46,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:28:46,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:28:46,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:28:48,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:28:50,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:28:51,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:28:51,846 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:28:55,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 04:28:56,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 04:28:56,915 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 04:28:59,947 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:28:59,954 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 04:29:01,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:29:04,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:29:04,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:29:05,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:29:06,500 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:29:08,212 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-29 04:29:09,127 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=248960.0, ans=0.125 2023-09-29 04:29:11,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:29:15,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:29:20,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:29:20,305 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 04:29:20,625 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=249026.66666666666, ans=0.125 2023-09-29 04:29:22,736 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.50 vs. limit=22.5 2023-09-29 04:29:24,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:29:24,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:29:24,974 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:29:26,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:29:27,362 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.45 vs. limit=15.0 2023-09-29 04:29:28,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:29:30,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:29:31,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:29:31,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 04:29:33,559 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=249093.33333333334, ans=0.125 2023-09-29 04:29:36,131 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.653e+02 2.113e+02 2.401e+02 2.733e+02 5.079e+02, threshold=4.803e+02, percent-clipped=1.0 2023-09-29 04:29:38,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:29:40,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:29:40,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:29:40,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:29:44,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:29:45,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 04:29:48,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:29:50,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:29:52,339 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:29:54,147 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=249160.0, ans=0.125 2023-09-29 04:29:55,405 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 04:29:55,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 04:29:55,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:29:55,505 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 04:29:55,669 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=249160.0, ans=0.0 2023-09-29 04:29:58,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:30:03,308 INFO [train.py:1039] (2/4) Epoch 8, batch 200, loss[loss=0.2552, simple_loss=0.3193, pruned_loss=0.09551, over 23241.00 frames. ], tot_loss[loss=0.2246, simple_loss=0.2899, pruned_loss=0.07962, over 3014788.08 frames. ], batch size: 105, lr: 1.33e-02, grad_scale: 32.0 2023-09-29 04:30:03,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:30:03,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:30:05,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 04:30:07,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:30:07,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:30:07,343 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=249226.66666666666, ans=0.125 2023-09-29 04:30:10,209 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 04:30:11,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 04:30:12,068 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=249226.66666666666, ans=0.1 2023-09-29 04:30:13,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:30:15,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:30:18,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:30:18,964 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:30:18,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:30:43,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:30:43,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:30:44,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:30:45,033 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=249360.0, ans=0.025 2023-09-29 04:30:46,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:30:46,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 04:30:46,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 04:30:47,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:30:49,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 04:30:49,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:30:49,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:30:51,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 04:30:53,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 04:30:53,193 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:30:57,673 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=8.21 vs. limit=12.0 2023-09-29 04:30:58,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:31:01,978 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=249426.66666666666, ans=0.125 2023-09-29 04:31:03,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:31:07,076 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=249426.66666666666, ans=0.0 2023-09-29 04:31:09,820 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:31:09,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:31:16,807 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:31:19,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 04:31:21,381 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:31:21,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:31:21,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:31:23,001 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 04:31:24,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 04:31:25,901 INFO [train.py:1039] (2/4) Epoch 8, batch 250, loss[loss=0.2229, simple_loss=0.3023, pruned_loss=0.07181, over 24665.00 frames. ], tot_loss[loss=0.225, simple_loss=0.2898, pruned_loss=0.08016, over 3390390.72 frames. ], batch size: 73, lr: 1.33e-02, grad_scale: 32.0 2023-09-29 04:31:25,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:31:26,010 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 04:31:28,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:31:29,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:31:32,717 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:31:32,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:31:34,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:31:36,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:31:36,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:31:39,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:31:50,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:31:55,336 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:31:55,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:32:03,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 04:32:03,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 04:32:03,673 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=249693.33333333334, ans=0.125 2023-09-29 04:32:04,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:32:05,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:32:05,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 04:32:05,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:32:05,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:32:10,198 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:32:13,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 04:32:13,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:32:16,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:32:16,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 04:32:16,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:32:17,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:32:17,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 04:32:17,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:32:20,838 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.722e+02 2.201e+02 2.590e+02 2.939e+02 4.400e+02, threshold=5.181e+02, percent-clipped=0.0 2023-09-29 04:32:20,965 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:32:22,526 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:32:22,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:32:27,111 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:32:33,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:32:36,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:32:41,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:32:42,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:32:46,203 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 04:32:47,795 INFO [train.py:1039] (2/4) Epoch 8, batch 300, loss[loss=0.2142, simple_loss=0.2842, pruned_loss=0.07205, over 24612.00 frames. ], tot_loss[loss=0.2218, simple_loss=0.2868, pruned_loss=0.07843, over 3688958.84 frames. ], batch size: 60, lr: 1.33e-02, grad_scale: 16.0 2023-09-29 04:32:47,866 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:32:47,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:32:49,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 04:32:50,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 04:32:51,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:32:51,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 04:32:51,369 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=249893.33333333334, ans=0.125 2023-09-29 04:32:52,899 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=249893.33333333334, ans=0.125 2023-09-29 04:32:55,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:32:55,942 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:33:00,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:33:00,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 04:33:02,150 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:33:04,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 04:33:04,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 04:33:04,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:33:08,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 04:33:14,138 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:33:16,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 04:33:19,223 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 04:33:19,275 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:33:20,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:33:22,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:33:22,582 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 04:33:22,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:33:25,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:33:27,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:33:28,709 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:33:33,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 04:33:33,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 04:33:33,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:33:36,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:33:38,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 04:33:38,898 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=250093.33333333334, ans=0.125 2023-09-29 04:33:40,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:33:45,891 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:33:49,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:33:49,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 04:33:54,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:33:54,641 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 04:33:56,344 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:33:57,826 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:33:57,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 04:33:57,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 04:33:59,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:34:01,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 04:34:01,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:34:02,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:34:04,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:34:04,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:34:05,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:34:07,468 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=250160.0, ans=0.1 2023-09-29 04:34:08,920 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=250226.66666666666, ans=0.125 2023-09-29 04:34:10,634 INFO [train.py:1039] (2/4) Epoch 8, batch 350, loss[loss=0.2011, simple_loss=0.2579, pruned_loss=0.07216, over 23566.00 frames. ], tot_loss[loss=0.2212, simple_loss=0.2862, pruned_loss=0.07811, over 3916782.08 frames. ], batch size: 149, lr: 1.33e-02, grad_scale: 16.0 2023-09-29 04:34:11,758 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.33 vs. limit=15.0 2023-09-29 04:34:12,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:34:12,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 04:34:13,140 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.65 vs. limit=15.0 2023-09-29 04:34:15,309 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:34:21,512 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 04:34:22,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:34:25,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:34:26,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:34:29,159 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 04:34:30,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:34:30,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 04:34:33,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:34:33,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 04:34:35,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:34:37,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 04:34:38,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:34:40,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:34:41,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:34:43,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:34:43,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:34:45,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:34:45,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:34:45,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:34:46,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:34:46,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:34:55,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:34:55,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 04:34:55,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:34:55,662 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:34:59,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 04:34:59,727 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:35:05,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:35:05,785 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:35:05,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:35:07,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 04:35:08,741 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.953e+02 2.305e+02 2.882e+02 6.292e+02, threshold=4.610e+02, percent-clipped=1.0 2023-09-29 04:35:08,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:35:10,460 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 04:35:12,066 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 04:35:12,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:35:15,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:35:15,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 04:35:18,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:35:22,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:35:22,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:35:24,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:35:24,312 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:35:27,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:35:31,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:35:34,509 INFO [train.py:1039] (2/4) Epoch 8, batch 400, loss[loss=0.2535, simple_loss=0.3079, pruned_loss=0.0996, over 23831.00 frames. ], tot_loss[loss=0.2204, simple_loss=0.2857, pruned_loss=0.07759, over 4102423.26 frames. ], batch size: 164, lr: 1.33e-02, grad_scale: 32.0 2023-09-29 04:35:34,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:35:34,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 04:35:36,183 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:35:36,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:35:36,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:35:37,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:35:39,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:35:39,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:35:41,653 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 04:35:43,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 04:35:43,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:35:44,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 04:35:46,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:35:49,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:35:49,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:35:49,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 04:35:49,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:35:49,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:35:51,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:35:51,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:35:55,229 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 04:35:57,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 04:36:02,274 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=250626.66666666666, ans=0.2 2023-09-29 04:36:03,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:36:03,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:36:05,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 04:36:07,303 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 04:36:09,826 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.65 vs. limit=6.0 2023-09-29 04:36:10,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:36:12,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:36:19,866 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 04:36:22,959 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 04:36:24,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 04:36:27,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:36:27,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:36:27,938 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=250760.0, ans=0.09899494936611666 2023-09-29 04:36:29,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 04:36:33,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:36:36,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 04:36:39,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:36:43,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:36:44,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 04:36:44,737 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=250826.66666666666, ans=0.1 2023-09-29 04:36:46,284 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 04:36:47,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 04:36:50,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 04:36:50,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:36:52,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 04:36:55,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 04:36:55,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:36:55,468 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 04:36:56,768 INFO [train.py:1039] (2/4) Epoch 8, batch 450, loss[loss=0.2239, simple_loss=0.2907, pruned_loss=0.07853, over 23765.00 frames. ], tot_loss[loss=0.2214, simple_loss=0.2869, pruned_loss=0.07795, over 4234120.37 frames. ], batch size: 85, lr: 1.33e-02, grad_scale: 16.0 2023-09-29 04:36:56,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 04:36:57,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:36:58,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:36:59,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:36:59,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 04:36:59,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:37:01,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 04:37:03,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:37:16,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:37:16,385 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:37:18,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 04:37:18,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 04:37:23,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 04:37:26,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:37:28,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:37:31,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:37:32,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:37:35,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 04:37:35,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 04:37:38,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 04:37:38,726 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:37:38,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:37:41,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:37:43,885 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 04:37:43,900 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 04:37:45,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:37:47,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:37:48,201 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=251093.33333333334, ans=0.0 2023-09-29 04:37:49,352 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 04:37:52,467 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 04:37:52,521 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:37:54,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 04:37:54,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 04:37:56,016 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 2.144e+02 2.402e+02 2.848e+02 5.479e+02, threshold=4.804e+02, percent-clipped=2.0 2023-09-29 04:37:57,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:37:59,297 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 04:37:59,345 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 04:37:59,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 04:38:04,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:38:04,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 04:38:05,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 04:38:07,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:38:13,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:38:14,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:38:17,115 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:38:18,585 INFO [train.py:1039] (2/4) Epoch 8, batch 500, loss[loss=0.2576, simple_loss=0.3058, pruned_loss=0.1047, over 23558.00 frames. ], tot_loss[loss=0.2223, simple_loss=0.2875, pruned_loss=0.07848, over 4346722.91 frames. ], batch size: 256, lr: 1.33e-02, grad_scale: 16.0 2023-09-29 04:38:18,659 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 04:38:22,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:38:24,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:38:24,110 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:38:25,493 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 04:38:25,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 04:38:25,701 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:38:29,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 04:38:29,670 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=251226.66666666666, ans=0.0 2023-09-29 04:38:32,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 04:38:34,525 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.90 vs. limit=15.0 2023-09-29 04:38:35,422 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:38:37,015 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:38:38,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:38:39,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:38:40,106 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=251293.33333333334, ans=0.125 2023-09-29 04:38:49,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:38:49,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 04:38:50,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 04:38:50,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:38:50,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 04:38:50,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:38:54,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:38:56,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:38:56,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:38:56,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:38:58,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 04:39:00,692 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 04:39:02,516 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=251360.0, ans=0.125 2023-09-29 04:39:03,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:39:05,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:39:06,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:39:06,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:39:07,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 04:39:09,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 04:39:13,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:39:14,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:39:17,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:39:19,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:39:22,789 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=251493.33333333334, ans=0.1 2023-09-29 04:39:26,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:39:28,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 04:39:28,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:39:28,129 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:39:32,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 04:39:34,016 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 04:39:35,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:39:41,372 INFO [train.py:1039] (2/4) Epoch 8, batch 550, loss[loss=0.2299, simple_loss=0.2876, pruned_loss=0.08613, over 23574.00 frames. ], tot_loss[loss=0.2248, simple_loss=0.2895, pruned_loss=0.08006, over 4417176.02 frames. ], batch size: 256, lr: 1.33e-02, grad_scale: 16.0 2023-09-29 04:39:41,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 04:39:43,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 04:39:43,199 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:39:43,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 04:39:44,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:39:44,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:39:46,093 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:39:47,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:39:47,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:39:49,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:39:50,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:39:52,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 04:39:52,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:39:53,954 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=251560.0, ans=0.125 2023-09-29 04:39:58,134 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:39:58,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:40:00,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:40:02,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:40:03,866 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=251626.66666666666, ans=0.1 2023-09-29 04:40:08,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 04:40:10,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 04:40:10,593 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=251626.66666666666, ans=0.2 2023-09-29 04:40:11,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:40:15,732 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.64 vs. limit=15.0 2023-09-29 04:40:16,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:40:16,323 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:40:16,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 04:40:20,940 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:40:20,949 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 04:40:22,393 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:40:23,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 04:40:25,630 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:40:25,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 04:40:27,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 04:40:27,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:40:28,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 04:40:30,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 04:40:30,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:40:32,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:40:33,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:40:33,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:40:33,512 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=251760.0, ans=0.125 2023-09-29 04:40:34,867 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=251760.0, ans=0.1 2023-09-29 04:40:37,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:40:37,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:40:38,978 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 2.030e+02 2.358e+02 2.809e+02 4.445e+02, threshold=4.716e+02, percent-clipped=0.0 2023-09-29 04:40:39,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:40:41,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:40:42,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 04:40:43,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:40:44,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:40:46,043 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:40:46,130 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:40:47,861 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=251826.66666666666, ans=0.1 2023-09-29 04:40:48,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 04:40:49,024 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 04:40:55,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 04:40:58,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 04:40:59,793 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:40:59,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 04:41:01,243 INFO [train.py:1039] (2/4) Epoch 8, batch 600, loss[loss=0.2181, simple_loss=0.2663, pruned_loss=0.08494, over 23426.00 frames. ], tot_loss[loss=0.224, simple_loss=0.2888, pruned_loss=0.07954, over 4494738.55 frames. ], batch size: 285, lr: 1.33e-02, grad_scale: 16.0 2023-09-29 04:41:01,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:41:08,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:41:12,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 04:41:14,248 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 04:41:15,746 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 04:41:18,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:41:21,153 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:41:24,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 04:41:24,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:41:26,237 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=251960.0, ans=0.125 2023-09-29 04:41:30,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 04:41:33,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:41:33,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:41:33,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:41:40,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:41:40,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:41:43,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:41:51,161 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 04:41:55,046 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=252093.33333333334, ans=0.125 2023-09-29 04:41:56,271 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:41:56,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:41:56,290 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:42:02,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 04:42:07,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 04:42:08,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:42:12,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 04:42:12,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:42:15,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 04:42:15,926 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:42:16,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 04:42:16,794 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.70 vs. limit=15.0 2023-09-29 04:42:22,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 04:42:23,009 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.69 vs. limit=12.0 2023-09-29 04:42:25,709 INFO [train.py:1039] (2/4) Epoch 8, batch 650, loss[loss=0.2102, simple_loss=0.2619, pruned_loss=0.07928, over 23827.00 frames. ], tot_loss[loss=0.2233, simple_loss=0.2878, pruned_loss=0.07934, over 4544672.00 frames. ], batch size: 164, lr: 1.33e-02, grad_scale: 16.0 2023-09-29 04:42:25,802 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 04:42:26,200 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=252226.66666666666, ans=0.125 2023-09-29 04:42:28,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:42:29,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:42:31,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:42:34,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 04:42:34,635 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=252226.66666666666, ans=0.0 2023-09-29 04:42:35,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:42:40,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:42:40,568 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:42:43,657 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:42:47,355 WARNING [train.py:1197] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 04:42:48,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:42:48,978 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:42:52,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:42:54,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 04:42:55,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:42:57,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:42:59,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 04:42:59,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:43:02,781 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 04:43:05,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:43:05,735 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 04:43:05,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:43:05,769 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:43:10,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:43:11,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:43:11,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:43:11,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 04:43:12,040 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=252360.0, ans=0.125 2023-09-29 04:43:13,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 04:43:14,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:43:14,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:43:16,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 04:43:16,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:43:17,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 04:43:19,496 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 04:43:20,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 04:43:20,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:43:20,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:43:21,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:43:21,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:43:23,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:43:26,225 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 2.104e+02 2.347e+02 2.945e+02 4.272e+02, threshold=4.693e+02, percent-clipped=0.0 2023-09-29 04:43:31,424 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:43:32,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:43:34,469 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:43:36,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:43:37,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 04:43:38,608 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:43:46,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 04:43:46,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:43:47,889 INFO [train.py:1039] (2/4) Epoch 8, batch 700, loss[loss=0.2342, simple_loss=0.3004, pruned_loss=0.08398, over 23359.00 frames. ], tot_loss[loss=0.222, simple_loss=0.287, pruned_loss=0.0785, over 4585146.44 frames. ], batch size: 93, lr: 1.33e-02, grad_scale: 8.0 2023-09-29 04:43:47,959 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:43:48,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:43:51,560 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=252560.0, ans=0.125 2023-09-29 04:43:52,698 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 04:43:52,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 04:43:54,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 04:43:56,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:43:57,454 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.24 vs. limit=15.0 2023-09-29 04:43:59,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:44:01,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 04:44:06,500 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:44:09,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:44:11,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:44:13,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 04:44:13,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:44:16,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:44:19,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 04:44:19,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:44:21,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 04:44:22,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 04:44:23,102 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 04:44:25,985 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 04:44:26,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:44:27,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 04:44:30,105 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=252693.33333333334, ans=0.0 2023-09-29 04:44:32,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:44:34,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 04:44:38,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:44:38,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:44:38,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 04:44:43,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:44:45,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:44:48,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:44:53,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:44:53,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 04:44:56,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 04:44:56,284 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 04:44:59,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:45:03,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:45:04,934 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:45:05,159 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:45:06,541 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 04:45:11,566 INFO [train.py:1039] (2/4) Epoch 8, batch 750, loss[loss=0.2353, simple_loss=0.2718, pruned_loss=0.09945, over 19537.00 frames. ], tot_loss[loss=0.2203, simple_loss=0.2851, pruned_loss=0.07778, over 4597227.56 frames. ], batch size: 388, lr: 1.33e-02, grad_scale: 8.0 2023-09-29 04:45:11,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 04:45:11,676 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 04:45:11,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 04:45:13,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 04:45:13,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 04:45:14,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:45:16,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 04:45:17,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:45:17,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 04:45:19,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:45:21,329 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:45:21,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 04:45:21,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:45:24,558 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:45:26,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:45:28,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:45:32,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:45:33,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:45:33,997 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 04:45:35,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:45:36,440 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.45 vs. limit=15.0 2023-09-29 04:45:36,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:45:37,129 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:45:39,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 04:45:40,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 04:45:40,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:45:42,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 04:45:44,035 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 04:45:45,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 04:45:45,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:45:45,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 04:45:47,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:45:55,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 04:45:55,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:45:55,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 04:45:58,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:45:58,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:46:00,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 04:46:00,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:46:01,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 04:46:03,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:46:07,128 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=253093.33333333334, ans=0.0 2023-09-29 04:46:08,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:46:08,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 04:46:08,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:46:11,305 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.684e+02 2.056e+02 2.287e+02 2.694e+02 4.439e+02, threshold=4.575e+02, percent-clipped=0.0 2023-09-29 04:46:13,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:46:13,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:46:14,707 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.15 vs. limit=15.0 2023-09-29 04:46:15,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:46:18,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:46:22,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 04:46:22,844 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=253160.0, ans=0.125 2023-09-29 04:46:24,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:46:24,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:46:28,485 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:46:28,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:46:31,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:46:32,915 INFO [train.py:1039] (2/4) Epoch 8, batch 800, loss[loss=0.2319, simple_loss=0.2857, pruned_loss=0.089, over 23788.00 frames. ], tot_loss[loss=0.2204, simple_loss=0.2857, pruned_loss=0.07759, over 4634001.39 frames. ], batch size: 164, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:46:32,976 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 04:46:42,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:46:42,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:46:44,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:46:44,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:46:46,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:46:46,259 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:46:47,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:46:53,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:46:54,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 04:46:56,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 04:46:56,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:46:59,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:46:59,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:46:59,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:46:59,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 04:46:59,843 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:47:01,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 04:47:04,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:47:07,119 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:47:08,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:47:08,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:47:12,037 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=253360.0, ans=0.1 2023-09-29 04:47:13,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:47:13,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:47:17,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:47:18,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 04:47:18,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 04:47:20,147 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 04:47:21,525 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 04:47:21,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 04:47:21,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:47:23,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:47:23,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:47:28,440 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 04:47:29,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 04:47:31,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:47:33,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 04:47:36,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:47:40,021 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:47:41,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 04:47:42,992 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:47:43,201 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=253493.33333333334, ans=0.125 2023-09-29 04:47:44,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 04:47:52,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 04:47:56,237 INFO [train.py:1039] (2/4) Epoch 8, batch 850, loss[loss=0.2091, simple_loss=0.2904, pruned_loss=0.06394, over 24646.00 frames. ], tot_loss[loss=0.2212, simple_loss=0.2869, pruned_loss=0.0777, over 4665782.89 frames. ], batch size: 68, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:47:56,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:47:56,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 04:47:57,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:47:59,287 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:47:59,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 04:48:00,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:48:02,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:48:03,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:48:05,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 04:48:05,971 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.44 vs. limit=15.0 2023-09-29 04:48:06,985 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:48:08,596 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 04:48:08,673 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 04:48:08,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 04:48:10,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 04:48:10,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:48:12,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:48:14,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:48:14,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:48:17,924 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=253626.66666666666, ans=0.125 2023-09-29 04:48:20,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:48:20,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:48:20,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 04:48:24,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 04:48:26,712 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:48:28,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 04:48:32,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 04:48:34,022 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 04:48:37,103 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 04:48:37,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:48:37,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:48:37,163 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 04:48:40,112 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:48:41,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:48:41,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 04:48:43,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:48:45,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:48:47,736 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:48:47,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 04:48:50,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:48:52,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 04:48:52,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 04:48:55,866 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=253760.0, ans=0.0 2023-09-29 04:48:56,836 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.738e+02 2.108e+02 2.249e+02 2.560e+02 3.769e+02, threshold=4.498e+02, percent-clipped=0.0 2023-09-29 04:48:57,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:48:57,083 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:48:58,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:48:58,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:49:00,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:49:00,589 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=253826.66666666666, ans=0.125 2023-09-29 04:49:01,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:49:02,747 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=253826.66666666666, ans=0.0 2023-09-29 04:49:05,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:49:06,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 04:49:06,945 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=253826.66666666666, ans=0.1 2023-09-29 04:49:08,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:49:08,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 04:49:17,908 INFO [train.py:1039] (2/4) Epoch 8, batch 900, loss[loss=0.187, simple_loss=0.263, pruned_loss=0.05553, over 24576.00 frames. ], tot_loss[loss=0.2218, simple_loss=0.2873, pruned_loss=0.07816, over 4672235.17 frames. ], batch size: 60, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:49:18,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 04:49:20,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:49:20,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 04:49:20,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:49:20,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:49:21,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 04:49:27,450 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=253893.33333333334, ans=0.0 2023-09-29 04:49:27,806 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.34 vs. limit=15.0 2023-09-29 04:49:28,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:49:33,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:49:33,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 04:49:36,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 04:49:36,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 04:49:38,505 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 04:49:40,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:49:40,103 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:49:40,169 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 04:49:40,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:49:42,909 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.62 vs. limit=15.0 2023-09-29 04:49:47,405 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=253960.0, ans=0.125 2023-09-29 04:49:48,953 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=253960.0, ans=0.0 2023-09-29 04:49:50,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:49:50,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:49:50,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 04:49:52,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:49:58,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 04:50:00,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:50:04,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 04:50:04,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 04:50:05,777 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 04:50:05,896 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 04:50:15,255 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 04:50:15,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:50:15,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 04:50:15,596 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=254093.33333333334, ans=0.0 2023-09-29 04:50:19,439 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.50 vs. limit=6.0 2023-09-29 04:50:21,610 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:50:21,639 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:50:23,310 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=254160.0, ans=0.125 2023-09-29 04:50:25,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 04:50:25,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:50:28,165 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 04:50:29,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:50:31,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:50:31,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:50:31,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:50:36,749 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 04:50:36,822 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 04:50:38,318 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 04:50:38,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 04:50:38,650 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=254160.0, ans=0.125 2023-09-29 04:50:38,651 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=254160.0, ans=0.125 2023-09-29 04:50:41,192 INFO [train.py:1039] (2/4) Epoch 8, batch 950, loss[loss=0.2086, simple_loss=0.2864, pruned_loss=0.06542, over 24460.00 frames. ], tot_loss[loss=0.2221, simple_loss=0.2877, pruned_loss=0.07826, over 4681339.25 frames. ], batch size: 66, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:50:41,371 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:50:46,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 04:50:51,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:50:53,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:50:53,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:50:55,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 04:50:56,835 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 04:51:01,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:51:02,037 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:51:02,182 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=254293.33333333334, ans=0.125 2023-09-29 04:51:03,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:51:03,714 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=254293.33333333334, ans=0.0 2023-09-29 04:51:04,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:51:04,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 04:51:04,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 04:51:07,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:51:08,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 04:51:08,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:51:11,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:51:12,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:51:13,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:51:13,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 04:51:15,239 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 04:51:18,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:51:20,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 04:51:24,770 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:51:24,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:51:25,782 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.31 vs. limit=15.0 2023-09-29 04:51:27,979 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 04:51:29,591 WARNING [train.py:1197] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 04:51:29,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:51:29,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:51:31,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:51:31,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:51:37,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 04:51:39,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:51:41,994 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 1.975e+02 2.273e+02 2.583e+02 4.078e+02, threshold=4.545e+02, percent-clipped=0.0 2023-09-29 04:51:42,116 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:51:42,217 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:51:42,246 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 04:51:42,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:51:42,274 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:51:42,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 04:51:48,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:51:52,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:51:59,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:51:59,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 04:51:59,830 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=254493.33333333334, ans=0.0 2023-09-29 04:52:01,034 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 04:52:03,997 INFO [train.py:1039] (2/4) Epoch 8, batch 1000, loss[loss=0.207, simple_loss=0.2633, pruned_loss=0.07537, over 23683.00 frames. ], tot_loss[loss=0.2212, simple_loss=0.286, pruned_loss=0.0782, over 4690777.78 frames. ], batch size: 232, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:52:04,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:52:07,366 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 04:52:09,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:52:15,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:52:16,918 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 04:52:16,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 04:52:22,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:52:22,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:52:23,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:52:27,501 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 04:52:31,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 04:52:32,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 04:52:32,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:52:34,569 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 04:52:34,797 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=254626.66666666666, ans=0.125 2023-09-29 04:52:37,488 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 04:52:37,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 04:52:39,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:52:40,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:52:48,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:52:50,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:52:50,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:52:51,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:52:51,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 04:52:52,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:52:52,393 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=254760.0, ans=0.125 2023-09-29 04:52:53,520 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:52:53,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:52:53,711 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 04:52:58,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 04:52:58,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 04:53:00,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 04:53:02,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:53:08,301 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.44 vs. limit=6.0 2023-09-29 04:53:09,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:53:09,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:53:10,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:53:10,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:53:14,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 04:53:15,665 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:53:15,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 04:53:17,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 04:53:17,964 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.80 vs. limit=10.0 2023-09-29 04:53:19,151 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:53:19,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:53:22,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:53:22,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 04:53:22,924 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.30 vs. limit=12.0 2023-09-29 04:53:25,954 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:53:27,361 INFO [train.py:1039] (2/4) Epoch 8, batch 1050, loss[loss=0.1954, simple_loss=0.264, pruned_loss=0.06342, over 24604.00 frames. ], tot_loss[loss=0.2203, simple_loss=0.2854, pruned_loss=0.07758, over 4699191.40 frames. ], batch size: 60, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:53:30,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:53:30,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 04:53:30,855 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer_ff3.min_abs, batch_count=254893.33333333334, ans=0.2 2023-09-29 04:53:30,891 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=254893.33333333334, ans=0.125 2023-09-29 04:53:32,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 04:53:33,698 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:53:38,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:53:39,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 04:53:41,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:53:42,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:53:44,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:53:44,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 04:53:44,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:53:46,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 04:53:47,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:53:47,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 04:53:50,603 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:53:50,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 04:53:50,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 04:53:57,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:53:59,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 04:54:00,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:54:01,187 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=255026.66666666666, ans=0.1 2023-09-29 04:54:03,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 04:54:03,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 04:54:03,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:54:07,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 04:54:10,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 04:54:12,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:54:14,716 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=255026.66666666666, ans=0.2 2023-09-29 04:54:15,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 04:54:19,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 04:54:19,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:54:19,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:54:22,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:54:27,431 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 04:54:28,825 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.631e+02 2.059e+02 2.267e+02 2.771e+02 5.438e+02, threshold=4.534e+02, percent-clipped=2.0 2023-09-29 04:54:29,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 04:54:29,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 04:54:29,292 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=255093.33333333334, ans=0.0 2023-09-29 04:54:30,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:54:30,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 04:54:32,109 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 04:54:37,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:54:38,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:54:38,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:54:38,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:54:38,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:54:39,060 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=255160.0, ans=0.125 2023-09-29 04:54:44,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:54:44,057 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 04:54:46,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:54:46,189 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 04:54:47,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 04:54:47,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:54:50,436 INFO [train.py:1039] (2/4) Epoch 8, batch 1100, loss[loss=0.2198, simple_loss=0.2857, pruned_loss=0.07693, over 23537.00 frames. ], tot_loss[loss=0.2197, simple_loss=0.2851, pruned_loss=0.07717, over 4701584.49 frames. ], batch size: 149, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:54:52,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:54:56,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:55:00,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:55:03,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 04:55:03,796 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:55:03,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 04:55:05,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:55:05,671 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=255293.33333333334, ans=0.1 2023-09-29 04:55:08,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:55:10,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:55:13,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:55:15,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 04:55:15,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 04:55:17,525 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:55:17,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:55:20,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:55:23,833 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:55:29,843 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:55:32,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 04:55:34,449 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 04:55:35,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:55:37,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:55:39,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 04:55:40,973 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:55:41,258 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=255426.66666666666, ans=0.125 2023-09-29 04:55:42,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 04:55:43,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:55:43,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:55:43,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:55:44,024 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:55:45,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 04:55:50,735 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:55:50,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 04:55:54,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:55:56,551 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.12 vs. limit=22.5 2023-09-29 04:55:59,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 04:55:59,805 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=255493.33333333334, ans=0.2 2023-09-29 04:56:01,076 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 04:56:01,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 04:56:02,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:56:05,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:56:05,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:56:07,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 04:56:08,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:56:08,608 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:56:08,779 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=255493.33333333334, ans=0.2 2023-09-29 04:56:10,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 04:56:10,144 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:56:10,675 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.07 vs. limit=15.0 2023-09-29 04:56:11,507 INFO [train.py:1039] (2/4) Epoch 8, batch 1150, loss[loss=0.2022, simple_loss=0.2757, pruned_loss=0.06432, over 24460.00 frames. ], tot_loss[loss=0.2205, simple_loss=0.2859, pruned_loss=0.07757, over 4697245.20 frames. ], batch size: 63, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:56:11,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 04:56:13,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:56:13,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:56:15,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 04:56:19,040 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.45 vs. limit=15.0 2023-09-29 04:56:19,194 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.82 vs. limit=6.0 2023-09-29 04:56:19,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:56:21,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:56:25,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:56:25,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:56:25,501 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 04:56:26,075 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=9.15 vs. limit=12.0 2023-09-29 04:56:26,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:56:29,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 04:56:32,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:56:32,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 04:56:36,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 04:56:38,590 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:56:40,266 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=255626.66666666666, ans=0.1 2023-09-29 04:56:40,298 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=255626.66666666666, ans=0.125 2023-09-29 04:56:41,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:56:43,090 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:56:43,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 04:56:43,175 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 04:56:44,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:56:48,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 04:56:49,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:56:51,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:57:03,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:57:09,756 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:57:09,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 04:57:11,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:57:11,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:57:12,647 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.558e+02 2.096e+02 2.373e+02 2.802e+02 4.520e+02, threshold=4.746e+02, percent-clipped=0.0 2023-09-29 04:57:17,544 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 04:57:17,843 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=255826.66666666666, ans=0.125 2023-09-29 04:57:19,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:57:27,162 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 04:57:30,232 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:57:32,330 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 04:57:32,374 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 04:57:33,612 INFO [train.py:1039] (2/4) Epoch 8, batch 1200, loss[loss=0.2391, simple_loss=0.2934, pruned_loss=0.09239, over 23863.00 frames. ], tot_loss[loss=0.2214, simple_loss=0.2868, pruned_loss=0.07805, over 4706216.23 frames. ], batch size: 195, lr: 1.32e-02, grad_scale: 32.0 2023-09-29 04:57:33,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:57:37,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:57:37,946 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=255893.33333333334, ans=0.1 2023-09-29 04:57:42,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:57:43,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 04:57:45,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:57:45,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:57:45,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:57:46,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:57:48,311 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 04:57:49,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:57:49,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:57:51,443 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 04:57:56,556 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 04:58:00,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 04:58:03,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:58:05,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:58:07,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:58:07,035 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 04:58:09,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:58:11,757 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.90 vs. limit=15.0 2023-09-29 04:58:13,960 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=256026.66666666666, ans=0.0 2023-09-29 04:58:15,882 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.42 vs. limit=22.5 2023-09-29 04:58:18,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 04:58:18,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:58:18,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 04:58:20,460 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:58:23,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 04:58:24,002 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=256093.33333333334, ans=0.1 2023-09-29 04:58:26,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 04:58:26,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:58:28,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:58:29,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:58:29,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 04:58:31,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:58:31,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:58:33,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:58:33,846 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 04:58:33,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 04:58:35,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:58:35,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 04:58:36,203 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.30 vs. limit=15.0 2023-09-29 04:58:38,477 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:58:38,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:58:43,421 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 04:58:46,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:58:48,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 04:58:50,442 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 04:58:53,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:58:56,273 INFO [train.py:1039] (2/4) Epoch 8, batch 1250, loss[loss=0.3002, simple_loss=0.3331, pruned_loss=0.1337, over 19477.00 frames. ], tot_loss[loss=0.2232, simple_loss=0.2881, pruned_loss=0.07913, over 4707986.32 frames. ], batch size: 388, lr: 1.32e-02, grad_scale: 8.0 2023-09-29 04:58:56,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:58:56,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:58:59,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:59:01,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 04:59:06,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:59:08,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:59:08,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 04:59:11,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:59:12,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 04:59:15,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 04:59:17,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:59:18,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:59:18,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:59:19,160 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=256293.33333333334, ans=0.125 2023-09-29 04:59:20,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 04:59:25,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 04:59:25,417 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 04:59:25,425 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:59:25,613 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:59:27,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:59:30,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:59:31,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 04:59:35,606 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.92 vs. limit=10.0 2023-09-29 04:59:36,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 04:59:36,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 04:59:40,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:59:41,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 04:59:43,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:59:43,059 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 04:59:43,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:59:43,108 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:59:46,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:59:50,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:59:51,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:59:52,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 04:59:52,151 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 04:59:52,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 04:59:55,483 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=256426.66666666666, ans=0.0 2023-09-29 04:59:56,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:59:58,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 04:59:58,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:00:01,788 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.530e+02 1.911e+02 2.146e+02 2.412e+02 3.765e+02, threshold=4.292e+02, percent-clipped=0.0 2023-09-29 05:00:03,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 05:00:03,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:00:05,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 05:00:05,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 05:00:05,233 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:00:05,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 05:00:06,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:00:08,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 05:00:10,027 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:00:10,316 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=256493.33333333334, ans=0.125 2023-09-29 05:00:12,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:00:13,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 05:00:16,522 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 05:00:19,984 INFO [train.py:1039] (2/4) Epoch 8, batch 1300, loss[loss=0.2256, simple_loss=0.2882, pruned_loss=0.08151, over 23275.00 frames. ], tot_loss[loss=0.2227, simple_loss=0.288, pruned_loss=0.07874, over 4710315.69 frames. ], batch size: 119, lr: 1.32e-02, grad_scale: 8.0 2023-09-29 05:00:21,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:00:21,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 05:00:25,219 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:00:26,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 05:00:26,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:00:28,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:00:31,658 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:00:33,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 05:00:36,406 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=256626.66666666666, ans=0.125 2023-09-29 05:00:39,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:00:39,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 05:00:42,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 05:00:44,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 05:00:49,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:00:50,873 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:00:51,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:00:53,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:00:54,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 05:00:55,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 05:00:55,444 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=256693.33333333334, ans=0.125 2023-09-29 05:00:56,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 05:00:56,933 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=256693.33333333334, ans=0.125 2023-09-29 05:01:02,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:01:02,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 05:01:04,118 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 05:01:06,208 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 05:01:09,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:01:09,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:01:10,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 05:01:10,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:01:12,376 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 05:01:13,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:01:17,203 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:01:17,207 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:01:22,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 05:01:22,253 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 05:01:23,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 05:01:27,034 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:01:30,906 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 05:01:34,288 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:01:42,324 INFO [train.py:1039] (2/4) Epoch 8, batch 1350, loss[loss=0.2234, simple_loss=0.2985, pruned_loss=0.0741, over 23965.00 frames. ], tot_loss[loss=0.2214, simple_loss=0.2874, pruned_loss=0.07776, over 4721397.46 frames. ], batch size: 80, lr: 1.32e-02, grad_scale: 8.0 2023-09-29 05:01:42,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 05:01:46,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:01:48,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:01:51,654 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:01:51,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:01:53,451 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=256893.33333333334, ans=0.2 2023-09-29 05:01:54,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:01:54,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 05:01:58,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 05:02:00,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 05:02:02,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 05:02:03,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:02:06,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 05:02:06,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:02:07,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:02:07,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 05:02:10,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 05:02:12,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 05:02:14,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:02:14,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 05:02:21,187 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 05:02:27,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:02:37,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:02:37,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:02:37,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 05:02:41,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:02:42,321 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.11 vs. limit=15.0 2023-09-29 05:02:44,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 05:02:44,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 05:02:45,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:02:47,102 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.702e+02 2.144e+02 2.487e+02 2.898e+02 4.537e+02, threshold=4.974e+02, percent-clipped=1.0 2023-09-29 05:02:48,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:02:50,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 05:02:53,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:02:58,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 05:03:00,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 05:03:04,324 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=257226.66666666666, ans=0.125 2023-09-29 05:03:05,412 INFO [train.py:1039] (2/4) Epoch 8, batch 1400, loss[loss=0.2069, simple_loss=0.267, pruned_loss=0.07345, over 23705.00 frames. ], tot_loss[loss=0.2204, simple_loss=0.2862, pruned_loss=0.07732, over 4716953.11 frames. ], batch size: 149, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:03:08,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 05:03:10,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:03:12,257 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:03:12,583 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=257226.66666666666, ans=0.1 2023-09-29 05:03:13,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:03:19,241 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 05:03:22,107 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 05:03:30,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:03:32,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:03:34,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:03:34,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 05:03:38,159 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:03:38,305 WARNING [train.py:1197] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 05:03:48,070 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:03:49,802 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.10 vs. limit=15.0 2023-09-29 05:03:50,162 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:03:54,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 05:03:54,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 05:03:56,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:03:56,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:03:57,808 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:03:59,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:03:59,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:04:01,269 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:04:01,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 05:04:02,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:04:04,544 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=257426.66666666666, ans=0.125 2023-09-29 05:04:06,399 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 05:04:06,469 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=257426.66666666666, ans=0.0 2023-09-29 05:04:07,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:04:11,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:04:13,176 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=257493.33333333334, ans=0.0 2023-09-29 05:04:19,716 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 05:04:20,041 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=257493.33333333334, ans=0.125 2023-09-29 05:04:21,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 05:04:21,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:04:21,588 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=257493.33333333334, ans=0.95 2023-09-29 05:04:24,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 05:04:25,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:04:28,123 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:04:29,534 INFO [train.py:1039] (2/4) Epoch 8, batch 1450, loss[loss=0.1966, simple_loss=0.2659, pruned_loss=0.0636, over 24572.00 frames. ], tot_loss[loss=0.2193, simple_loss=0.285, pruned_loss=0.07679, over 4715840.23 frames. ], batch size: 60, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:04:31,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 05:04:34,456 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:04:34,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:04:34,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 05:04:39,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:04:41,215 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 05:04:42,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:04:42,851 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 05:04:44,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 05:04:46,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 05:04:46,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:04:48,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:04:48,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 05:04:50,046 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:04:50,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 05:04:51,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 05:04:52,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:04:54,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:04:55,977 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:04:59,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:05:02,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:05:02,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:05:05,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:05:05,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:05:10,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:05:10,072 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:05:10,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:05:10,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:05:13,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 05:05:15,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:05:20,620 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 05:05:22,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:05:22,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:05:25,881 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:05:27,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 05:05:30,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:05:32,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 05:05:33,289 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 2.099e+02 2.346e+02 2.740e+02 3.754e+02, threshold=4.692e+02, percent-clipped=0.0 2023-09-29 05:05:33,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 05:05:35,092 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:05:35,470 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=257826.66666666666, ans=0.125 2023-09-29 05:05:36,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:05:38,920 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:05:40,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 05:05:41,365 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=7.53 vs. limit=15.0 2023-09-29 05:05:42,051 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=257826.66666666666, ans=0.2 2023-09-29 05:05:44,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 05:05:44,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 05:05:46,335 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:05:48,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 05:05:51,400 INFO [train.py:1039] (2/4) Epoch 8, batch 1500, loss[loss=0.2077, simple_loss=0.2766, pruned_loss=0.06939, over 24316.00 frames. ], tot_loss[loss=0.2189, simple_loss=0.2851, pruned_loss=0.07635, over 4720549.90 frames. ], batch size: 56, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:05:55,510 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=257893.33333333334, ans=0.125 2023-09-29 05:05:57,170 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=257893.33333333334, ans=0.0 2023-09-29 05:05:58,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 05:06:00,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 05:06:00,498 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:06:01,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:06:02,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:06:03,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:06:05,065 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 05:06:05,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 05:06:06,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 05:06:06,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:06:08,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:06:09,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:06:11,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:06:15,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:06:15,171 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 05:06:16,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 05:06:16,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:06:16,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:06:18,664 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=257960.0, ans=10.0 2023-09-29 05:06:20,232 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=257960.0, ans=0.04949747468305833 2023-09-29 05:06:22,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 05:06:27,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 05:06:28,564 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:06:28,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 05:06:31,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 05:06:33,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 05:06:33,929 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:06:33,951 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:06:36,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 05:06:36,945 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:06:38,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:06:38,514 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 05:06:39,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:06:44,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:06:44,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 05:06:45,585 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=12.04 vs. limit=15.0 2023-09-29 05:06:52,466 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 05:06:53,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 05:06:59,034 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 05:07:00,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:07:00,502 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 05:07:02,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:07:02,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:07:02,271 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 05:07:02,974 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=15.99 vs. limit=15.0 2023-09-29 05:07:05,902 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 05:07:07,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 05:07:07,932 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 05:07:11,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:07:12,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:07:12,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:07:13,127 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=258226.66666666666, ans=0.125 2023-09-29 05:07:14,189 INFO [train.py:1039] (2/4) Epoch 8, batch 1550, loss[loss=0.1732, simple_loss=0.2454, pruned_loss=0.05043, over 24382.00 frames. ], tot_loss[loss=0.2185, simple_loss=0.285, pruned_loss=0.07597, over 4722590.88 frames. ], batch size: 56, lr: 1.31e-02, grad_scale: 4.0 2023-09-29 05:07:14,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:07:14,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:07:14,529 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=258226.66666666666, ans=0.0 2023-09-29 05:07:15,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 05:07:17,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 05:07:18,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 05:07:18,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:07:20,405 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 05:07:20,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 05:07:22,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:07:23,715 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:07:23,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:07:25,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:07:27,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:07:27,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:07:29,093 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 05:07:29,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:07:30,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:07:30,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 05:07:32,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 05:07:32,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 05:07:34,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:07:36,411 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 05:07:36,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 05:07:36,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 05:07:36,847 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=258293.33333333334, ans=0.125 2023-09-29 05:07:38,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:07:39,056 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.38 vs. limit=22.5 2023-09-29 05:07:40,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:07:43,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:07:43,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 05:07:43,544 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 05:07:46,743 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=258360.0, ans=0.125 2023-09-29 05:07:53,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:07:57,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:07:57,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 05:07:58,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:07:58,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 05:08:04,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 05:08:06,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:08:10,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:08:13,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:08:15,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:08:15,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 05:08:15,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 05:08:18,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:08:18,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:08:19,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 05:08:19,657 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 05:08:20,934 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 1.986e+02 2.177e+02 2.778e+02 5.075e+02, threshold=4.355e+02, percent-clipped=1.0 2023-09-29 05:08:21,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:08:27,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 05:08:31,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:08:31,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:08:33,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 05:08:36,185 INFO [train.py:1039] (2/4) Epoch 8, batch 1600, loss[loss=0.2052, simple_loss=0.2896, pruned_loss=0.06043, over 24645.00 frames. ], tot_loss[loss=0.2195, simple_loss=0.2866, pruned_loss=0.0762, over 4731975.57 frames. ], batch size: 68, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:08:37,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 05:08:38,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:08:38,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:08:38,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:08:39,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:08:43,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:08:43,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 05:08:44,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=258560.0, ans=0.125 2023-09-29 05:08:45,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 05:08:46,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 05:08:50,481 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:08:52,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 05:08:53,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:08:55,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:08:59,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:09:02,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 05:09:06,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:09:07,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 05:09:07,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:09:07,777 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=258693.33333333334, ans=0.125 2023-09-29 05:09:09,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 05:09:14,513 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=258693.33333333334, ans=0.125 2023-09-29 05:09:15,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 05:09:24,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:09:24,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 05:09:26,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:09:26,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:09:26,785 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:09:29,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 05:09:33,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 05:09:34,782 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:09:36,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:09:36,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:09:36,441 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:09:39,435 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:09:40,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:09:41,176 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:09:49,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:09:49,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:09:52,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 05:09:52,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:09:53,056 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 05:09:57,224 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=258893.33333333334, ans=0.125 2023-09-29 05:09:58,308 INFO [train.py:1039] (2/4) Epoch 8, batch 1650, loss[loss=0.2207, simple_loss=0.2733, pruned_loss=0.08408, over 23569.00 frames. ], tot_loss[loss=0.2206, simple_loss=0.2871, pruned_loss=0.077, over 4725028.23 frames. ], batch size: 256, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:09:59,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:10:01,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:10:01,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:10:01,393 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 05:10:01,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 05:10:01,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 05:10:03,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 05:10:06,909 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.20 vs. limit=10.0 2023-09-29 05:10:07,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:10:07,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:10:07,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:10:07,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 05:10:10,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:10:13,858 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 05:10:16,701 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:10:16,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:10:16,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:10:16,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:10:16,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 05:10:16,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 05:10:19,559 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=258960.0, ans=0.125 2023-09-29 05:10:22,393 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 05:10:25,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:10:35,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 05:10:35,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:10:37,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 05:10:40,479 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=259026.66666666666, ans=0.125 2023-09-29 05:10:40,623 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=259026.66666666666, ans=0.125 2023-09-29 05:10:41,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:10:43,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:10:43,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:10:44,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:10:46,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:10:47,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:10:50,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:10:50,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:10:50,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:10:51,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:10:53,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:10:54,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 05:10:58,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:10:59,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 05:11:01,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:11:02,713 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.647e+02 1.991e+02 2.189e+02 2.754e+02 4.240e+02, threshold=4.377e+02, percent-clipped=0.0 2023-09-29 05:11:02,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 05:11:02,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 05:11:03,031 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 05:11:03,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:11:04,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:11:04,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:11:04,775 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=259160.0, ans=0.2 2023-09-29 05:11:06,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:11:06,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 05:11:09,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:11:11,176 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:11:11,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:11:12,913 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=259160.0, ans=0.0 2023-09-29 05:11:14,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 05:11:17,573 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=259226.66666666666, ans=0.125 2023-09-29 05:11:18,718 INFO [train.py:1039] (2/4) Epoch 8, batch 1700, loss[loss=0.2285, simple_loss=0.2616, pruned_loss=0.09769, over 19496.00 frames. ], tot_loss[loss=0.2206, simple_loss=0.287, pruned_loss=0.07717, over 4719216.10 frames. ], batch size: 388, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:11:18,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:11:18,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:11:18,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 05:11:19,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:11:19,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:11:19,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:11:20,836 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=259226.66666666666, ans=0.09899494936611666 2023-09-29 05:11:24,464 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=259226.66666666666, ans=0.0 2023-09-29 05:11:25,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:11:25,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:11:25,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 05:11:28,473 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 05:11:37,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:11:39,306 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.76 vs. limit=15.0 2023-09-29 05:11:40,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:11:47,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 05:11:47,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:11:48,524 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:11:48,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:11:51,580 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 05:11:53,261 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:11:53,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:11:54,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 05:11:55,270 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=259360.0, ans=0.125 2023-09-29 05:11:56,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 05:11:57,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 05:11:58,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 05:12:00,332 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:12:02,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 05:12:03,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:12:12,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:12:14,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:12:15,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:12:15,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 05:12:15,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 05:12:17,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:12:18,882 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:12:18,883 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 05:12:18,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:12:18,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:12:20,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:12:20,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:12:22,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:12:22,044 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:12:24,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:12:25,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:12:25,123 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:12:29,601 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:12:29,752 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 05:12:33,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:12:35,340 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:12:35,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=259493.33333333334, ans=0.0 2023-09-29 05:12:38,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 05:12:41,286 INFO [train.py:1039] (2/4) Epoch 8, batch 1750, loss[loss=0.2193, simple_loss=0.2682, pruned_loss=0.08521, over 22548.00 frames. ], tot_loss[loss=0.2193, simple_loss=0.286, pruned_loss=0.07629, over 4734693.91 frames. ], batch size: 322, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:12:43,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:12:45,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:12:45,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 05:12:47,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 05:12:47,062 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:12:51,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:12:51,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:12:54,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 05:12:57,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:13:00,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 05:13:00,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:13:02,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:13:05,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 05:13:05,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 05:13:06,823 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=15.46 vs. limit=22.5 2023-09-29 05:13:08,938 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:13:08,985 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 05:13:17,341 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=259693.33333333334, ans=0.025 2023-09-29 05:13:17,733 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.01 vs. limit=10.0 2023-09-29 05:13:19,881 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:13:22,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:13:22,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:13:27,458 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:13:27,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:13:29,036 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:13:29,466 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=259760.0, ans=0.0 2023-09-29 05:13:30,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:13:33,687 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:13:33,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:13:35,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 05:13:35,570 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=259760.0, ans=0.125 2023-09-29 05:13:36,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:13:39,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 05:13:39,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:13:41,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:13:42,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:13:47,671 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.659e+02 1.975e+02 2.294e+02 2.712e+02 4.778e+02, threshold=4.588e+02, percent-clipped=2.0 2023-09-29 05:13:47,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 05:13:47,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 05:13:49,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:13:49,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:13:54,869 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=259826.66666666666, ans=0.125 2023-09-29 05:13:56,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:13:58,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:14:00,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:14:02,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 05:14:02,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:14:03,470 INFO [train.py:1039] (2/4) Epoch 8, batch 1800, loss[loss=0.232, simple_loss=0.2909, pruned_loss=0.08657, over 23723.00 frames. ], tot_loss[loss=0.2187, simple_loss=0.2858, pruned_loss=0.07579, over 4728238.70 frames. ], batch size: 150, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:14:03,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 05:14:03,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:14:03,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:14:03,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:14:05,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 05:14:08,168 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:14:09,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:14:11,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 05:14:14,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:14:17,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 05:14:19,528 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:14:23,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:14:25,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:14:26,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:14:28,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:14:30,368 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:14:30,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 05:14:31,962 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:14:34,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:14:38,124 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 05:14:41,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 05:14:41,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 05:14:41,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:14:41,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:14:41,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:14:42,788 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:14:48,087 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.26 vs. limit=15.0 2023-09-29 05:14:49,299 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=260026.66666666666, ans=0.04949747468305833 2023-09-29 05:14:50,435 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 05:14:50,620 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:14:52,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:14:55,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 05:14:55,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 05:14:57,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 05:14:59,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:15:00,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 05:15:06,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 05:15:12,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:15:12,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 05:15:12,814 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=260160.0, ans=0.2 2023-09-29 05:15:14,006 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:15:14,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:15:14,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:15:14,187 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 05:15:17,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:15:17,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:15:20,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 05:15:20,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:15:20,688 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=260160.0, ans=0.125 2023-09-29 05:15:21,900 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:15:21,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 05:15:21,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:15:24,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:15:24,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:15:25,465 INFO [train.py:1039] (2/4) Epoch 8, batch 1850, loss[loss=0.239, simple_loss=0.2953, pruned_loss=0.09139, over 23808.00 frames. ], tot_loss[loss=0.2184, simple_loss=0.2857, pruned_loss=0.0755, over 4731796.07 frames. ], batch size: 179, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:15:27,631 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:15:27,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:15:28,702 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=13.38 vs. limit=15.0 2023-09-29 05:15:31,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:15:32,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:15:39,586 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=260226.66666666666, ans=0.0 2023-09-29 05:15:42,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:15:42,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 05:15:44,402 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=260293.33333333334, ans=0.0 2023-09-29 05:15:45,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 05:15:48,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 05:15:49,632 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.49 vs. limit=22.5 2023-09-29 05:15:52,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:15:52,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 05:15:52,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 05:16:02,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:16:03,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 05:16:07,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:16:07,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:16:11,349 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=260360.0, ans=0.2 2023-09-29 05:16:12,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 05:16:14,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:16:14,103 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 05:16:15,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:16:18,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:16:19,070 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=260426.66666666666, ans=0.5 2023-09-29 05:16:21,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:16:24,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 05:16:24,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:16:24,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 05:16:26,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:16:27,787 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:16:29,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:16:30,631 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 1.959e+02 2.142e+02 2.407e+02 4.178e+02, threshold=4.283e+02, percent-clipped=0.0 2023-09-29 05:16:33,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 05:16:35,306 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:16:38,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:16:39,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:16:39,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 05:16:39,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 05:16:41,613 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=260493.33333333334, ans=0.0 2023-09-29 05:16:43,452 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 05:16:45,000 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 05:16:45,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 05:16:46,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:16:46,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:16:46,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:16:47,975 INFO [train.py:1039] (2/4) Epoch 8, batch 1900, loss[loss=0.1794, simple_loss=0.2504, pruned_loss=0.05417, over 24457.00 frames. ], tot_loss[loss=0.2189, simple_loss=0.2861, pruned_loss=0.07581, over 4722757.95 frames. ], batch size: 58, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:16:48,077 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 05:16:48,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:16:48,165 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:16:48,413 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=260560.0, ans=0.95 2023-09-29 05:16:49,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 05:16:51,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 05:16:52,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:16:52,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 05:16:54,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:16:54,278 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 05:16:54,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:16:55,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:16:59,339 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=260560.0, ans=0.1 2023-09-29 05:17:00,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:17:03,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:17:03,816 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 05:17:05,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 05:17:05,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:17:06,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:17:06,896 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 05:17:08,334 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 05:17:11,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 05:17:13,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:17:19,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 05:17:22,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 05:17:22,799 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=260693.33333333334, ans=0.1 2023-09-29 05:17:31,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 05:17:33,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 05:17:33,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:17:33,462 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 05:17:33,468 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 05:17:33,520 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 05:17:33,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 05:17:33,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:17:38,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 05:17:41,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:17:44,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:17:44,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 05:17:46,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:17:51,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 05:17:51,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 05:17:57,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 05:17:57,707 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:17:57,727 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:17:59,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:17:59,625 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=260826.66666666666, ans=0.125 2023-09-29 05:18:00,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 05:18:00,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 05:18:02,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:18:05,629 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:18:05,632 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:18:05,882 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=260826.66666666666, ans=0.0 2023-09-29 05:18:08,563 INFO [train.py:1039] (2/4) Epoch 8, batch 1950, loss[loss=0.2337, simple_loss=0.2998, pruned_loss=0.08382, over 23535.00 frames. ], tot_loss[loss=0.2206, simple_loss=0.2873, pruned_loss=0.07699, over 4714644.75 frames. ], batch size: 106, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:18:08,704 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:18:08,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:18:08,766 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 05:18:11,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:18:13,259 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:18:16,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:18:16,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:18:16,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 05:18:19,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 05:18:19,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 05:18:21,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:18:23,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:18:27,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:18:27,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:18:27,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:18:28,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:18:30,451 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:18:30,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 05:18:30,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:18:31,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:18:32,249 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=260960.0, ans=0.0 2023-09-29 05:18:35,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:18:38,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:18:38,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:18:38,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 05:18:38,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 05:18:38,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 05:18:39,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:18:39,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:18:44,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:18:46,183 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=261026.66666666666, ans=0.2 2023-09-29 05:18:47,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:18:51,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 05:18:55,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:18:55,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 05:18:55,622 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 05:18:56,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:19:00,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:19:01,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:19:02,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 05:19:12,488 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:19:12,798 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=261160.0, ans=0.0 2023-09-29 05:19:14,004 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:19:15,359 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.692e+02 2.025e+02 2.337e+02 2.726e+02 4.544e+02, threshold=4.674e+02, percent-clipped=3.0 2023-09-29 05:19:17,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:19:19,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:19:20,253 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=261160.0, ans=0.0 2023-09-29 05:19:21,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:19:23,077 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:19:25,175 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 05:19:25,183 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 05:19:26,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:19:26,858 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 05:19:30,278 INFO [train.py:1039] (2/4) Epoch 8, batch 2000, loss[loss=0.2217, simple_loss=0.295, pruned_loss=0.07416, over 24593.00 frames. ], tot_loss[loss=0.2206, simple_loss=0.2875, pruned_loss=0.07688, over 4714417.73 frames. ], batch size: 71, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:19:30,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:19:35,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 05:19:35,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:19:37,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:19:39,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:19:40,574 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:19:40,915 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=261226.66666666666, ans=0.1 2023-09-29 05:19:43,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 05:19:43,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 05:19:46,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:19:47,000 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=261293.33333333334, ans=0.1 2023-09-29 05:19:49,759 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 05:19:49,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 05:19:49,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:19:53,502 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.86 vs. limit=22.5 2023-09-29 05:19:54,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:19:56,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 05:19:57,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:19:59,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:19:59,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:20:01,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 05:20:01,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 05:20:03,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 05:20:03,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:20:05,511 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=261360.0, ans=0.0 2023-09-29 05:20:06,840 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:20:08,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 05:20:08,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:20:08,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:20:10,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:20:12,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 05:20:15,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 05:20:15,506 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:20:15,518 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:20:20,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:20:21,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:20:21,826 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 05:20:22,099 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=261426.66666666666, ans=0.125 2023-09-29 05:20:23,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:20:26,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:20:26,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:20:26,338 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 05:20:27,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:20:29,201 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:20:30,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:20:33,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 05:20:39,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 05:20:39,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:20:43,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:20:43,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:20:48,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:20:49,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:20:49,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:20:49,989 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=261493.33333333334, ans=0.0 2023-09-29 05:20:51,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 05:20:52,726 INFO [train.py:1039] (2/4) Epoch 8, batch 2050, loss[loss=0.2151, simple_loss=0.2617, pruned_loss=0.08426, over 19597.00 frames. ], tot_loss[loss=0.2206, simple_loss=0.2873, pruned_loss=0.07696, over 4715051.89 frames. ], batch size: 388, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:20:52,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 05:20:54,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:20:54,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:20:56,426 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=261560.0, ans=0.1 2023-09-29 05:20:56,614 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.98 vs. limit=15.0 2023-09-29 05:20:57,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:20:57,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:20:57,868 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=261560.0, ans=0.1 2023-09-29 05:21:05,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:21:07,013 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:21:07,312 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 05:21:08,552 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:21:08,654 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:21:11,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 05:21:11,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:21:11,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:21:11,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 05:21:24,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:21:24,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:21:27,447 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 05:21:28,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:21:29,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 05:21:30,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:21:33,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:21:35,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:21:37,213 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 05:21:37,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:21:38,785 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:21:38,930 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:21:40,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:21:43,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:21:46,850 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:21:47,174 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=261760.0, ans=0.125 2023-09-29 05:21:48,468 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 05:21:50,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:21:54,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 05:21:55,050 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=261760.0, ans=0.125 2023-09-29 05:21:59,946 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:22:01,287 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.781e+02 2.119e+02 2.372e+02 3.018e+02 5.017e+02, threshold=4.745e+02, percent-clipped=2.0 2023-09-29 05:22:01,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 05:22:06,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:22:07,525 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:22:08,600 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=8.48 vs. limit=15.0 2023-09-29 05:22:09,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:22:09,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 05:22:14,311 INFO [train.py:1039] (2/4) Epoch 8, batch 2100, loss[loss=0.2271, simple_loss=0.2823, pruned_loss=0.08597, over 23588.00 frames. ], tot_loss[loss=0.2194, simple_loss=0.2851, pruned_loss=0.07683, over 4699751.37 frames. ], batch size: 135, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:22:14,495 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 05:22:14,495 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:22:14,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:22:16,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 05:22:18,650 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:22:18,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 05:22:18,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 05:22:20,327 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 05:22:23,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:22:23,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:22:26,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:22:28,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:22:28,054 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 05:22:30,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:22:30,197 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 05:22:30,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 05:22:33,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:22:33,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:22:33,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 05:22:33,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 05:22:39,230 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 05:22:39,231 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 05:22:40,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:22:41,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:22:42,749 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=261960.0, ans=0.125 2023-09-29 05:22:45,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:22:47,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 05:22:47,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:22:47,637 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 05:22:51,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 05:22:52,200 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=262026.66666666666, ans=0.1 2023-09-29 05:22:53,399 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:22:53,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 05:22:53,470 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 05:22:53,671 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=262026.66666666666, ans=0.125 2023-09-29 05:22:54,892 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 05:22:56,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 05:22:57,942 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:23:01,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 05:23:02,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 05:23:04,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:23:06,464 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:23:06,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 05:23:06,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:23:06,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:23:07,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:23:07,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 05:23:08,274 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=262093.33333333334, ans=0.1 2023-09-29 05:23:09,496 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 05:23:10,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 05:23:14,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:23:17,224 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:23:17,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 05:23:25,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:23:29,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:23:29,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:23:29,483 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:23:29,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 05:23:31,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 05:23:32,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:23:32,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 05:23:32,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:23:34,131 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:23:35,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 05:23:37,098 INFO [train.py:1039] (2/4) Epoch 8, batch 2150, loss[loss=0.1954, simple_loss=0.2792, pruned_loss=0.05577, over 24565.00 frames. ], tot_loss[loss=0.2181, simple_loss=0.2838, pruned_loss=0.07623, over 4694423.42 frames. ], batch size: 71, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:23:37,239 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 05:23:37,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:23:40,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:23:40,275 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:23:40,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:23:40,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:23:44,225 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=262226.6666666667, ans=0.2 2023-09-29 05:23:45,887 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=262226.6666666667, ans=0.1 2023-09-29 05:23:47,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 05:23:50,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:23:51,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:23:54,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:23:54,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:23:54,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:23:59,548 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:23:59,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:23:59,648 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:24:03,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:24:03,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 05:24:04,241 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=262293.3333333333, ans=0.0 2023-09-29 05:24:09,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:24:09,975 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:24:11,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:24:11,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:24:12,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:24:13,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 05:24:13,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:24:13,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:24:14,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:24:16,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 05:24:17,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:24:19,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:24:19,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:24:20,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 05:24:21,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:24:24,633 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:24:24,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 05:24:26,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:24:26,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 05:24:26,293 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 05:24:31,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:24:32,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:24:33,096 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=262426.6666666667, ans=0.125 2023-09-29 05:24:34,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:24:34,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 05:24:35,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:24:37,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:24:37,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 05:24:40,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 05:24:40,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:24:41,449 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 05:24:41,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:24:41,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:24:42,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 05:24:43,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:24:43,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 05:24:43,046 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 05:24:43,047 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 05:24:44,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 05:24:45,903 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.644e+02 2.226e+02 2.643e+02 3.151e+02 6.561e+02, threshold=5.285e+02, percent-clipped=6.0 2023-09-29 05:24:46,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:24:46,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:24:46,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:24:47,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:24:47,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 05:24:47,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:24:47,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:24:54,608 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=262493.3333333333, ans=0.125 2023-09-29 05:24:57,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:24:58,881 INFO [train.py:1039] (2/4) Epoch 8, batch 2200, loss[loss=0.2377, simple_loss=0.2929, pruned_loss=0.09126, over 23808.00 frames. ], tot_loss[loss=0.2186, simple_loss=0.2847, pruned_loss=0.07628, over 4707958.76 frames. ], batch size: 212, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:24:59,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 05:25:04,234 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:25:08,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:25:10,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:25:11,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:25:11,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 05:25:14,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:25:16,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:25:16,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 05:25:22,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 05:25:23,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 05:25:28,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 05:25:32,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:25:33,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:25:33,742 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:25:35,551 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=262693.3333333333, ans=0.0 2023-09-29 05:25:37,516 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:25:37,560 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 05:25:39,210 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=262693.3333333333, ans=0.035 2023-09-29 05:25:41,586 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=262693.3333333333, ans=0.0 2023-09-29 05:25:42,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 05:25:44,696 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:25:44,814 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 05:25:48,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 05:25:49,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:25:51,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:25:52,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:25:55,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 05:25:57,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:25:58,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 05:26:01,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:26:01,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 05:26:01,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:26:03,338 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=262826.6666666667, ans=0.0 2023-09-29 05:26:04,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:26:04,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:26:04,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:26:04,657 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:26:06,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 05:26:06,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:26:08,524 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 05:26:13,471 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 05:26:14,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:26:16,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:26:16,661 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 05:26:20,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 05:26:20,874 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 05:26:22,240 INFO [train.py:1039] (2/4) Epoch 8, batch 2250, loss[loss=0.2201, simple_loss=0.2797, pruned_loss=0.08025, over 23620.00 frames. ], tot_loss[loss=0.2186, simple_loss=0.285, pruned_loss=0.07606, over 4697666.01 frames. ], batch size: 149, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:26:22,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 05:26:23,963 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 05:26:25,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:26:25,653 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 05:26:27,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:26:28,704 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 05:26:30,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:26:32,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:26:38,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:26:38,341 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 05:26:39,168 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.66 vs. limit=15.0 2023-09-29 05:26:40,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:26:42,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:26:42,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:26:45,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 05:26:45,977 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:26:47,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:26:48,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 05:26:49,213 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=262960.0, ans=0.0 2023-09-29 05:26:51,040 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:26:51,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:26:52,738 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:26:56,033 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=263026.6666666667, ans=0.05 2023-09-29 05:26:57,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:26:58,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 05:26:58,839 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 05:27:00,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 05:27:01,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:27:03,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:27:06,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:27:08,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:27:09,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:27:09,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:27:11,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:27:14,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:27:18,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:27:19,397 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.00 vs. limit=22.5 2023-09-29 05:27:21,457 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 05:27:27,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 05:27:27,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:27:28,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:27:31,594 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.954e+02 2.186e+02 2.448e+02 4.409e+02, threshold=4.373e+02, percent-clipped=0.0 2023-09-29 05:27:36,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 05:27:39,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 05:27:39,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 05:27:39,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:27:39,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:27:43,932 INFO [train.py:1039] (2/4) Epoch 8, batch 2300, loss[loss=0.1986, simple_loss=0.27, pruned_loss=0.06359, over 24453.00 frames. ], tot_loss[loss=0.2197, simple_loss=0.2865, pruned_loss=0.07651, over 4714561.80 frames. ], batch size: 63, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:27:43,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 05:27:45,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:27:45,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:27:52,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:27:52,468 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:27:54,143 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 05:27:55,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:27:56,216 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=263226.6666666667, ans=0.1 2023-09-29 05:28:03,214 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:28:03,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 05:28:04,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:28:04,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:28:04,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 05:28:06,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:28:07,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:28:07,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:28:08,767 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.18 vs. limit=15.0 2023-09-29 05:28:11,063 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:28:14,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 05:28:17,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:28:22,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 05:28:22,635 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:28:25,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:28:29,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:28:29,635 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=263360.0, ans=0.0 2023-09-29 05:28:33,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:28:34,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 05:28:34,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:28:34,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 05:28:37,774 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 05:28:37,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:28:37,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:28:37,885 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:28:39,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:28:40,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 05:28:40,805 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 05:28:42,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 05:28:42,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:28:42,239 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:28:42,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 05:28:51,320 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:28:53,516 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.59 vs. limit=15.0 2023-09-29 05:28:55,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:28:55,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=263493.3333333333, ans=0.2 2023-09-29 05:28:58,552 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:28:58,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:28:58,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 05:29:01,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 05:29:01,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:29:03,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 05:29:03,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 05:29:07,056 INFO [train.py:1039] (2/4) Epoch 8, batch 2350, loss[loss=0.2932, simple_loss=0.3306, pruned_loss=0.1279, over 19877.00 frames. ], tot_loss[loss=0.2209, simple_loss=0.2875, pruned_loss=0.07719, over 4717589.03 frames. ], batch size: 388, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:29:10,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:29:10,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 05:29:16,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 05:29:18,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:29:22,503 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:29:22,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:29:22,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:29:23,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:29:25,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 05:29:27,785 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=263626.6666666667, ans=0.125 2023-09-29 05:29:29,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:29:35,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 05:29:37,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:29:37,565 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=263693.3333333333, ans=0.0 2023-09-29 05:29:39,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 05:29:39,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:29:43,217 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:29:46,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 05:29:47,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:29:49,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:29:50,718 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:29:50,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:29:55,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:29:57,049 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=263760.0, ans=0.1 2023-09-29 05:29:58,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 05:29:58,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:30:00,216 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=263760.0, ans=0.1 2023-09-29 05:30:01,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:30:01,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:30:03,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 05:30:03,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:30:06,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 05:30:08,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 05:30:11,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 05:30:14,766 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.631e+02 2.176e+02 2.529e+02 2.945e+02 4.428e+02, threshold=5.058e+02, percent-clipped=1.0 2023-09-29 05:30:17,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 05:30:17,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:30:17,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 05:30:17,806 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 05:30:19,153 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 05:30:19,554 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=263826.6666666667, ans=0.125 2023-09-29 05:30:20,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 05:30:22,978 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.21 vs. limit=12.0 2023-09-29 05:30:25,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:30:28,202 INFO [train.py:1039] (2/4) Epoch 8, batch 2400, loss[loss=0.2286, simple_loss=0.2796, pruned_loss=0.08882, over 23580.00 frames. ], tot_loss[loss=0.2209, simple_loss=0.2874, pruned_loss=0.07716, over 4717124.41 frames. ], batch size: 256, lr: 1.30e-02, grad_scale: 16.0 2023-09-29 05:30:28,350 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:30:28,644 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=263893.3333333333, ans=0.125 2023-09-29 05:30:32,876 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:30:34,413 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:30:34,659 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=263893.3333333333, ans=0.125 2023-09-29 05:30:35,915 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 05:30:35,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 05:30:43,437 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 05:30:43,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:30:46,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 05:30:46,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:30:47,987 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:30:49,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 05:30:56,765 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:30:58,380 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 05:31:02,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 05:31:07,675 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 05:31:09,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:31:09,526 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=264026.6666666667, ans=0.0 2023-09-29 05:31:10,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:31:16,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:31:18,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 05:31:18,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 05:31:24,914 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:31:28,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:31:31,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:31:32,983 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:31:32,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 05:31:33,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:31:33,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:31:33,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:31:33,144 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 05:31:36,472 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=264160.0, ans=0.125 2023-09-29 05:31:37,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:31:37,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 05:31:39,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 05:31:39,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 05:31:40,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:31:40,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:31:41,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 05:31:41,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 05:31:41,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 05:31:41,173 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 05:31:44,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 05:31:44,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:31:45,635 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:31:45,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:31:47,811 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 05:31:49,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:31:49,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 05:31:51,826 INFO [train.py:1039] (2/4) Epoch 8, batch 2450, loss[loss=0.2192, simple_loss=0.295, pruned_loss=0.07176, over 24017.00 frames. ], tot_loss[loss=0.2186, simple_loss=0.2846, pruned_loss=0.07631, over 4705652.71 frames. ], batch size: 80, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:31:55,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 05:31:55,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:31:59,934 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:31:59,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:32:01,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 05:32:06,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:32:06,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:32:06,476 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=264293.3333333333, ans=0.125 2023-09-29 05:32:07,979 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=264293.3333333333, ans=0.04949747468305833 2023-09-29 05:32:09,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 05:32:10,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:32:10,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:32:10,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 05:32:13,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:32:16,967 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 05:32:17,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:32:21,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 05:32:23,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:32:24,441 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.02 vs. limit=15.0 2023-09-29 05:32:25,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:32:25,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:32:27,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 05:32:27,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:32:35,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:32:37,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:32:37,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:32:38,956 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:32:39,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:32:39,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:32:40,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 05:32:43,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:32:43,856 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:32:48,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:32:48,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:32:53,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:32:53,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 05:32:54,697 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:32:54,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:32:54,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 05:32:56,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:32:57,624 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.95 vs. limit=15.0 2023-09-29 05:32:58,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:33:01,895 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 2.058e+02 2.397e+02 2.730e+02 4.175e+02, threshold=4.793e+02, percent-clipped=0.0 2023-09-29 05:33:05,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:33:05,705 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=264493.3333333333, ans=0.1 2023-09-29 05:33:08,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:33:08,484 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:33:11,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 05:33:11,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:33:13,065 INFO [train.py:1039] (2/4) Epoch 8, batch 2500, loss[loss=0.2304, simple_loss=0.2889, pruned_loss=0.08593, over 23737.00 frames. ], tot_loss[loss=0.2178, simple_loss=0.2837, pruned_loss=0.0759, over 4710992.74 frames. ], batch size: 179, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:33:19,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:33:24,006 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=264560.0, ans=0.125 2023-09-29 05:33:28,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 05:33:28,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:33:31,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:33:31,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 05:33:33,780 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=264626.6666666667, ans=0.09899494936611666 2023-09-29 05:33:37,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 05:33:39,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:33:40,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 05:33:40,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 05:33:42,335 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 05:33:43,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:33:43,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:33:45,303 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 05:33:45,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:33:45,431 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 05:33:45,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:33:50,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:33:50,293 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=264693.3333333333, ans=0.125 2023-09-29 05:33:51,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:33:54,530 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 05:33:54,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 05:33:54,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:33:57,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:34:02,078 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:34:05,922 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:34:11,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:34:16,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 05:34:21,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 05:34:21,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:34:21,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 05:34:22,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:34:22,689 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 05:34:24,206 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 05:34:24,207 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 05:34:24,215 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 05:34:27,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:34:30,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 05:34:30,278 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 05:34:31,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:34:33,171 INFO [train.py:1039] (2/4) Epoch 8, batch 2550, loss[loss=0.2251, simple_loss=0.2832, pruned_loss=0.08344, over 23245.00 frames. ], tot_loss[loss=0.2181, simple_loss=0.2844, pruned_loss=0.0759, over 4716084.86 frames. ], batch size: 119, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:34:33,317 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 05:34:36,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 05:34:36,726 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=264893.3333333333, ans=0.1 2023-09-29 05:34:38,594 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.46 vs. limit=12.0 2023-09-29 05:34:39,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:34:41,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:34:42,872 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:34:44,487 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:34:47,743 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 05:34:47,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 05:34:51,534 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 05:34:53,064 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 05:34:54,678 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:34:57,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:34:57,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 05:34:57,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 05:34:57,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:34:59,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:35:01,980 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:35:02,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 05:35:02,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 05:35:02,088 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:35:02,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 05:35:08,499 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=265026.6666666667, ans=0.0 2023-09-29 05:35:10,219 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=265026.6666666667, ans=0.1 2023-09-29 05:35:16,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:35:20,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:35:20,796 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:35:20,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:35:22,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 05:35:29,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:35:32,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 05:35:32,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:35:32,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:35:33,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 05:35:33,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 05:35:35,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:35:36,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:35:40,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:35:40,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 05:35:40,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:35:41,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:35:42,420 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.55 vs. limit=15.0 2023-09-29 05:35:43,124 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:35:45,163 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.988e+02 2.217e+02 2.595e+02 3.453e+02, threshold=4.435e+02, percent-clipped=0.0 2023-09-29 05:35:45,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 05:35:46,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:35:54,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:35:54,914 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.63 vs. limit=10.0 2023-09-29 05:35:55,493 INFO [train.py:1039] (2/4) Epoch 8, batch 2600, loss[loss=0.2053, simple_loss=0.288, pruned_loss=0.0613, over 24304.00 frames. ], tot_loss[loss=0.2182, simple_loss=0.2848, pruned_loss=0.07578, over 4712839.20 frames. ], batch size: 74, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:35:57,704 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:36:00,684 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 05:36:02,308 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 05:36:02,346 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:36:03,787 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 05:36:03,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 05:36:03,937 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 05:36:06,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:36:07,010 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 05:36:08,501 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 05:36:08,648 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=265226.6666666667, ans=0.1 2023-09-29 05:36:09,950 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 05:36:12,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:36:14,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 05:36:16,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 05:36:17,510 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 05:36:17,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 05:36:21,526 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 05:36:21,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 05:36:29,177 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.31 vs. limit=15.0 2023-09-29 05:36:31,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:36:31,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:36:31,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:36:31,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 05:36:33,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:36:39,668 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 05:36:44,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:36:44,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:36:45,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 05:36:45,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:36:45,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:36:47,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 05:36:50,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 05:36:50,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:36:54,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:36:55,935 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 05:36:55,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:36:56,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 05:37:03,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:37:04,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:37:04,702 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 05:37:06,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:37:08,281 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:37:09,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:37:15,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 05:37:16,987 INFO [train.py:1039] (2/4) Epoch 8, batch 2650, loss[loss=0.2064, simple_loss=0.2817, pruned_loss=0.06558, over 24651.00 frames. ], tot_loss[loss=0.2189, simple_loss=0.2854, pruned_loss=0.07625, over 4716551.95 frames. ], batch size: 65, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:37:17,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:37:18,728 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 05:37:22,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 05:37:23,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:37:25,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 05:37:26,473 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 05:37:26,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:37:28,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:37:31,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 05:37:33,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:37:35,059 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:37:36,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 05:37:36,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 05:37:36,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:37:41,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 05:37:41,721 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 05:37:45,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:37:48,435 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 05:37:48,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:37:48,564 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 05:37:51,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:37:53,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 05:37:53,228 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:37:54,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:37:57,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 05:37:57,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 05:38:01,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 05:38:05,249 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 05:38:05,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:38:06,706 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:38:06,759 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 05:38:06,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:38:08,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:38:10,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:38:10,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:38:11,985 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:38:13,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:38:14,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:38:17,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:38:19,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:38:19,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:38:21,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:38:21,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 05:38:26,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:38:28,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:38:28,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:38:29,357 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 2.037e+02 2.259e+02 2.622e+02 3.986e+02, threshold=4.518e+02, percent-clipped=0.0 2023-09-29 05:38:29,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 05:38:29,758 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=265826.6666666667, ans=0.0 2023-09-29 05:38:33,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:38:34,118 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:38:34,464 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=265826.6666666667, ans=0.125 2023-09-29 05:38:35,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:38:35,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:38:37,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 05:38:39,334 INFO [train.py:1039] (2/4) Epoch 8, batch 2700, loss[loss=0.2285, simple_loss=0.2894, pruned_loss=0.08384, over 23755.00 frames. ], tot_loss[loss=0.2206, simple_loss=0.2867, pruned_loss=0.07722, over 4716525.37 frames. ], batch size: 164, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:38:39,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:38:41,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:38:41,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 05:38:45,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:38:46,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 05:38:48,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:38:48,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:38:48,639 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:38:50,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:38:50,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:38:51,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:38:51,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 05:38:51,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 05:38:51,712 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:38:54,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 05:38:54,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 05:38:56,360 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:39:01,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 05:39:01,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 05:39:03,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:39:09,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:39:09,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:39:15,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 05:39:15,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:39:15,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:39:15,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 05:39:20,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:39:23,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:39:23,466 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 05:39:23,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:39:29,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:39:29,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:39:36,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:39:38,342 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:39:38,715 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=266093.3333333333, ans=0.0 2023-09-29 05:39:41,240 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:39:41,243 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:39:44,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:39:45,805 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.28 vs. limit=15.0 2023-09-29 05:39:46,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:39:46,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:39:48,217 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:39:48,999 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:39:50,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:39:52,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:39:53,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:39:53,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:39:56,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 05:39:56,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:39:59,820 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:39:59,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 05:40:01,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 05:40:02,918 INFO [train.py:1039] (2/4) Epoch 8, batch 2750, loss[loss=0.1787, simple_loss=0.2542, pruned_loss=0.05153, over 21220.00 frames. ], tot_loss[loss=0.221, simple_loss=0.2866, pruned_loss=0.07776, over 4702717.60 frames. ], batch size: 46, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:40:03,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:40:04,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:40:04,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:40:08,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:40:08,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 05:40:08,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:40:12,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:40:13,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 05:40:13,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:40:13,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:40:13,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 05:40:13,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:40:13,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:40:16,004 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=10.65 vs. limit=12.0 2023-09-29 05:40:22,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 05:40:24,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:40:24,693 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=266293.3333333333, ans=0.0 2023-09-29 05:40:25,733 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:40:25,839 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:40:25,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 05:40:27,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:40:28,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:40:29,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:40:30,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:40:33,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 05:40:33,839 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=266293.3333333333, ans=0.125 2023-09-29 05:40:35,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 05:40:36,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:40:36,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:40:36,976 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=266360.0, ans=0.125 2023-09-29 05:40:38,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 05:40:41,907 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=266360.0, ans=0.2 2023-09-29 05:40:45,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:40:48,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 05:40:49,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:40:51,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:40:51,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:40:53,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 05:40:59,023 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 05:41:00,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:41:00,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 05:41:04,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:41:06,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 05:41:06,768 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=266426.6666666667, ans=0.0 2023-09-29 05:41:09,916 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=266493.3333333333, ans=0.125 2023-09-29 05:41:11,338 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 05:41:13,008 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:41:14,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 05:41:14,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:41:15,922 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 2.077e+02 2.428e+02 2.871e+02 4.392e+02, threshold=4.857e+02, percent-clipped=0.0 2023-09-29 05:41:17,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:41:17,686 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 05:41:17,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:41:21,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 05:41:21,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:41:21,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:41:22,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 05:41:22,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:41:23,688 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:41:25,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:41:25,237 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 05:41:25,238 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 05:41:26,533 INFO [train.py:1039] (2/4) Epoch 8, batch 2800, loss[loss=0.2094, simple_loss=0.2646, pruned_loss=0.07712, over 23598.00 frames. ], tot_loss[loss=0.2198, simple_loss=0.2847, pruned_loss=0.07747, over 4695097.23 frames. ], batch size: 256, lr: 1.29e-02, grad_scale: 16.0 2023-09-29 05:41:31,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:41:33,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 05:41:33,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:41:34,139 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.53 vs. limit=15.0 2023-09-29 05:41:37,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:41:39,274 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 05:41:41,177 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.max_abs, batch_count=266626.6666666667, ans=10.0 2023-09-29 05:41:42,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 05:41:42,639 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=266626.6666666667, ans=0.2 2023-09-29 05:41:43,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 05:41:45,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:41:46,848 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:41:46,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:41:50,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:41:50,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:41:50,174 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 05:41:55,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:42:04,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:42:06,802 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.44 vs. limit=22.5 2023-09-29 05:42:07,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:42:09,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:42:11,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:42:11,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:42:16,434 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=266693.3333333333, ans=0.125 2023-09-29 05:42:17,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:42:17,752 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 05:42:17,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:42:19,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:42:19,389 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:42:20,234 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.02 vs. limit=15.0 2023-09-29 05:42:23,949 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:42:25,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:42:27,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:42:30,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:42:30,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:42:30,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 05:42:32,944 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 05:42:33,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 05:42:34,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:42:34,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 05:42:34,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:42:36,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:42:36,553 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:42:38,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 05:42:38,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:42:38,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:42:39,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:42:41,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 05:42:47,017 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=266826.6666666667, ans=0.125 2023-09-29 05:42:47,062 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=266826.6666666667, ans=0.125 2023-09-29 05:42:48,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:42:48,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 05:42:49,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:42:51,331 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:42:52,724 INFO [train.py:1039] (2/4) Epoch 8, batch 2850, loss[loss=0.1741, simple_loss=0.2508, pruned_loss=0.04871, over 24584.00 frames. ], tot_loss[loss=0.2179, simple_loss=0.2832, pruned_loss=0.07636, over 4683298.12 frames. ], batch size: 60, lr: 1.29e-02, grad_scale: 16.0 2023-09-29 05:42:56,041 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=266893.3333333333, ans=0.125 2023-09-29 05:42:57,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:42:57,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:42:57,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:42:58,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:43:00,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:43:02,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 05:43:04,082 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 05:43:11,792 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 05:43:11,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:43:13,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 05:43:13,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:43:13,657 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=266960.0, ans=0.125 2023-09-29 05:43:16,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 05:43:16,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 05:43:20,059 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:43:32,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:43:33,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:43:35,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:43:35,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 05:43:36,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 05:43:37,026 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 05:43:40,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 05:43:40,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 05:43:42,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:43:42,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:43:43,011 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=267093.3333333333, ans=0.0 2023-09-29 05:43:44,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:43:44,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:43:46,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:43:46,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:43:48,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:43:49,720 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:43:52,687 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:43:52,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:43:54,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:43:57,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:44:02,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:44:04,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 05:44:04,172 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 05:44:05,428 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 2.097e+02 2.299e+02 2.689e+02 7.485e+02, threshold=4.599e+02, percent-clipped=2.0 2023-09-29 05:44:07,112 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 05:44:07,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:44:07,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 05:44:08,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:44:10,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:44:10,178 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:44:10,231 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:44:10,232 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 05:44:10,558 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=267160.0, ans=0.1 2023-09-29 05:44:11,581 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 05:44:11,587 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:44:13,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:44:15,346 INFO [train.py:1039] (2/4) Epoch 8, batch 2900, loss[loss=0.2343, simple_loss=0.2894, pruned_loss=0.08962, over 22819.00 frames. ], tot_loss[loss=0.2175, simple_loss=0.2834, pruned_loss=0.07578, over 4698412.11 frames. ], batch size: 322, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:44:15,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 05:44:15,754 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=267226.6666666667, ans=0.125 2023-09-29 05:44:17,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:44:17,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:44:19,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 05:44:23,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:44:23,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 05:44:24,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 05:44:25,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:44:25,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 05:44:26,542 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.95 vs. limit=10.0 2023-09-29 05:44:27,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:44:29,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:44:33,584 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:44:33,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:44:37,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 05:44:38,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 05:44:38,869 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.68 vs. limit=12.0 2023-09-29 05:44:39,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 05:44:41,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:44:42,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 05:44:44,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 05:44:45,828 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:44:45,832 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 05:44:45,858 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:44:49,217 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:44:49,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 05:44:52,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:44:56,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:44:56,838 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=267360.0, ans=0.125 2023-09-29 05:44:58,335 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=267360.0, ans=0.125 2023-09-29 05:44:59,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:45:03,035 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:45:04,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 05:45:04,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 05:45:04,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:45:09,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:45:12,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 05:45:13,975 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:45:20,043 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:45:28,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:45:28,683 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 05:45:29,109 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=267493.3333333333, ans=0.125 2023-09-29 05:45:30,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 05:45:33,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:45:33,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 05:45:34,777 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:45:34,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 05:45:36,142 INFO [train.py:1039] (2/4) Epoch 8, batch 2950, loss[loss=0.2094, simple_loss=0.2872, pruned_loss=0.0658, over 24433.00 frames. ], tot_loss[loss=0.2189, simple_loss=0.2853, pruned_loss=0.07624, over 4709823.35 frames. ], batch size: 63, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:45:41,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:45:43,435 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 05:45:43,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:45:44,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:45:45,177 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=267560.0, ans=0.125 2023-09-29 05:45:46,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:45:48,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:45:49,641 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 05:45:51,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 05:45:51,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 05:45:51,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:45:54,791 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=267626.6666666667, ans=0.125 2023-09-29 05:45:56,236 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:45:59,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:46:01,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:46:01,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:46:04,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:46:04,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:46:06,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:46:06,707 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=267626.6666666667, ans=0.125 2023-09-29 05:46:08,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:46:08,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:46:09,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 05:46:16,753 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 05:46:18,104 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 05:46:19,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:46:21,168 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 05:46:22,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 05:46:22,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:46:24,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:46:24,265 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 05:46:24,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 05:46:27,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 05:46:28,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:46:30,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:46:33,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:46:35,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:46:35,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:46:35,134 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 05:46:36,555 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:46:36,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 05:46:43,942 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:46:45,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:46:45,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 05:46:45,592 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:46:47,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 05:46:50,565 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.933e+02 2.152e+02 2.474e+02 4.181e+02, threshold=4.303e+02, percent-clipped=1.0 2023-09-29 05:46:50,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:46:51,059 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=267826.6666666667, ans=0.125 2023-09-29 05:46:52,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:46:52,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:46:55,275 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:46:55,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 05:46:55,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:46:57,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:46:57,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 05:46:57,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 05:46:57,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:46:58,567 INFO [train.py:1039] (2/4) Epoch 8, batch 3000, loss[loss=0.1837, simple_loss=0.2506, pruned_loss=0.0584, over 21060.00 frames. ], tot_loss[loss=0.219, simple_loss=0.2854, pruned_loss=0.07628, over 4716196.73 frames. ], batch size: 46, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:46:58,567 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-29 05:47:12,756 INFO [train.py:1071] (2/4) Epoch 8, validation: loss=0.3012, simple_loss=0.2865, pruned_loss=0.1579, over 1125622.00 frames. 2023-09-29 05:47:12,757 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21046MB 2023-09-29 05:47:14,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:47:15,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:47:15,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 05:47:16,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:47:18,248 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:47:20,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 05:47:23,419 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 05:47:23,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 05:47:25,102 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:47:26,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 05:47:26,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 05:47:26,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:47:30,050 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=267960.0, ans=0.0 2023-09-29 05:47:32,865 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 05:47:45,879 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:47:51,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 05:47:52,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 05:47:56,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 05:47:57,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:47:57,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:47:59,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:47:59,465 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 05:48:01,243 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 05:48:04,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:48:04,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 05:48:05,941 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:48:05,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:48:07,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:48:07,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:48:11,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 05:48:12,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:48:12,021 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:48:13,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:48:16,642 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 05:48:16,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:48:17,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:48:19,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:48:22,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:48:23,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:48:25,163 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 05:48:25,217 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 05:48:25,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:48:26,689 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 05:48:26,770 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:48:31,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 05:48:33,095 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=268160.0, ans=0.125 2023-09-29 05:48:34,114 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 05:48:34,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 05:48:34,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 05:48:35,785 INFO [train.py:1039] (2/4) Epoch 8, batch 3050, loss[loss=0.1995, simple_loss=0.2736, pruned_loss=0.06273, over 24451.00 frames. ], tot_loss[loss=0.22, simple_loss=0.2861, pruned_loss=0.07698, over 4711639.66 frames. ], batch size: 66, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:48:37,348 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 05:48:37,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 05:48:37,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:48:38,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:48:38,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 05:48:38,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:48:38,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:48:39,354 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=268226.6666666667, ans=0.2 2023-09-29 05:48:42,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 05:48:44,911 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:48:47,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:48:47,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:48:51,040 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:48:55,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 05:49:03,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 05:49:03,848 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 05:49:03,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:49:06,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 05:49:10,155 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:49:10,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:49:10,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:49:11,286 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.97 vs. limit=22.5 2023-09-29 05:49:11,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:49:13,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 05:49:13,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:49:14,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:49:14,741 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:49:14,879 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:49:17,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:49:18,116 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=268360.0, ans=0.125 2023-09-29 05:49:20,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:49:20,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 05:49:22,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:49:22,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 05:49:27,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:49:27,440 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 05:49:27,526 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:49:29,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:49:34,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:49:35,094 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=268426.6666666667, ans=0.0 2023-09-29 05:49:36,530 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:49:41,578 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=268493.3333333333, ans=0.1 2023-09-29 05:49:42,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:49:42,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:49:42,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:49:44,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:49:44,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 05:49:44,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:49:45,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 05:49:47,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:49:48,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:49:50,195 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 2.057e+02 2.311e+02 2.662e+02 5.288e+02, threshold=4.621e+02, percent-clipped=1.0 2023-09-29 05:49:50,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 05:49:51,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:49:56,335 INFO [train.py:1039] (2/4) Epoch 8, batch 3100, loss[loss=0.1826, simple_loss=0.2545, pruned_loss=0.05539, over 24333.00 frames. ], tot_loss[loss=0.2192, simple_loss=0.2857, pruned_loss=0.07633, over 4723369.59 frames. ], batch size: 61, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:49:57,947 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:49:58,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:49:59,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 05:50:02,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 05:50:05,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 05:50:06,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 05:50:08,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 05:50:13,432 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:50:13,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:50:15,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 05:50:18,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:50:24,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 05:50:30,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 05:50:30,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:50:31,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:50:31,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:50:33,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 05:50:33,776 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 05:50:35,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:50:35,587 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 05:50:35,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:50:37,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:50:39,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 05:50:41,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:50:44,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:50:44,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 05:50:44,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=268760.0, ans=0.125 2023-09-29 05:50:46,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 05:50:47,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:50:47,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:50:49,516 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:50:49,535 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:50:49,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:50:51,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 05:50:51,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:50:54,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:50:54,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:50:54,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:50:54,289 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 05:50:57,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:50:59,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 05:51:00,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:51:02,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 05:51:02,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:51:02,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:51:04,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 05:51:16,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 05:51:19,612 INFO [train.py:1039] (2/4) Epoch 8, batch 3150, loss[loss=0.1978, simple_loss=0.2515, pruned_loss=0.07202, over 23411.00 frames. ], tot_loss[loss=0.2173, simple_loss=0.2829, pruned_loss=0.07578, over 4701547.25 frames. ], batch size: 285, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:51:21,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:51:21,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:51:21,592 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=268893.3333333333, ans=0.2 2023-09-29 05:51:24,272 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:51:24,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:51:25,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 05:51:27,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:51:27,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 05:51:28,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 05:51:30,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:51:30,634 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=268893.3333333333, ans=0.0 2023-09-29 05:51:32,088 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 05:51:35,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 05:51:35,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:51:36,695 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 05:51:38,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 05:51:38,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 05:51:40,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 05:51:40,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 05:51:40,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:51:40,390 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:51:41,920 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:51:43,478 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 05:51:47,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:51:47,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:51:47,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:51:51,045 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 05:51:53,033 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=269026.6666666667, ans=0.09899494936611666 2023-09-29 05:51:54,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 05:51:54,345 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:51:55,972 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 05:51:57,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:51:57,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 05:51:59,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 05:52:00,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:52:02,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 05:52:02,235 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 05:52:02,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:52:02,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 05:52:03,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 05:52:03,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 05:52:05,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 05:52:05,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 05:52:05,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:52:07,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:52:07,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:52:09,157 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 05:52:09,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:52:12,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 05:52:12,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:52:13,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 05:52:15,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 05:52:15,378 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:52:15,580 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=269093.3333333333, ans=0.125 2023-09-29 05:52:17,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:52:17,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 05:52:19,718 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 05:52:21,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:52:23,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:52:25,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:52:26,537 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:52:31,230 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:52:31,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:52:34,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 05:52:35,511 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.692e+02 2.026e+02 2.271e+02 2.794e+02 4.211e+02, threshold=4.543e+02, percent-clipped=0.0 2023-09-29 05:52:38,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:52:38,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 05:52:42,183 INFO [train.py:1039] (2/4) Epoch 8, batch 3200, loss[loss=0.1888, simple_loss=0.2552, pruned_loss=0.06126, over 23722.00 frames. ], tot_loss[loss=0.2163, simple_loss=0.2817, pruned_loss=0.07551, over 4686900.67 frames. ], batch size: 232, lr: 1.29e-02, grad_scale: 16.0 2023-09-29 05:52:43,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:52:46,596 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:52:46,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 05:52:50,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:52:57,059 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 05:52:58,831 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=269293.3333333333, ans=0.0 2023-09-29 05:53:02,158 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:53:05,605 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=269293.3333333333, ans=0.0 2023-09-29 05:53:07,188 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=269293.3333333333, ans=0.2 2023-09-29 05:53:08,920 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=269293.3333333333, ans=0.5 2023-09-29 05:53:11,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 05:53:13,974 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.21 vs. limit=15.0 2023-09-29 05:53:22,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 05:53:24,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:53:28,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 05:53:29,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 05:53:31,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:53:31,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 05:53:32,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:53:38,377 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 05:53:39,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 05:53:43,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 05:53:46,198 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 05:53:47,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 05:53:52,540 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:53:53,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:53:53,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:53:54,026 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 05:53:54,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 05:53:54,304 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=269493.3333333333, ans=0.125 2023-09-29 05:53:57,938 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=269493.3333333333, ans=0.1 2023-09-29 05:53:59,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:54:00,716 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 05:54:00,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 05:54:02,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 05:54:03,733 INFO [train.py:1039] (2/4) Epoch 8, batch 3250, loss[loss=0.2246, simple_loss=0.3053, pruned_loss=0.07194, over 24686.00 frames. ], tot_loss[loss=0.216, simple_loss=0.2817, pruned_loss=0.07512, over 4692633.77 frames. ], batch size: 73, lr: 1.29e-02, grad_scale: 16.0 2023-09-29 05:54:03,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 05:54:05,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:54:09,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 05:54:11,018 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 05:54:11,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:54:11,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:54:12,742 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 05:54:17,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:54:19,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:54:26,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:54:26,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 05:54:28,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:54:28,421 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:54:28,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:54:28,765 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=269626.6666666667, ans=0.0 2023-09-29 05:54:29,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:54:30,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 05:54:33,219 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 05:54:34,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:54:34,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:54:34,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:54:34,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:54:34,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:54:35,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:54:38,060 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.20 vs. limit=22.5 2023-09-29 05:54:38,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:54:39,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:54:42,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:54:42,689 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:54:44,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:54:45,036 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:54:45,051 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:54:49,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 05:54:49,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:54:49,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:54:52,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:54:52,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 05:55:00,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:55:08,398 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:55:09,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:55:09,840 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 05:55:09,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:55:09,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 05:55:09,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:55:13,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 05:55:13,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 05:55:13,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:55:14,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:55:16,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:55:16,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 05:55:18,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:55:19,811 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.498e+02 1.998e+02 2.232e+02 2.545e+02 3.931e+02, threshold=4.463e+02, percent-clipped=0.0 2023-09-29 05:55:21,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:55:23,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:55:24,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 05:55:24,563 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:55:25,843 INFO [train.py:1039] (2/4) Epoch 8, batch 3300, loss[loss=0.2332, simple_loss=0.2899, pruned_loss=0.08825, over 23394.00 frames. ], tot_loss[loss=0.217, simple_loss=0.2829, pruned_loss=0.07555, over 4699055.69 frames. ], batch size: 285, lr: 1.28e-02, grad_scale: 16.0 2023-09-29 05:55:27,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:55:27,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 05:55:30,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:55:30,617 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 05:55:32,281 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 05:55:33,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 05:55:33,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:55:38,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:55:40,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:55:40,670 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:55:43,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 05:55:43,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 05:55:45,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:55:45,621 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=269960.0, ans=0.0 2023-09-29 05:55:46,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:55:50,207 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 05:55:50,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:55:51,683 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:55:51,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:55:53,853 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 05:55:55,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:55:55,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 05:55:56,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 05:55:56,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:55:57,030 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 05:56:01,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:56:01,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 05:56:04,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:56:04,792 WARNING [train.py:1197] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 05:56:06,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 05:56:06,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:56:07,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:56:09,556 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 05:56:11,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 05:56:11,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:56:15,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 05:56:18,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:56:19,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:56:21,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:56:23,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:56:23,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:56:23,182 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:56:23,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:56:26,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:56:26,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:56:26,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:56:28,437 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 05:56:29,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 05:56:31,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 05:56:32,898 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:56:32,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:56:34,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:56:34,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:56:36,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 05:56:37,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:56:37,501 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 05:56:37,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:56:39,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 05:56:43,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 05:56:44,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:56:45,658 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:56:46,924 INFO [train.py:1039] (2/4) Epoch 8, batch 3350, loss[loss=0.2225, simple_loss=0.2843, pruned_loss=0.08036, over 23504.00 frames. ], tot_loss[loss=0.2186, simple_loss=0.2845, pruned_loss=0.07635, over 4704935.65 frames. ], batch size: 106, lr: 1.28e-02, grad_scale: 16.0 2023-09-29 05:56:47,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:56:47,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:56:49,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:56:52,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:56:52,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:56:55,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:56:57,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:56:58,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:56:59,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:57:01,700 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=270226.6666666667, ans=0.125 2023-09-29 05:57:02,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:57:04,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:57:04,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:57:05,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 05:57:08,663 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 05:57:08,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:57:11,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 05:57:11,855 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 05:57:14,777 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:57:14,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:57:16,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:57:16,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 05:57:16,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:57:16,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:57:18,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:57:21,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:57:21,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:57:23,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:57:25,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:57:28,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:57:28,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:57:33,580 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=270360.0, ans=0.125 2023-09-29 05:57:34,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:57:36,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:57:37,857 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:57:37,872 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:57:39,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:57:43,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 05:57:43,029 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 05:57:43,068 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 05:57:43,124 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:57:45,981 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 05:57:46,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:57:46,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:57:52,044 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=270493.3333333333, ans=0.125 2023-09-29 05:57:54,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:57:56,647 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 05:57:56,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 05:57:58,243 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:57:59,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:58:02,774 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 2.038e+02 2.381e+02 2.842e+02 4.419e+02, threshold=4.763e+02, percent-clipped=0.0 2023-09-29 05:58:04,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:58:06,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 05:58:06,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 05:58:08,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 05:58:08,720 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=270560.0, ans=0.0 2023-09-29 05:58:09,746 INFO [train.py:1039] (2/4) Epoch 8, batch 3400, loss[loss=0.3265, simple_loss=0.3575, pruned_loss=0.1478, over 19377.00 frames. ], tot_loss[loss=0.2192, simple_loss=0.2853, pruned_loss=0.07652, over 4707092.99 frames. ], batch size: 389, lr: 1.28e-02, grad_scale: 16.0 2023-09-29 05:58:09,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:58:09,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 05:58:11,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:58:12,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 05:58:12,287 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=270560.0, ans=0.05 2023-09-29 05:58:13,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:58:14,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:58:14,932 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 05:58:15,204 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=270560.0, ans=0.2 2023-09-29 05:58:16,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:58:16,550 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 05:58:21,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 05:58:22,515 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 05:58:22,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:58:27,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:58:27,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 05:58:29,915 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:58:30,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:58:37,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:58:38,054 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=270626.6666666667, ans=0.0 2023-09-29 05:58:39,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 05:58:44,706 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 05:58:46,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:58:46,329 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:58:48,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 05:58:48,693 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=270693.3333333333, ans=0.125 2023-09-29 05:58:54,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:58:56,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 05:58:59,626 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=270760.0, ans=0.0 2023-09-29 05:59:01,128 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=270760.0, ans=0.125 2023-09-29 05:59:03,373 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.40 vs. limit=6.0 2023-09-29 05:59:03,845 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:59:05,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:59:05,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 05:59:07,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:59:07,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:59:09,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:59:09,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:59:09,593 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=270760.0, ans=0.0 2023-09-29 05:59:11,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:59:11,268 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=270760.0, ans=0.1 2023-09-29 05:59:11,295 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=270760.0, ans=0.1 2023-09-29 05:59:11,726 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.75 vs. limit=15.0 2023-09-29 05:59:14,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 05:59:14,338 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:59:20,257 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:59:21,803 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 05:59:24,122 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.76 vs. limit=15.0 2023-09-29 05:59:25,134 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=270826.6666666667, ans=0.035 2023-09-29 05:59:26,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 05:59:31,350 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=11.29 vs. limit=15.0 2023-09-29 05:59:31,907 INFO [train.py:1039] (2/4) Epoch 8, batch 3450, loss[loss=0.2256, simple_loss=0.2902, pruned_loss=0.08047, over 23227.00 frames. ], tot_loss[loss=0.2196, simple_loss=0.2859, pruned_loss=0.07665, over 4701737.59 frames. ], batch size: 119, lr: 1.28e-02, grad_scale: 16.0 2023-09-29 05:59:32,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 05:59:38,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 05:59:38,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:59:38,835 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=270893.3333333333, ans=0.2 2023-09-29 05:59:40,812 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:59:40,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 05:59:42,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:59:45,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 05:59:49,618 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=270960.0, ans=0.2 2023-09-29 05:59:50,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:59:52,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:59:53,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 05:59:53,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:59:55,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:00:02,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 06:00:02,402 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=270960.0, ans=0.125 2023-09-29 06:00:06,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 06:00:06,779 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 06:00:08,261 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:00:09,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:00:15,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 06:00:15,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 06:00:20,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:00:20,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:00:23,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 06:00:25,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:00:26,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 06:00:26,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:00:28,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:00:31,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:00:33,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 06:00:36,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:00:41,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:00:43,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:00:44,480 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.83 vs. limit=15.0 2023-09-29 06:00:46,587 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:00:47,986 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 2.051e+02 2.293e+02 2.662e+02 4.290e+02, threshold=4.586e+02, percent-clipped=0.0 2023-09-29 06:00:51,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:00:51,241 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:00:52,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:00:52,867 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:00:53,275 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:00:54,252 INFO [train.py:1039] (2/4) Epoch 8, batch 3500, loss[loss=0.2108, simple_loss=0.276, pruned_loss=0.07282, over 23315.00 frames. ], tot_loss[loss=0.2182, simple_loss=0.2844, pruned_loss=0.076, over 4716916.50 frames. ], batch size: 106, lr: 1.28e-02, grad_scale: 16.0 2023-09-29 06:00:56,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:00:59,643 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:01:01,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 06:01:03,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 06:01:04,941 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=271226.6666666667, ans=0.125 2023-09-29 06:01:06,164 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 06:01:09,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:01:09,197 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 06:01:14,114 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:01:16,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:01:16,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:01:17,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:01:17,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 06:01:17,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:01:18,125 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=271293.3333333333, ans=0.0 2023-09-29 06:01:19,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:01:19,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 06:01:22,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:01:22,624 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 06:01:24,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:01:28,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:01:28,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 06:01:29,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:01:32,662 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:01:34,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:01:36,188 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:01:39,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:01:39,117 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:01:40,759 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 06:01:42,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 06:01:43,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 06:01:43,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:01:46,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:01:46,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:01:46,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 06:01:47,005 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=271426.6666666667, ans=0.125 2023-09-29 06:01:50,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 06:01:51,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:01:57,167 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:01:57,422 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=271426.6666666667, ans=0.2 2023-09-29 06:01:58,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 06:01:58,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 06:01:58,712 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 06:02:03,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:02:05,160 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:02:06,773 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:02:08,432 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 06:02:08,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:02:10,151 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:02:10,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=271493.3333333333, ans=0.04949747468305833 2023-09-29 06:02:11,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 06:02:13,712 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 06:02:16,402 INFO [train.py:1039] (2/4) Epoch 8, batch 3550, loss[loss=0.2318, simple_loss=0.2961, pruned_loss=0.08379, over 23847.00 frames. ], tot_loss[loss=0.2171, simple_loss=0.2832, pruned_loss=0.07553, over 4706720.31 frames. ], batch size: 195, lr: 1.28e-02, grad_scale: 16.0 2023-09-29 06:02:16,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:02:18,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:02:19,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:02:19,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:02:22,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:02:31,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:02:32,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 06:02:33,596 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.13 vs. limit=22.5 2023-09-29 06:02:36,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:02:37,490 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=13.31 vs. limit=22.5 2023-09-29 06:02:37,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:02:39,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:02:41,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:02:41,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:02:44,187 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 06:02:45,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:02:45,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:02:47,593 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 06:02:47,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 06:02:48,538 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.23 vs. limit=6.0 2023-09-29 06:02:51,449 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=271693.3333333333, ans=0.07 2023-09-29 06:02:52,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:02:52,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 06:02:54,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:02:54,147 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:02:55,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:02:55,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 06:02:55,620 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:02:57,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:02:57,418 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=271693.3333333333, ans=0.0 2023-09-29 06:02:58,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 06:02:59,013 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=271693.3333333333, ans=0.0 2023-09-29 06:03:04,229 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=271760.0, ans=0.0 2023-09-29 06:03:05,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:03:07,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:03:08,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:03:08,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 06:03:10,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 06:03:12,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 06:03:12,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:03:15,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:03:15,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:03:19,966 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 06:03:20,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:03:24,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:03:25,774 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 06:03:25,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:03:30,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:03:31,728 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 2.035e+02 2.264e+02 2.526e+02 3.446e+02, threshold=4.528e+02, percent-clipped=0.0 2023-09-29 06:03:31,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 06:03:36,564 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=271826.6666666667, ans=0.1 2023-09-29 06:03:39,313 INFO [train.py:1039] (2/4) Epoch 8, batch 3600, loss[loss=0.206, simple_loss=0.2743, pruned_loss=0.06885, over 23584.00 frames. ], tot_loss[loss=0.2171, simple_loss=0.2837, pruned_loss=0.07529, over 4715313.68 frames. ], batch size: 149, lr: 1.28e-02, grad_scale: 32.0 2023-09-29 06:03:40,988 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 06:03:41,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:03:41,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:03:44,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:03:44,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:03:46,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:03:51,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:03:52,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:03:54,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:03:54,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:03:55,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:03:55,665 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 06:04:00,781 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 06:04:02,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:04:04,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:04:06,019 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:04:07,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:04:07,612 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:04:07,643 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 06:04:09,666 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:04:11,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:04:13,329 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:04:16,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:04:17,725 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:04:19,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:04:20,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 06:04:27,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:04:29,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 06:04:29,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 06:04:34,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:04:41,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:04:42,680 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:04:47,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 06:04:47,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:04:49,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 06:04:49,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 06:04:52,483 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 06:04:54,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:04:54,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:04:55,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 06:04:57,535 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:04:57,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:04:58,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:04:59,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 06:04:59,364 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=272160.0, ans=0.2 2023-09-29 06:05:00,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 06:05:01,996 INFO [train.py:1039] (2/4) Epoch 8, batch 3650, loss[loss=0.2271, simple_loss=0.2816, pruned_loss=0.08631, over 23743.00 frames. ], tot_loss[loss=0.2166, simple_loss=0.2836, pruned_loss=0.0748, over 4727159.40 frames. ], batch size: 232, lr: 1.28e-02, grad_scale: 32.0 2023-09-29 06:05:03,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:05:03,753 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 06:05:07,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 06:05:09,225 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:05:12,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 06:05:14,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 06:05:19,438 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:05:19,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 06:05:19,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:05:24,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 06:05:25,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:05:26,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 06:05:28,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:05:28,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:05:30,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 06:05:31,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 06:05:32,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:05:32,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:05:35,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:05:35,568 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=272360.0, ans=0.0 2023-09-29 06:05:36,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 06:05:38,406 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 06:05:39,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:05:41,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 06:05:43,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:05:43,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:05:48,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 06:05:50,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:05:50,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 06:05:50,555 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=272426.6666666667, ans=0.125 2023-09-29 06:05:51,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 06:05:53,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:05:55,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:05:57,232 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:05:58,021 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.70 vs. limit=6.0 2023-09-29 06:05:58,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:05:58,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:06:00,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 06:06:03,205 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:06:03,299 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:06:11,359 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 06:06:14,366 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:06:16,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:06:16,494 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 06:06:16,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:06:17,952 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.672e+02 2.110e+02 2.350e+02 2.595e+02 3.564e+02, threshold=4.700e+02, percent-clipped=0.0 2023-09-29 06:06:18,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 06:06:18,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:06:18,482 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=272493.3333333333, ans=0.1 2023-09-29 06:06:21,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 06:06:22,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:06:23,856 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 06:06:25,106 INFO [train.py:1039] (2/4) Epoch 8, batch 3700, loss[loss=0.2625, simple_loss=0.3107, pruned_loss=0.1071, over 23788.00 frames. ], tot_loss[loss=0.2171, simple_loss=0.2842, pruned_loss=0.07494, over 4736616.52 frames. ], batch size: 149, lr: 1.28e-02, grad_scale: 32.0 2023-09-29 06:06:26,712 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:06:28,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:06:30,498 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:06:30,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 06:06:30,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:06:31,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 06:06:32,009 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 06:06:36,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 06:06:40,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:06:41,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:06:41,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:06:41,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:06:43,499 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 06:06:46,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:06:48,225 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 06:06:57,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:06:57,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 06:06:58,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:06:58,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 06:06:58,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:07:01,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:07:03,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 06:07:05,448 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:07:07,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:07:10,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:07:10,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 06:07:10,421 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=272693.3333333333, ans=0.1 2023-09-29 06:07:13,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 06:07:18,309 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:07:18,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 06:07:18,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:07:19,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 06:07:24,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:07:24,699 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=272760.0, ans=0.2 2023-09-29 06:07:25,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:07:29,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:07:29,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 06:07:32,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:07:32,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 06:07:32,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:07:32,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:07:37,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:07:39,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 06:07:39,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 06:07:41,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:07:41,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:07:42,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:07:42,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:07:46,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:07:47,551 INFO [train.py:1039] (2/4) Epoch 8, batch 3750, loss[loss=0.227, simple_loss=0.2825, pruned_loss=0.08573, over 23742.00 frames. ], tot_loss[loss=0.2184, simple_loss=0.2855, pruned_loss=0.07561, over 4741493.27 frames. ], batch size: 212, lr: 1.28e-02, grad_scale: 32.0 2023-09-29 06:07:47,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:07:49,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:07:52,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 06:07:54,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 06:07:57,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 06:07:57,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 06:07:58,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:08:00,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:08:00,612 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=272893.3333333333, ans=10.0 2023-09-29 06:08:01,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:08:05,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:08:07,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:08:10,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:08:12,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 06:08:16,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:08:18,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:08:20,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 06:08:20,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:08:20,762 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=273026.6666666667, ans=0.035 2023-09-29 06:08:21,525 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.41 vs. limit=6.0 2023-09-29 06:08:22,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:08:22,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:08:24,400 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.02 vs. limit=22.5 2023-09-29 06:08:25,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 06:08:26,176 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.42 vs. limit=15.0 2023-09-29 06:08:30,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 06:08:31,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:08:31,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:08:33,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:08:37,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:08:41,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 06:08:44,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 06:08:46,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:08:49,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:08:51,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:08:54,288 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 06:08:56,037 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=273160.0, ans=0.0 2023-09-29 06:08:57,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 06:08:59,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 06:09:01,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:09:02,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:09:03,261 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=273160.0, ans=0.2 2023-09-29 06:09:04,243 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.657e+02 2.271e+02 2.610e+02 3.277e+02 5.264e+02, threshold=5.220e+02, percent-clipped=1.0 2023-09-29 06:09:05,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 06:09:06,151 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=273160.0, ans=0.0 2023-09-29 06:09:09,541 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=273226.6666666667, ans=0.125 2023-09-29 06:09:11,184 INFO [train.py:1039] (2/4) Epoch 8, batch 3800, loss[loss=0.2247, simple_loss=0.3038, pruned_loss=0.07279, over 24450.00 frames. ], tot_loss[loss=0.2181, simple_loss=0.2852, pruned_loss=0.07555, over 4731141.12 frames. ], batch size: 69, lr: 1.28e-02, grad_scale: 32.0 2023-09-29 06:09:11,690 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=273226.6666666667, ans=0.1 2023-09-29 06:09:13,490 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=273226.6666666667, ans=0.0 2023-09-29 06:09:16,363 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:09:20,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:09:22,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 06:09:23,232 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 06:09:24,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:09:27,594 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:09:27,722 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 06:09:30,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 06:09:30,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:09:30,823 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:09:32,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:09:32,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:09:32,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:09:34,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 06:09:39,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 06:09:39,498 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:09:41,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:09:45,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:09:47,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:09:49,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 06:09:49,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:09:49,862 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=273360.0, ans=0.2 2023-09-29 06:09:51,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:09:52,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:09:55,887 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=273360.0, ans=0.0 2023-09-29 06:09:57,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 06:09:57,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 06:10:00,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:10:04,607 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=273426.6666666667, ans=0.0 2023-09-29 06:10:05,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:10:11,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:10:14,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 06:10:15,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 06:10:15,864 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:10:19,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:10:20,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:10:20,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 06:10:24,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 06:10:24,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 06:10:26,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:10:28,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:10:32,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:10:34,260 INFO [train.py:1039] (2/4) Epoch 8, batch 3850, loss[loss=0.227, simple_loss=0.3027, pruned_loss=0.07568, over 24073.00 frames. ], tot_loss[loss=0.2175, simple_loss=0.2842, pruned_loss=0.0754, over 4721436.69 frames. ], batch size: 80, lr: 1.28e-02, grad_scale: 32.0 2023-09-29 06:10:34,415 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 06:10:40,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:10:41,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 06:10:42,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:10:44,108 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:10:44,598 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=273560.0, ans=0.125 2023-09-29 06:10:47,220 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 06:10:48,872 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:10:50,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 06:10:53,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 06:11:00,304 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:11:02,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:11:05,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:11:05,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:11:09,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:11:10,079 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:11:11,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:11:11,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:11:13,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:11:14,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:11:15,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:11:15,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:11:16,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 06:11:16,473 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 06:11:16,957 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.14 vs. limit=10.0 2023-09-29 06:11:18,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:11:18,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:11:21,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:11:21,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:11:21,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 06:11:24,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 06:11:26,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:11:28,542 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 06:11:30,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 06:11:35,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:11:37,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:11:43,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:11:43,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 06:11:45,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 06:11:48,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:11:48,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:11:48,776 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=273826.6666666667, ans=0.2 2023-09-29 06:11:50,370 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.736e+02 2.044e+02 2.316e+02 2.829e+02 5.158e+02, threshold=4.631e+02, percent-clipped=0.0 2023-09-29 06:11:52,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 06:11:52,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 06:11:52,212 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:11:53,680 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:11:53,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:11:53,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 06:11:55,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:11:55,615 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=273893.3333333333, ans=0.1 2023-09-29 06:11:56,710 INFO [train.py:1039] (2/4) Epoch 8, batch 3900, loss[loss=0.2072, simple_loss=0.2749, pruned_loss=0.06971, over 23286.00 frames. ], tot_loss[loss=0.2161, simple_loss=0.2833, pruned_loss=0.07444, over 4737472.90 frames. ], batch size: 105, lr: 1.28e-02, grad_scale: 32.0 2023-09-29 06:11:56,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 06:11:56,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:11:56,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:11:57,237 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:11:57,529 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.87 vs. limit=22.5 2023-09-29 06:11:58,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:11:59,316 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.72 vs. limit=22.5 2023-09-29 06:11:59,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:12:02,029 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:12:02,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:12:02,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:12:04,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:12:04,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 06:12:05,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:12:08,701 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:12:10,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 06:12:10,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:12:11,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:12:13,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 06:12:15,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:12:16,872 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 06:12:18,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 06:12:18,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:12:19,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 06:12:20,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:12:21,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 06:12:23,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 06:12:25,566 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer_na.min_abs, batch_count=273960.0, ans=0.02 2023-09-29 06:12:28,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:12:29,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:12:29,990 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:12:31,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:12:34,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:12:36,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:12:39,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:12:39,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:12:39,151 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:12:45,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:12:45,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:12:53,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 06:12:55,141 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:13:05,020 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:13:05,136 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=274160.0, ans=0.1 2023-09-29 06:13:08,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:13:08,140 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 06:13:08,200 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 06:13:08,237 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:13:08,480 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=274160.0, ans=0.1 2023-09-29 06:13:10,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 06:13:12,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:13:12,771 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=274160.0, ans=0.2 2023-09-29 06:13:13,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 06:13:15,752 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:13:20,395 INFO [train.py:1039] (2/4) Epoch 8, batch 3950, loss[loss=0.2178, simple_loss=0.2791, pruned_loss=0.07828, over 23417.00 frames. ], tot_loss[loss=0.2161, simple_loss=0.2831, pruned_loss=0.07452, over 4740664.38 frames. ], batch size: 120, lr: 1.27e-02, grad_scale: 32.0 2023-09-29 06:13:22,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:13:22,234 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer_ff2.min_abs, batch_count=274226.6666666667, ans=0.1 2023-09-29 06:13:23,477 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 06:13:23,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:13:26,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:13:29,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:13:36,107 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 06:13:36,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:13:36,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 06:13:37,595 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 06:13:37,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:13:40,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:13:40,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:13:40,724 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:13:45,512 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 06:13:47,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:13:47,960 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.51 vs. limit=15.0 2023-09-29 06:13:49,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:13:49,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:13:50,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:13:52,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:13:54,064 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=274360.0, ans=0.125 2023-09-29 06:14:03,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:14:03,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:14:07,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 06:14:12,718 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 06:14:12,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 06:14:14,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:14:15,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:14:24,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 06:14:24,769 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:14:25,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 06:14:26,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:14:26,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:14:26,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 06:14:31,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:14:32,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:14:35,028 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=274493.3333333333, ans=0.0 2023-09-29 06:14:36,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 06:14:37,527 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.764e+02 2.191e+02 2.483e+02 2.950e+02 4.567e+02, threshold=4.966e+02, percent-clipped=0.0 2023-09-29 06:14:42,737 INFO [train.py:1039] (2/4) Epoch 8, batch 4000, loss[loss=0.2284, simple_loss=0.2875, pruned_loss=0.08466, over 23456.00 frames. ], tot_loss[loss=0.2176, simple_loss=0.2846, pruned_loss=0.07531, over 4731255.75 frames. ], batch size: 285, lr: 1.27e-02, grad_scale: 32.0 2023-09-29 06:14:46,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:14:48,534 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.62 vs. limit=15.0 2023-09-29 06:14:55,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:15:02,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:15:02,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:15:02,439 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:15:03,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 06:15:03,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 06:15:04,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 06:15:06,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:15:06,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 06:15:07,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:15:11,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:15:11,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:15:11,200 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:15:12,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:15:12,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 06:15:14,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:15:15,889 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 06:15:17,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:15:17,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:15:21,130 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 06:15:22,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 06:15:22,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:15:28,896 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 06:15:31,067 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:15:32,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:15:34,628 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 06:15:36,193 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:15:36,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 06:15:36,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:15:36,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:15:37,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:15:40,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:15:41,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 06:15:41,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:15:43,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 06:15:43,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:15:46,115 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 06:15:46,872 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.83 vs. limit=15.0 2023-09-29 06:15:52,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:15:54,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 06:15:57,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:15:57,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:15:58,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:16:00,689 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=274826.6666666667, ans=0.125 2023-09-29 06:16:01,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:16:04,754 INFO [train.py:1039] (2/4) Epoch 8, batch 4050, loss[loss=0.2001, simple_loss=0.2836, pruned_loss=0.0583, over 24611.00 frames. ], tot_loss[loss=0.219, simple_loss=0.2857, pruned_loss=0.07615, over 4732547.25 frames. ], batch size: 68, lr: 1.27e-02, grad_scale: 32.0 2023-09-29 06:16:08,701 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:16:11,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 06:16:11,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 06:16:13,783 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:16:15,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:16:17,345 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:16:18,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 06:16:18,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:16:22,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:16:26,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:16:26,682 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 06:16:30,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:16:30,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:16:32,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:16:35,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 06:16:38,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 06:16:38,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 06:16:39,936 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 06:16:40,182 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=275026.6666666667, ans=0.0 2023-09-29 06:16:41,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:16:49,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 06:16:50,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:16:52,880 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=275026.6666666667, ans=0.125 2023-09-29 06:16:55,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:16:58,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:17:00,022 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:17:00,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:17:03,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:17:06,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 06:17:06,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 06:17:08,390 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:17:08,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 06:17:10,438 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=275160.0, ans=0.125 2023-09-29 06:17:13,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:17:21,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 06:17:21,660 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=275160.0, ans=0.2 2023-09-29 06:17:23,487 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 2.014e+02 2.168e+02 2.447e+02 3.458e+02, threshold=4.336e+02, percent-clipped=0.0 2023-09-29 06:17:23,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:17:23,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:17:27,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 06:17:27,189 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 06:17:27,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:17:28,554 INFO [train.py:1039] (2/4) Epoch 8, batch 4100, loss[loss=0.3126, simple_loss=0.3466, pruned_loss=0.1393, over 19306.00 frames. ], tot_loss[loss=0.2186, simple_loss=0.2854, pruned_loss=0.07596, over 4720497.15 frames. ], batch size: 388, lr: 1.27e-02, grad_scale: 32.0 2023-09-29 06:17:28,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:17:30,241 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:17:30,274 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:17:36,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 06:17:38,041 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 06:17:38,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 06:17:40,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 06:17:40,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:17:40,507 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:17:40,561 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:17:40,593 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:17:42,164 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 06:17:44,004 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:17:45,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:17:46,022 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:17:47,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:17:51,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 06:17:53,454 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:17:53,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:17:53,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 06:17:53,804 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=275293.3333333333, ans=0.125 2023-09-29 06:17:57,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:17:57,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:17:57,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:17:57,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:17:59,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 06:18:00,752 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:18:02,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 06:18:03,926 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:18:06,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:18:06,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 06:18:08,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:18:08,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:18:08,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:18:11,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 06:18:13,344 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=275360.0, ans=0.125 2023-09-29 06:18:14,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:18:14,978 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:18:16,597 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 06:18:16,922 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=275426.6666666667, ans=0.125 2023-09-29 06:18:18,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:18:18,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 06:18:18,998 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=275426.6666666667, ans=0.125 2023-09-29 06:18:23,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:18:27,961 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:18:30,344 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=275426.6666666667, ans=0.2 2023-09-29 06:18:32,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:18:32,303 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:18:42,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:18:42,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:18:45,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:18:45,810 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=275493.3333333333, ans=0.0 2023-09-29 06:18:47,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:18:49,046 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=275560.0, ans=0.125 2023-09-29 06:18:50,495 INFO [train.py:1039] (2/4) Epoch 8, batch 4150, loss[loss=0.2651, simple_loss=0.3057, pruned_loss=0.1122, over 19708.00 frames. ], tot_loss[loss=0.2194, simple_loss=0.2858, pruned_loss=0.07648, over 4708381.66 frames. ], batch size: 388, lr: 1.27e-02, grad_scale: 16.0 2023-09-29 06:18:50,881 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 06:18:52,370 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 06:18:54,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:18:54,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:18:54,600 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=275560.0, ans=0.125 2023-09-29 06:18:57,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 06:18:58,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:18:58,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 06:19:00,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 06:19:00,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 06:19:01,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:19:04,469 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=275560.0, ans=0.0 2023-09-29 06:19:08,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:19:08,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:19:13,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:19:14,708 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:19:14,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 06:19:16,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 06:19:16,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:19:17,996 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 06:19:21,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:19:25,041 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:19:28,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 06:19:29,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 06:19:29,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:19:30,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 06:19:30,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:19:30,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:19:33,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:19:34,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:19:38,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 06:19:42,291 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 06:19:43,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:19:45,343 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 06:19:45,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:19:47,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 06:19:48,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:19:50,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:19:50,736 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.72 vs. limit=15.0 2023-09-29 06:19:51,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:19:51,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 06:19:51,789 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:19:51,792 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 06:19:55,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 06:19:56,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 06:19:56,933 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:19:56,939 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 06:19:56,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 06:19:58,441 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 06:20:00,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:20:00,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 06:20:01,915 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:20:03,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:20:03,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 06:20:03,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 06:20:09,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:20:11,682 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.524e+02 2.232e+02 3.048e+02 3.830e+02 6.363e+02, threshold=6.096e+02, percent-clipped=13.0 2023-09-29 06:20:13,738 INFO [train.py:1039] (2/4) Epoch 8, batch 4200, loss[loss=0.2136, simple_loss=0.293, pruned_loss=0.06712, over 24597.00 frames. ], tot_loss[loss=0.2185, simple_loss=0.2844, pruned_loss=0.07624, over 4706434.75 frames. ], batch size: 71, lr: 1.27e-02, grad_scale: 8.0 2023-09-29 06:20:13,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 06:20:15,494 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 06:20:17,211 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:20:19,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:20:20,089 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:20:20,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:20:23,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 06:20:23,400 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=275893.3333333333, ans=0.125 2023-09-29 06:20:26,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 06:20:26,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:20:29,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:20:31,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:20:35,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 06:20:38,247 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:20:38,288 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:20:38,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 06:20:38,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:20:38,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:20:40,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:20:40,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:20:40,765 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=275960.0, ans=0.0 2023-09-29 06:20:41,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:20:43,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 06:20:43,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:20:46,222 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=276026.6666666667, ans=0.2 2023-09-29 06:20:48,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 06:20:50,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:20:53,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:20:53,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:20:55,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:20:55,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 06:20:55,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:20:58,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:20:58,679 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=276026.6666666667, ans=0.2 2023-09-29 06:21:03,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 06:21:05,765 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:21:10,541 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=276093.3333333333, ans=0.125 2023-09-29 06:21:10,554 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=276093.3333333333, ans=0.09899494936611666 2023-09-29 06:21:13,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 06:21:16,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 06:21:18,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:21:23,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 06:21:24,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:21:28,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 06:21:29,904 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=276160.0, ans=0.0 2023-09-29 06:21:32,656 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 06:21:34,621 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=276226.6666666667, ans=0.125 2023-09-29 06:21:35,629 INFO [train.py:1039] (2/4) Epoch 8, batch 4250, loss[loss=0.2191, simple_loss=0.2712, pruned_loss=0.08351, over 22807.00 frames. ], tot_loss[loss=0.2174, simple_loss=0.2827, pruned_loss=0.07601, over 4702327.82 frames. ], batch size: 322, lr: 1.27e-02, grad_scale: 8.0 2023-09-29 06:21:37,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:21:37,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 06:21:41,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:21:45,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 06:21:47,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 06:21:47,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:21:52,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:21:55,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:21:59,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:21:59,407 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:22:02,532 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:22:02,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:22:05,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:22:05,633 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:22:07,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:22:08,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:22:10,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:22:12,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 06:22:17,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 06:22:17,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:22:17,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:22:17,302 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:22:18,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:22:20,088 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:22:20,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:22:24,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 06:22:25,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:22:28,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:22:30,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:22:32,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 06:22:32,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:22:33,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 06:22:35,257 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 06:22:36,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 06:22:38,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:22:38,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:22:41,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 06:22:43,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 06:22:43,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 06:22:46,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:22:50,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:22:51,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:22:53,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:22:53,924 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:22:55,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:22:56,719 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.653e+02 2.004e+02 2.225e+02 2.717e+02 4.251e+02, threshold=4.450e+02, percent-clipped=0.0 2023-09-29 06:22:56,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:22:56,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 06:22:57,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:22:58,468 INFO [train.py:1039] (2/4) Epoch 8, batch 4300, loss[loss=0.2497, simple_loss=0.3023, pruned_loss=0.09854, over 23798.00 frames. ], tot_loss[loss=0.2176, simple_loss=0.2832, pruned_loss=0.07607, over 4702043.14 frames. ], batch size: 164, lr: 1.27e-02, grad_scale: 8.0 2023-09-29 06:23:05,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:23:05,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:23:07,313 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=276560.0, ans=0.125 2023-09-29 06:23:08,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:23:10,444 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=276560.0, ans=0.0 2023-09-29 06:23:16,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:23:16,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 06:23:16,386 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:23:20,660 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 06:23:20,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:23:20,715 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 06:23:25,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 06:23:27,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 06:23:30,991 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 06:23:31,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:23:31,046 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 06:23:32,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 06:23:35,550 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:23:38,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:23:38,635 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:23:40,834 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:23:43,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:23:45,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:23:45,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 06:23:45,349 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 06:23:46,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:23:51,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:23:51,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 06:23:51,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:23:51,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:23:51,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 06:23:51,481 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 06:23:52,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 06:23:53,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:23:53,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 06:23:53,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 06:23:54,007 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.74 vs. limit=15.0 2023-09-29 06:23:57,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:23:59,248 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 06:24:01,447 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:24:04,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:24:04,146 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:24:05,809 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 06:24:07,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 06:24:07,286 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:24:08,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:24:08,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:24:08,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:24:11,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:24:15,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:24:16,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:24:16,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:24:20,327 INFO [train.py:1039] (2/4) Epoch 8, batch 4350, loss[loss=0.2365, simple_loss=0.3032, pruned_loss=0.08495, over 23973.00 frames. ], tot_loss[loss=0.2188, simple_loss=0.2844, pruned_loss=0.07661, over 4705383.59 frames. ], batch size: 86, lr: 1.27e-02, grad_scale: 8.0 2023-09-29 06:24:22,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 06:24:22,122 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 06:24:26,864 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:24:27,190 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=276893.3333333333, ans=0.125 2023-09-29 06:24:31,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:24:33,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:24:33,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:24:33,717 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=276893.3333333333, ans=0.2 2023-09-29 06:24:39,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:24:44,448 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:24:47,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:24:47,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:24:49,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 06:24:52,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:24:53,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 06:24:57,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 06:24:57,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:24:59,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:25:05,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:25:07,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 06:25:08,502 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=277093.3333333333, ans=0.125 2023-09-29 06:25:11,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:25:11,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 06:25:15,925 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 06:25:17,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:25:17,477 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 06:25:18,949 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 06:25:20,346 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 06:25:20,366 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:25:20,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:25:21,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:25:23,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:25:23,419 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:25:23,482 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:25:27,035 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 06:25:27,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:25:27,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:25:28,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:25:28,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 06:25:30,084 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 06:25:30,090 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 06:25:30,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 06:25:33,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:25:33,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:25:35,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:25:35,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:25:35,483 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=277160.0, ans=0.125 2023-09-29 06:25:35,723 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=277160.0, ans=0.1 2023-09-29 06:25:38,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 06:25:39,661 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.760e+02 2.095e+02 2.321e+02 2.736e+02 4.922e+02, threshold=4.641e+02, percent-clipped=1.0 2023-09-29 06:25:41,148 INFO [train.py:1039] (2/4) Epoch 8, batch 4400, loss[loss=0.2195, simple_loss=0.2935, pruned_loss=0.07274, over 24480.00 frames. ], tot_loss[loss=0.22, simple_loss=0.2859, pruned_loss=0.07701, over 4696921.33 frames. ], batch size: 66, lr: 1.27e-02, grad_scale: 16.0 2023-09-29 06:25:41,246 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 06:25:41,265 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:25:46,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:25:46,846 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:25:48,520 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:25:51,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 06:25:51,498 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 06:25:51,547 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 06:25:51,597 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 06:25:53,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 06:25:53,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:25:54,682 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 06:25:56,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:25:56,658 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:25:57,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:25:59,262 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 06:26:00,910 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:26:00,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 06:26:01,146 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=277293.3333333333, ans=0.2 2023-09-29 06:26:01,195 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=277293.3333333333, ans=0.0 2023-09-29 06:26:02,775 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 06:26:06,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 06:26:06,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 06:26:06,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 06:26:06,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:26:08,368 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:26:08,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:26:08,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:26:10,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 06:26:11,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 06:26:12,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:26:15,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:26:15,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:26:18,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:26:18,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:26:18,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 06:26:18,759 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 06:26:18,899 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=277360.0, ans=0.2 2023-09-29 06:26:18,927 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=277360.0, ans=0.0 2023-09-29 06:26:23,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:26:29,267 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:26:33,526 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 06:26:38,692 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:26:40,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:26:42,188 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:26:44,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 06:26:44,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:26:44,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:26:44,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:26:45,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 06:26:50,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 06:26:54,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 06:26:55,503 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.42 vs. limit=12.0 2023-09-29 06:26:55,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 06:26:56,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:26:56,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 06:26:56,157 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:26:57,099 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.52 vs. limit=15.0 2023-09-29 06:26:59,250 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:27:00,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 06:27:03,749 INFO [train.py:1039] (2/4) Epoch 8, batch 4450, loss[loss=0.3086, simple_loss=0.3445, pruned_loss=0.1363, over 19569.00 frames. ], tot_loss[loss=0.2203, simple_loss=0.2864, pruned_loss=0.07707, over 4693926.88 frames. ], batch size: 388, lr: 1.27e-02, grad_scale: 16.0 2023-09-29 06:27:03,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:27:06,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:27:08,407 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:27:15,164 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:27:16,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:27:20,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:27:22,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:27:27,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:27:27,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:27:27,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 06:27:27,657 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:27:27,766 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:27:27,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:27:27,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:27:30,893 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 06:27:32,656 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=277626.6666666667, ans=0.1 2023-09-29 06:27:36,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:27:36,980 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:27:38,499 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:27:38,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:27:40,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:27:45,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 06:27:46,034 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.41 vs. limit=22.5 2023-09-29 06:27:46,889 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 06:27:46,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 06:27:46,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:27:51,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:27:53,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 06:27:56,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 06:28:00,872 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:28:02,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 06:28:02,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:28:02,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:28:02,299 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:28:02,318 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:28:02,640 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=277760.0, ans=0.2 2023-09-29 06:28:05,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:28:06,981 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=277760.0, ans=0.125 2023-09-29 06:28:08,353 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 06:28:08,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 06:28:10,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 06:28:11,738 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:28:13,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:28:14,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:28:16,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 06:28:19,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 06:28:21,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 06:28:24,957 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 2.112e+02 2.512e+02 3.151e+02 6.272e+02, threshold=5.024e+02, percent-clipped=2.0 2023-09-29 06:28:25,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:28:26,472 INFO [train.py:1039] (2/4) Epoch 8, batch 4500, loss[loss=0.1965, simple_loss=0.2709, pruned_loss=0.06099, over 24482.00 frames. ], tot_loss[loss=0.2193, simple_loss=0.2859, pruned_loss=0.07638, over 4717539.38 frames. ], batch size: 63, lr: 1.27e-02, grad_scale: 16.0 2023-09-29 06:28:28,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:28:29,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 06:28:29,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 06:28:32,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:28:39,821 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:28:39,902 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:28:41,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 06:28:41,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:28:41,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:28:42,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:28:48,272 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=277960.0, ans=0.1 2023-09-29 06:28:48,306 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=277960.0, ans=0.125 2023-09-29 06:28:53,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:28:55,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:28:57,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:28:57,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:28:59,230 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:29:05,473 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 06:29:07,767 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=278026.6666666667, ans=0.07 2023-09-29 06:29:11,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:29:15,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 06:29:18,597 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:29:20,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 06:29:20,236 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:29:21,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:29:23,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:29:23,199 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:29:26,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:29:26,336 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 06:29:26,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 06:29:26,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:29:31,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:29:31,792 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:29:34,739 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:29:37,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 06:29:37,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:29:39,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 06:29:39,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 06:29:39,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 06:29:39,971 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=278160.0, ans=0.125 2023-09-29 06:29:45,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 06:29:48,372 INFO [train.py:1039] (2/4) Epoch 8, batch 4550, loss[loss=0.2142, simple_loss=0.2883, pruned_loss=0.07008, over 24667.00 frames. ], tot_loss[loss=0.2177, simple_loss=0.2841, pruned_loss=0.07565, over 4720549.93 frames. ], batch size: 65, lr: 1.27e-02, grad_scale: 16.0 2023-09-29 06:29:48,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 06:29:49,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:29:50,256 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=278226.6666666667, ans=0.125 2023-09-29 06:29:53,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:29:53,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:29:56,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:29:58,216 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=278226.6666666667, ans=0.125 2023-09-29 06:30:00,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:30:04,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:30:06,008 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:30:06,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:30:06,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:30:06,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=278293.3333333333, ans=0.125 2023-09-29 06:30:09,021 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:30:09,082 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:30:12,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:30:16,589 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 06:30:18,046 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 06:30:18,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:30:21,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 06:30:22,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 06:30:24,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:30:27,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 06:30:29,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 06:30:32,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:30:32,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:30:32,300 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 06:30:35,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 06:30:39,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:30:40,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:30:40,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:30:44,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:30:44,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 06:30:44,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 06:30:45,792 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:30:45,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 06:30:49,664 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 06:30:49,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:30:51,907 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:30:51,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:30:53,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:30:53,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:30:55,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 06:30:55,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 06:30:56,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:30:56,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 06:30:58,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 06:30:58,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:30:58,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 06:31:01,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:31:01,410 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:31:04,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:31:04,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:31:04,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 06:31:05,904 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:31:09,474 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.931e+02 2.102e+02 2.382e+02 3.783e+02, threshold=4.205e+02, percent-clipped=0.0 2023-09-29 06:31:09,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 06:31:11,126 INFO [train.py:1039] (2/4) Epoch 8, batch 4600, loss[loss=0.2213, simple_loss=0.3021, pruned_loss=0.07019, over 24324.00 frames. ], tot_loss[loss=0.2157, simple_loss=0.282, pruned_loss=0.07476, over 4712373.75 frames. ], batch size: 74, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:31:11,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:31:12,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:31:16,345 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:31:16,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:31:16,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:31:16,646 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 06:31:19,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:31:22,690 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.28 vs. limit=22.5 2023-09-29 06:31:24,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:31:24,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:31:27,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:31:33,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 06:31:35,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:31:38,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:31:40,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:31:42,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:31:48,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 06:31:48,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 06:31:50,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:31:52,699 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=278693.3333333333, ans=0.0 2023-09-29 06:31:56,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:31:56,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 06:31:58,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:32:01,315 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 06:32:02,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 06:32:07,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:32:09,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:32:11,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:32:11,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 06:32:12,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:32:12,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 06:32:13,536 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:32:15,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:32:15,739 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=278826.6666666667, ans=0.1 2023-09-29 06:32:18,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:32:18,622 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:32:18,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:32:20,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 06:32:20,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 06:32:20,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 06:32:20,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:32:21,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:32:21,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:32:23,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:32:33,260 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=278893.3333333333, ans=0.125 2023-09-29 06:32:34,346 INFO [train.py:1039] (2/4) Epoch 8, batch 4650, loss[loss=0.2211, simple_loss=0.3015, pruned_loss=0.07037, over 24385.00 frames. ], tot_loss[loss=0.2154, simple_loss=0.2816, pruned_loss=0.07458, over 4711250.64 frames. ], batch size: 77, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:32:35,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:32:38,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:32:40,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:32:40,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:32:41,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:32:41,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:32:42,337 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.39 vs. limit=15.0 2023-09-29 06:32:43,144 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:32:46,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 06:32:50,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:32:53,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 06:32:53,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:32:53,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 06:32:54,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:32:54,727 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 06:32:54,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 06:32:54,774 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:32:56,215 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:32:56,672 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=278960.0, ans=0.125 2023-09-29 06:32:56,678 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=278960.0, ans=0.2 2023-09-29 06:32:57,910 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:32:58,553 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.78 vs. limit=12.0 2023-09-29 06:33:01,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:33:01,465 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 06:33:04,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:33:06,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 06:33:08,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:33:08,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:33:11,520 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 06:33:13,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:33:16,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:33:18,105 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:33:24,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:33:27,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:33:28,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:33:28,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:33:31,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 06:33:31,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 06:33:31,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 06:33:31,320 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 06:33:33,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:33:33,596 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=279093.3333333333, ans=0.0 2023-09-29 06:33:39,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:33:39,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:33:39,549 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 06:33:39,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:33:43,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:33:43,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:33:43,347 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:33:46,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:33:46,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:33:47,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:33:48,550 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.19 vs. limit=15.0 2023-09-29 06:33:51,009 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=279160.0, ans=0.125 2023-09-29 06:33:52,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:33:52,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:33:52,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 06:33:53,551 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 2.046e+02 2.215e+02 2.491e+02 3.733e+02, threshold=4.429e+02, percent-clipped=0.0 2023-09-29 06:33:53,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 06:33:55,132 INFO [train.py:1039] (2/4) Epoch 8, batch 4700, loss[loss=0.1866, simple_loss=0.2656, pruned_loss=0.05377, over 24479.00 frames. ], tot_loss[loss=0.2154, simple_loss=0.2824, pruned_loss=0.07423, over 4724076.77 frames. ], batch size: 63, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:33:55,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 06:33:57,557 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 06:34:05,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:34:07,630 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:34:07,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:34:07,965 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=279226.6666666667, ans=0.125 2023-09-29 06:34:09,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:34:09,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 06:34:09,766 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=279226.6666666667, ans=0.125 2023-09-29 06:34:16,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 06:34:17,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 06:34:19,249 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:34:22,045 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:34:22,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:34:23,006 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.51 vs. limit=6.0 2023-09-29 06:34:23,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:34:24,985 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.32 vs. limit=15.0 2023-09-29 06:34:28,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 06:34:29,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 06:34:33,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:34:43,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 06:34:44,778 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:34:46,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:34:46,837 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=279426.6666666667, ans=0.1 2023-09-29 06:34:49,602 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:34:51,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 06:34:53,007 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:34:56,151 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:34:57,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 06:34:57,933 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=279426.6666666667, ans=0.0 2023-09-29 06:34:59,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:34:59,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:35:02,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:35:02,286 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:35:02,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 06:35:02,423 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 06:35:05,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:35:07,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:35:07,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:35:07,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 06:35:08,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:35:10,085 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.13 vs. limit=6.0 2023-09-29 06:35:12,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 06:35:15,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:35:15,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:35:17,384 INFO [train.py:1039] (2/4) Epoch 8, batch 4750, loss[loss=0.2217, simple_loss=0.2811, pruned_loss=0.08117, over 23581.00 frames. ], tot_loss[loss=0.2162, simple_loss=0.2836, pruned_loss=0.07435, over 4728803.91 frames. ], batch size: 256, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:35:21,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:35:21,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:35:24,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 06:35:24,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:35:26,473 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=279560.0, ans=0.0 2023-09-29 06:35:27,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 06:35:29,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:35:30,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:35:31,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:35:38,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 06:35:42,474 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:35:45,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 06:35:46,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:35:50,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:35:50,170 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:35:50,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:35:51,644 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 06:35:51,660 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 06:36:00,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 06:36:04,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:36:06,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:36:09,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 06:36:09,740 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 06:36:09,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:36:12,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:36:15,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:36:16,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 06:36:17,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 06:36:17,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:36:18,865 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:36:18,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:36:19,143 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=279760.0, ans=0.125 2023-09-29 06:36:20,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 06:36:22,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 06:36:22,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 06:36:25,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:36:26,464 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=12.92 vs. limit=15.0 2023-09-29 06:36:27,322 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:36:27,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 06:36:27,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:36:29,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:36:30,569 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 06:36:32,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:36:34,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 06:36:38,030 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 2.176e+02 2.410e+02 2.744e+02 3.912e+02, threshold=4.820e+02, percent-clipped=0.0 2023-09-29 06:36:38,263 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:36:39,647 INFO [train.py:1039] (2/4) Epoch 8, batch 4800, loss[loss=0.1964, simple_loss=0.2641, pruned_loss=0.06433, over 21594.00 frames. ], tot_loss[loss=0.2175, simple_loss=0.2846, pruned_loss=0.07523, over 4712027.48 frames. ], batch size: 47, lr: 1.26e-02, grad_scale: 32.0 2023-09-29 06:36:39,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 06:36:41,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 06:36:41,472 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=279893.3333333333, ans=0.125 2023-09-29 06:36:42,593 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 06:36:44,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 06:36:44,335 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:36:45,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 06:36:51,711 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:36:52,067 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=279893.3333333333, ans=0.0 2023-09-29 06:36:53,117 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:36:59,793 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:36:59,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:36:59,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:37:01,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 06:37:01,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:37:03,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:37:04,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:37:08,631 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:37:10,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:37:10,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:37:12,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:37:12,223 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 06:37:12,245 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:37:12,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:37:16,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:37:18,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:37:21,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:37:21,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 06:37:21,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 06:37:23,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:37:24,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 06:37:24,648 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 06:37:26,129 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:37:26,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:37:26,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:37:26,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:37:26,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:37:28,008 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=280093.3333333333, ans=0.0 2023-09-29 06:37:29,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:37:29,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:37:31,711 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=280093.3333333333, ans=0.1 2023-09-29 06:37:33,245 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:37:36,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:37:39,297 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:37:41,890 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=280093.3333333333, ans=0.0 2023-09-29 06:37:43,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 06:37:43,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:37:45,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:37:45,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 06:37:46,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:37:49,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:37:50,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:37:50,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:37:51,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:37:51,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:37:53,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:37:54,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:37:55,667 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.60 vs. limit=15.0 2023-09-29 06:37:56,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:37:56,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:37:57,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 06:38:00,236 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.47 vs. limit=6.0 2023-09-29 06:38:00,678 INFO [train.py:1039] (2/4) Epoch 8, batch 4850, loss[loss=0.2255, simple_loss=0.305, pruned_loss=0.07301, over 24679.00 frames. ], tot_loss[loss=0.2188, simple_loss=0.2854, pruned_loss=0.07605, over 4713421.41 frames. ], batch size: 73, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:38:00,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 06:38:00,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:38:00,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:38:00,920 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:38:00,922 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:38:05,964 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:38:13,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 06:38:16,731 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:38:21,366 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:38:22,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 06:38:22,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:38:26,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:38:28,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:38:29,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:38:29,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 06:38:29,418 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=280293.3333333333, ans=0.125 2023-09-29 06:38:29,478 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=280293.3333333333, ans=0.125 2023-09-29 06:38:32,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:38:35,357 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:38:37,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 06:38:37,572 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 06:38:37,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 06:38:40,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:38:40,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:38:45,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:38:45,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 06:38:45,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 06:38:46,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 06:38:49,253 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=280426.6666666667, ans=0.2 2023-09-29 06:38:55,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:38:55,904 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 06:38:57,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:38:57,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:38:58,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 06:39:01,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 06:39:01,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:39:02,130 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=280426.6666666667, ans=0.0 2023-09-29 06:39:03,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 06:39:03,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:39:03,567 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:39:04,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 06:39:06,957 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=280493.3333333333, ans=0.0 2023-09-29 06:39:06,962 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=280493.3333333333, ans=0.125 2023-09-29 06:39:08,913 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=6.08 vs. limit=12.0 2023-09-29 06:39:14,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:39:19,643 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:39:19,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:39:23,646 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 2.212e+02 2.561e+02 3.191e+02 4.940e+02, threshold=5.123e+02, percent-clipped=1.0 2023-09-29 06:39:23,688 INFO [train.py:1039] (2/4) Epoch 8, batch 4900, loss[loss=0.2112, simple_loss=0.2799, pruned_loss=0.07126, over 23417.00 frames. ], tot_loss[loss=0.2181, simple_loss=0.2846, pruned_loss=0.07583, over 4701748.08 frames. ], batch size: 119, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:39:25,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 06:39:25,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:39:27,683 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=280560.0, ans=0.2 2023-09-29 06:39:30,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:39:32,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:39:33,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:39:37,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 06:39:41,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 06:39:41,744 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:39:45,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 06:39:45,536 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=280626.6666666667, ans=0.0 2023-09-29 06:39:46,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 06:39:46,866 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 06:39:48,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:39:48,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:39:48,199 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:39:49,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:39:49,692 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 06:39:52,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 06:39:52,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 06:39:54,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 06:39:54,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 06:40:00,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:40:00,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:40:01,806 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:40:01,819 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 06:40:03,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:40:04,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:40:04,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 06:40:06,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 06:40:09,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 06:40:11,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:40:12,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:40:12,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:40:12,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:40:14,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 06:40:14,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:40:14,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 06:40:18,001 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:40:19,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 06:40:21,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:40:24,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 06:40:24,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:40:24,411 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 06:40:25,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 06:40:35,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:40:36,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:40:38,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 06:40:38,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 06:40:38,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:40:39,892 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=280826.6666666667, ans=0.1 2023-09-29 06:40:41,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:40:44,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:40:44,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:40:44,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:40:44,313 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 06:40:45,616 INFO [train.py:1039] (2/4) Epoch 8, batch 4950, loss[loss=0.2025, simple_loss=0.2346, pruned_loss=0.08525, over 19097.00 frames. ], tot_loss[loss=0.2167, simple_loss=0.2827, pruned_loss=0.07534, over 4691319.77 frames. ], batch size: 389, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:40:45,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 06:40:49,523 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:40:50,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 06:40:51,606 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=18.12 vs. limit=15.0 2023-09-29 06:40:53,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 06:40:54,035 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 06:40:54,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 06:40:55,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 06:40:55,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:40:55,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:40:55,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:40:57,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:40:58,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:40:58,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:41:01,050 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:41:02,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:41:04,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:41:04,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:41:04,984 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=280960.0, ans=0.95 2023-09-29 06:41:07,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 06:41:09,982 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.31 vs. limit=15.0 2023-09-29 06:41:12,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:41:14,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:41:17,144 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:41:17,224 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:41:18,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:41:21,053 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 06:41:22,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 06:41:24,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:41:27,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:41:27,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:41:28,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:41:28,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:41:28,781 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 06:41:29,574 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.62 vs. limit=15.0 2023-09-29 06:41:30,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:41:32,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 06:41:34,576 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=281093.3333333333, ans=0.09899494936611666 2023-09-29 06:41:35,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:41:39,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:41:39,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:41:39,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 06:41:40,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:41:41,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:41:45,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:41:47,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:41:47,156 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:41:48,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:41:48,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:41:50,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:41:51,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:41:51,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:41:51,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:41:55,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 06:41:55,637 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=281160.0, ans=0.1 2023-09-29 06:41:59,777 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:42:04,703 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.77 vs. limit=15.0 2023-09-29 06:42:05,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 06:42:05,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 06:42:07,486 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.702e+02 2.069e+02 2.336e+02 2.676e+02 4.238e+02, threshold=4.671e+02, percent-clipped=0.0 2023-09-29 06:42:07,528 INFO [train.py:1039] (2/4) Epoch 8, batch 5000, loss[loss=0.2169, simple_loss=0.2735, pruned_loss=0.08014, over 23525.00 frames. ], tot_loss[loss=0.2162, simple_loss=0.2826, pruned_loss=0.07489, over 4698844.49 frames. ], batch size: 256, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:42:12,477 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=281226.6666666667, ans=0.125 2023-09-29 06:42:13,799 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:42:13,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 06:42:15,910 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 06:42:16,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 06:42:17,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:42:20,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 06:42:20,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:42:20,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 06:42:23,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 06:42:23,573 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:42:25,081 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:42:25,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 06:42:25,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:42:25,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:42:28,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 06:42:29,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 06:42:29,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:42:31,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 06:42:31,107 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 06:42:31,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:42:31,262 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 06:42:31,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 06:42:33,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 06:42:34,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 06:42:34,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:42:34,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:42:36,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 06:42:36,530 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 06:42:38,204 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:42:39,507 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:42:40,861 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:42:41,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 06:42:43,174 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 06:42:44,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:42:48,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:42:48,481 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=281360.0, ans=0.125 2023-09-29 06:42:51,415 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 06:42:54,157 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.46 vs. limit=15.0 2023-09-29 06:42:55,087 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:42:56,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:42:56,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:42:59,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 06:42:59,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:43:01,163 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:43:01,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:43:04,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 06:43:04,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:43:07,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:43:09,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:43:13,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 06:43:17,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:43:26,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:43:28,467 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:43:28,479 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:43:29,898 INFO [train.py:1039] (2/4) Epoch 8, batch 5050, loss[loss=0.1988, simple_loss=0.2834, pruned_loss=0.05709, over 24589.00 frames. ], tot_loss[loss=0.2167, simple_loss=0.2829, pruned_loss=0.0752, over 4704028.29 frames. ], batch size: 68, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:43:29,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:43:30,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 06:43:30,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 06:43:30,131 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:43:32,043 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=281560.0, ans=0.2 2023-09-29 06:43:33,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:43:34,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 06:43:35,205 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer_na.min_abs, batch_count=281560.0, ans=0.02 2023-09-29 06:43:36,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:43:37,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:43:40,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:43:40,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 06:43:41,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:43:41,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:43:44,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 06:43:46,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:43:46,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 06:43:55,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 06:43:55,717 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 06:43:57,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:43:57,450 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=281626.6666666667, ans=0.0 2023-09-29 06:43:58,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 06:43:58,690 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:44:01,429 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=14.33 vs. limit=15.0 2023-09-29 06:44:01,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:44:01,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:44:03,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:44:03,440 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 06:44:03,581 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 06:44:03,721 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=281693.3333333333, ans=0.1 2023-09-29 06:44:03,791 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=281693.3333333333, ans=0.0 2023-09-29 06:44:05,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:44:05,308 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=281693.3333333333, ans=0.125 2023-09-29 06:44:08,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:44:11,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:44:11,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 06:44:14,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:44:16,996 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.10 vs. limit=15.0 2023-09-29 06:44:17,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 06:44:18,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:44:19,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:44:21,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:44:21,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:44:21,709 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.06 vs. limit=15.0 2023-09-29 06:44:22,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:44:24,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:44:26,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:44:26,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:44:26,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:44:26,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 06:44:28,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:44:30,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:44:30,581 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=281760.0, ans=0.0 2023-09-29 06:44:33,642 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:44:35,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:44:35,056 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 06:44:35,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 06:44:37,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:44:38,600 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:44:38,638 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 06:44:41,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:44:41,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 06:44:41,613 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:44:44,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:44:46,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:44:46,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 06:44:48,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 06:44:50,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:44:51,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:44:51,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:44:52,963 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.731e+02 2.260e+02 2.527e+02 2.886e+02 4.203e+02, threshold=5.054e+02, percent-clipped=0.0 2023-09-29 06:44:53,005 INFO [train.py:1039] (2/4) Epoch 8, batch 5100, loss[loss=0.2052, simple_loss=0.2669, pruned_loss=0.07178, over 23756.00 frames. ], tot_loss[loss=0.2171, simple_loss=0.2832, pruned_loss=0.07549, over 4709723.51 frames. ], batch size: 164, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:44:53,306 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 06:44:55,000 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=281893.3333333333, ans=0.0 2023-09-29 06:44:56,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:44:59,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 06:44:59,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 06:44:59,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:45:00,281 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=281893.3333333333, ans=0.0 2023-09-29 06:45:02,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:45:06,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:45:06,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 06:45:06,588 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 06:45:13,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:45:13,386 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:45:17,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:45:21,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 06:45:22,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:45:24,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:45:24,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 06:45:26,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:45:27,854 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:45:27,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 06:45:31,481 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 06:45:32,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:45:33,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 06:45:33,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 06:45:34,937 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=282026.6666666667, ans=0.125 2023-09-29 06:45:36,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:45:36,361 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:45:45,983 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:45:49,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 06:45:49,446 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 06:45:49,469 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 06:45:51,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 06:45:52,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:45:55,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 06:46:00,096 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 06:46:02,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 06:46:03,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:46:06,205 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 06:46:07,712 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 06:46:08,988 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 06:46:13,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:46:13,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:46:13,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:46:14,704 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=15.30 vs. limit=22.5 2023-09-29 06:46:15,127 INFO [train.py:1039] (2/4) Epoch 8, batch 5150, loss[loss=0.2011, simple_loss=0.2787, pruned_loss=0.06181, over 24462.00 frames. ], tot_loss[loss=0.2182, simple_loss=0.2844, pruned_loss=0.07598, over 4715006.07 frames. ], batch size: 66, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:46:15,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:46:15,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 06:46:17,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:46:18,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 06:46:18,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 06:46:18,924 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 06:46:18,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:46:18,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 06:46:19,152 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:46:21,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 06:46:22,723 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:46:22,952 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=282226.6666666667, ans=0.2 2023-09-29 06:46:24,376 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:46:28,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 06:46:28,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 06:46:30,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:46:30,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:46:32,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 06:46:32,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:46:32,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:46:33,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:46:33,792 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:46:33,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 06:46:38,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:46:38,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:46:39,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 06:46:41,205 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 06:46:42,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:46:49,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:46:52,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 06:46:57,169 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:46:59,018 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.max_abs, batch_count=282360.0, ans=10.0 2023-09-29 06:47:03,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:47:03,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:47:03,637 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=282426.6666666667, ans=0.125 2023-09-29 06:47:06,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:47:07,958 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:47:11,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 06:47:16,769 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:47:19,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:47:19,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:47:21,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:47:23,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:47:25,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 06:47:29,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:47:29,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 06:47:31,654 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:47:31,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:47:33,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 06:47:33,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 06:47:33,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:47:35,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:47:38,004 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.727e+02 2.091e+02 2.433e+02 2.751e+02 4.119e+02, threshold=4.867e+02, percent-clipped=0.0 2023-09-29 06:47:38,048 INFO [train.py:1039] (2/4) Epoch 8, batch 5200, loss[loss=0.2106, simple_loss=0.2862, pruned_loss=0.06753, over 24624.00 frames. ], tot_loss[loss=0.22, simple_loss=0.2859, pruned_loss=0.07706, over 4697250.59 frames. ], batch size: 68, lr: 1.26e-02, grad_scale: 32.0 2023-09-29 06:47:39,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:47:41,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:47:44,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:47:50,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 06:47:50,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:47:51,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:47:54,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:47:56,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:47:56,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:47:57,052 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.90 vs. limit=10.0 2023-09-29 06:47:59,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 06:48:01,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 06:48:03,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:48:05,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 06:48:08,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 06:48:09,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:48:11,054 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 06:48:11,130 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 06:48:11,581 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=282693.3333333333, ans=0.125 2023-09-29 06:48:13,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 06:48:14,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:48:14,779 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 06:48:14,789 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:48:16,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:48:16,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:48:17,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 06:48:18,932 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.45 vs. limit=6.0 2023-09-29 06:48:19,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:48:21,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:48:21,868 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:48:25,181 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 06:48:25,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 06:48:25,402 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=282693.3333333333, ans=0.1 2023-09-29 06:48:26,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 06:48:29,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 06:48:29,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 06:48:37,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 06:48:37,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:48:39,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 06:48:41,016 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:48:41,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 06:48:41,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:48:41,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 06:48:44,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:48:45,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:48:51,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:48:51,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:48:51,246 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:48:55,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:48:58,601 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 06:48:58,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:48:58,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:48:58,985 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=282826.6666666667, ans=0.0 2023-09-29 06:49:00,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:49:01,686 INFO [train.py:1039] (2/4) Epoch 8, batch 5250, loss[loss=0.2066, simple_loss=0.2779, pruned_loss=0.06761, over 20870.00 frames. ], tot_loss[loss=0.2187, simple_loss=0.285, pruned_loss=0.07622, over 4704585.23 frames. ], batch size: 45, lr: 1.26e-02, grad_scale: 32.0 2023-09-29 06:49:01,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 06:49:03,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:49:04,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:49:08,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:49:08,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:49:10,367 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:49:15,147 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=282893.3333333333, ans=0.2 2023-09-29 06:49:16,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:49:16,880 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=282960.0, ans=0.125 2023-09-29 06:49:18,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:49:19,799 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=282960.0, ans=0.125 2023-09-29 06:49:21,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:49:21,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 06:49:21,632 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=282960.0, ans=0.125 2023-09-29 06:49:24,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 06:49:24,842 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:49:26,889 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:49:31,367 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=282960.0, ans=0.125 2023-09-29 06:49:33,312 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.36 vs. limit=15.0 2023-09-29 06:49:47,333 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.55 vs. limit=10.0 2023-09-29 06:49:53,860 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=283093.3333333333, ans=0.125 2023-09-29 06:50:12,469 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=283160.0, ans=0.125 2023-09-29 06:50:16,296 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.699e+02 2.079e+02 2.357e+02 2.633e+02 5.213e+02, threshold=4.714e+02, percent-clipped=2.0 2023-09-29 06:50:16,339 INFO [train.py:1039] (2/4) Epoch 8, batch 5300, loss[loss=0.2389, simple_loss=0.3041, pruned_loss=0.08689, over 24063.00 frames. ], tot_loss[loss=0.2177, simple_loss=0.2839, pruned_loss=0.07581, over 4702941.48 frames. ], batch size: 80, lr: 1.25e-02, grad_scale: 32.0 2023-09-29 06:50:27,762 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=283226.6666666667, ans=0.1 2023-09-29 06:50:29,226 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=283293.3333333333, ans=0.125 2023-09-29 06:50:31,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:50:31,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 06:50:31,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 06:50:31,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:50:32,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:50:32,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:50:32,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:50:32,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:50:32,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:50:32,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:50:32,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 06:50:33,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:50:33,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 06:50:33,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 06:50:33,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 06:50:34,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 06:50:34,059 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 06:50:34,191 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 06:50:34,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:50:34,870 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:50:34,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:50:35,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:50:35,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:50:35,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:50:35,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:50:35,834 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:50:35,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:50:36,016 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:50:36,023 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:50:36,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:50:36,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:50:37,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 06:50:37,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:50:38,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:50:38,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 06:50:38,084 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 06:50:38,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 06:50:38,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:50:38,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 06:50:38,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 06:50:38,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 06:50:39,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:50:39,550 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:50:39,710 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 06:50:39,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 06:50:39,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 06:50:40,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:50:40,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 06:50:40,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 06:50:40,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 06:50:41,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 06:50:50,881 INFO [train.py:1039] (2/4) Epoch 9, batch 0, loss[loss=0.2201, simple_loss=0.2861, pruned_loss=0.07705, over 23472.00 frames. ], tot_loss[loss=0.2201, simple_loss=0.2861, pruned_loss=0.07705, over 23472.00 frames. ], batch size: 106, lr: 1.19e-02, grad_scale: 32.0 2023-09-29 06:50:50,881 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-29 06:51:04,819 INFO [train.py:1071] (2/4) Epoch 9, validation: loss=0.2824, simple_loss=0.2767, pruned_loss=0.144, over 1125622.00 frames. 2023-09-29 06:51:04,820 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21046MB 2023-09-29 06:51:06,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 06:51:06,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:51:08,125 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:51:14,713 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:51:14,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:51:14,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:51:16,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 06:51:17,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 06:51:19,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:51:20,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:51:24,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:51:24,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:51:26,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:51:26,123 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:51:29,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 06:51:30,691 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:51:34,199 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=283373.3333333333, ans=0.0 2023-09-29 06:51:40,457 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:51:40,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:51:42,670 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 06:51:45,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 06:51:47,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:51:48,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:51:51,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:51:56,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:52:02,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 06:52:06,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 06:52:06,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:52:06,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:52:06,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:52:06,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:52:10,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 06:52:13,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:52:13,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:52:17,387 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:52:20,501 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 06:52:24,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:52:24,403 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=283573.3333333333, ans=0.1 2023-09-29 06:52:26,927 INFO [train.py:1039] (2/4) Epoch 9, batch 50, loss[loss=0.207, simple_loss=0.2822, pruned_loss=0.06591, over 24536.00 frames. ], tot_loss[loss=0.2159, simple_loss=0.2859, pruned_loss=0.07292, over 1074202.97 frames. ], batch size: 71, lr: 1.19e-02, grad_scale: 32.0 2023-09-29 06:52:27,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:52:28,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:52:28,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 06:52:30,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 06:52:30,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:52:31,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:52:34,043 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:52:36,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:52:41,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 06:52:41,406 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:52:41,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=283706.6666666667, ans=0.1 2023-09-29 06:52:48,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 06:52:51,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 06:52:52,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 06:52:56,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:52:56,781 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.53 vs. limit=15.0 2023-09-29 06:52:57,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:52:57,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:52:59,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:53:01,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 06:53:01,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 06:53:01,229 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:53:10,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:53:12,562 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:53:12,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 06:53:14,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 06:53:15,749 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:53:17,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:53:17,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 06:53:17,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:53:19,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 06:53:27,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:53:27,369 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:53:28,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:53:30,301 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 2.133e+02 2.436e+02 2.893e+02 4.514e+02, threshold=4.872e+02, percent-clipped=0.0 2023-09-29 06:53:30,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:53:30,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 06:53:34,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 06:53:34,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 06:53:36,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:53:36,358 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 06:53:36,590 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=283906.6666666667, ans=0.2 2023-09-29 06:53:36,642 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=283906.6666666667, ans=0.125 2023-09-29 06:53:37,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:53:39,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:53:39,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 06:53:39,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 06:53:40,878 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 06:53:42,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:53:43,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:53:43,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 06:53:43,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 06:53:44,094 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=283906.6666666667, ans=0.04949747468305833 2023-09-29 06:53:46,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:53:47,501 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:53:48,935 INFO [train.py:1039] (2/4) Epoch 9, batch 100, loss[loss=0.2249, simple_loss=0.2804, pruned_loss=0.08466, over 23451.00 frames. ], tot_loss[loss=0.2154, simple_loss=0.2836, pruned_loss=0.07357, over 1877365.42 frames. ], batch size: 134, lr: 1.19e-02, grad_scale: 16.0 2023-09-29 06:53:49,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 06:53:49,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:53:52,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:53:56,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:54:02,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:54:03,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 06:54:03,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:54:06,879 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:54:06,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:54:06,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:54:08,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:54:08,390 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:54:09,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 06:54:13,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 06:54:14,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:54:14,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:54:14,050 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:54:18,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 06:54:20,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:54:21,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:54:21,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 06:54:22,067 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=284106.6666666667, ans=0.0 2023-09-29 06:54:23,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 06:54:28,341 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 06:54:28,364 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 06:54:30,023 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:54:30,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:54:31,632 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=284106.6666666667, ans=0.125 2023-09-29 06:54:34,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 06:54:38,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:54:39,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:54:41,369 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=284173.3333333333, ans=0.125 2023-09-29 06:54:42,852 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=284173.3333333333, ans=0.125 2023-09-29 06:54:44,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:54:44,209 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 06:54:45,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 06:54:49,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 06:54:51,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:54:51,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:54:54,730 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:54:58,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:55:00,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:55:03,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:55:04,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:55:04,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:55:04,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:55:04,742 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:55:05,099 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=284240.0, ans=0.2 2023-09-29 06:55:06,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 06:55:06,289 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 06:55:06,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:55:08,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:55:08,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:55:08,680 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:55:10,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 06:55:10,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 06:55:11,519 INFO [train.py:1039] (2/4) Epoch 9, batch 150, loss[loss=0.185, simple_loss=0.2538, pruned_loss=0.05812, over 24328.00 frames. ], tot_loss[loss=0.2153, simple_loss=0.2835, pruned_loss=0.07356, over 2516297.95 frames. ], batch size: 56, lr: 1.19e-02, grad_scale: 16.0 2023-09-29 06:55:11,627 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 06:55:11,637 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:55:11,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:55:13,315 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:55:14,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:55:14,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:55:16,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:55:21,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:55:21,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:55:23,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:55:26,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:55:26,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:55:30,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 06:55:30,213 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:55:35,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 06:55:35,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 06:55:35,408 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 06:55:37,128 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:55:37,135 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:55:39,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:55:41,547 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:55:41,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:55:41,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:55:43,797 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:55:43,976 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-29 06:55:45,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:55:51,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:55:56,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:55:58,659 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 06:56:03,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:56:03,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:56:03,409 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:56:06,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:56:08,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:56:08,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:56:10,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:56:11,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 06:56:16,141 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 2.032e+02 2.478e+02 3.173e+02 5.553e+02, threshold=4.955e+02, percent-clipped=3.0 2023-09-29 06:56:16,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:56:16,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:56:17,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:56:17,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:56:20,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:56:22,139 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.07 vs. limit=15.0 2023-09-29 06:56:23,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 06:56:26,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:56:26,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:56:27,829 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:56:28,119 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=284573.3333333333, ans=0.125 2023-09-29 06:56:29,388 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:56:29,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 06:56:30,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:56:30,759 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 06:56:31,721 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=284573.3333333333, ans=0.2 2023-09-29 06:56:34,636 INFO [train.py:1039] (2/4) Epoch 9, batch 200, loss[loss=0.2435, simple_loss=0.2997, pruned_loss=0.09368, over 22723.00 frames. ], tot_loss[loss=0.2178, simple_loss=0.2853, pruned_loss=0.07509, over 3005953.17 frames. ], batch size: 322, lr: 1.19e-02, grad_scale: 16.0 2023-09-29 06:56:36,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:56:39,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:56:39,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:56:42,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 06:56:44,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:56:44,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:56:46,554 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=284640.0, ans=0.04949747468305833 2023-09-29 06:56:47,735 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 06:56:48,063 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=284640.0, ans=0.125 2023-09-29 06:56:49,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 06:56:50,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:56:52,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:56:54,233 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=284706.6666666667, ans=0.0 2023-09-29 06:56:56,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:56:57,511 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:56:57,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:57:14,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:57:14,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:57:16,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:57:17,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:57:19,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 06:57:19,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 06:57:20,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:57:22,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:57:23,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:57:23,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:57:25,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 06:57:25,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 06:57:25,433 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:57:27,977 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.50 vs. limit=15.0 2023-09-29 06:57:28,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:57:31,160 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=284840.0, ans=0.0 2023-09-29 06:57:33,971 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=284840.0, ans=0.0 2023-09-29 06:57:36,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:57:44,142 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:57:44,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:57:52,362 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:57:55,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 06:57:55,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:57:56,919 INFO [train.py:1039] (2/4) Epoch 9, batch 250, loss[loss=0.2218, simple_loss=0.2787, pruned_loss=0.08239, over 23741.00 frames. ], tot_loss[loss=0.2181, simple_loss=0.2853, pruned_loss=0.07548, over 3362643.75 frames. ], batch size: 164, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 06:57:56,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:57:56,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:57:58,455 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:57:58,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 06:58:00,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:58:00,210 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 06:58:01,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:58:05,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:58:06,542 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:58:06,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:58:08,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:58:09,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:58:11,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:58:14,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:58:16,442 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=285040.0, ans=0.125 2023-09-29 06:58:25,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:58:28,455 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:58:28,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:58:34,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 06:58:36,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 06:58:36,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 06:58:37,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:58:37,946 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=285106.6666666667, ans=0.0 2023-09-29 06:58:39,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 06:58:39,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:58:39,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:58:41,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:58:42,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 06:58:42,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:58:45,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:58:46,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:58:46,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:58:46,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:58:48,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:58:48,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:58:50,997 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:58:53,142 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:58:53,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:58:58,435 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 06:59:01,249 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.649e+02 2.006e+02 2.267e+02 2.549e+02 3.617e+02, threshold=4.534e+02, percent-clipped=0.0 2023-09-29 06:59:04,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:59:04,733 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=285240.0, ans=0.125 2023-09-29 06:59:07,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:59:09,328 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=285240.0, ans=0.125 2023-09-29 06:59:12,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:59:12,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:59:16,046 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 06:59:18,291 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.36 vs. limit=15.0 2023-09-29 06:59:18,750 INFO [train.py:1039] (2/4) Epoch 9, batch 300, loss[loss=0.2127, simple_loss=0.2778, pruned_loss=0.07383, over 23659.00 frames. ], tot_loss[loss=0.2167, simple_loss=0.2833, pruned_loss=0.07502, over 3675324.31 frames. ], batch size: 120, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 06:59:18,817 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:59:18,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:59:21,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 06:59:22,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 06:59:22,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:59:22,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 06:59:22,903 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=285306.6666666667, ans=0.125 2023-09-29 06:59:27,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:59:29,561 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:59:34,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:59:34,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 06:59:36,064 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:59:37,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 06:59:37,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 06:59:37,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:59:37,977 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=285373.3333333333, ans=0.0 2023-09-29 06:59:40,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:59:46,317 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:59:46,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 06:59:50,861 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 06:59:52,310 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:59:53,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:59:55,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:59:55,963 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 06:59:55,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:59:57,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:00:00,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:00:02,070 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:00:07,684 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 07:00:07,691 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 07:00:07,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:00:09,627 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=285506.6666666667, ans=0.125 2023-09-29 07:00:11,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:00:12,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 07:00:14,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:00:19,476 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:00:22,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:00:22,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 07:00:22,921 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=285506.6666666667, ans=0.025 2023-09-29 07:00:23,013 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=285506.6666666667, ans=0.125 2023-09-29 07:00:25,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:00:25,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 07:00:27,356 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:00:29,486 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:00:30,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 07:00:30,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 07:00:32,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:00:32,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 07:00:35,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:00:35,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:00:35,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:00:35,727 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=285573.3333333333, ans=0.0 2023-09-29 07:00:37,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:00:37,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:00:41,473 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=285640.0, ans=0.125 2023-09-29 07:00:42,579 INFO [train.py:1039] (2/4) Epoch 9, batch 350, loss[loss=0.221, simple_loss=0.2782, pruned_loss=0.08191, over 23397.00 frames. ], tot_loss[loss=0.2151, simple_loss=0.2815, pruned_loss=0.0744, over 3900415.07 frames. ], batch size: 285, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:00:44,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:00:44,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 07:00:44,776 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.87 vs. limit=6.0 2023-09-29 07:00:47,237 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:00:52,016 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=285640.0, ans=0.1 2023-09-29 07:00:53,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:00:55,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:00:56,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:00:58,633 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 07:01:00,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:01:00,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 07:01:02,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:01:03,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 07:01:05,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:01:08,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 07:01:08,419 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:01:09,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:01:12,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:01:14,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:01:15,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:01:15,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:01:15,259 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:01:16,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:01:16,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:01:16,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 07:01:16,965 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=285773.3333333333, ans=0.0 2023-09-29 07:01:19,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:01:19,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:01:27,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:01:27,907 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:01:27,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:01:29,415 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:01:35,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 07:01:35,599 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:01:39,564 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=285840.0, ans=0.125 2023-09-29 07:01:40,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:01:40,902 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:01:40,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:01:42,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 07:01:45,354 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.686e+02 1.981e+02 2.324e+02 2.780e+02 5.402e+02, threshold=4.648e+02, percent-clipped=1.0 2023-09-29 07:01:45,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:01:46,989 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 07:01:47,152 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 07:01:47,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:01:49,776 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.20 vs. limit=6.0 2023-09-29 07:01:51,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:01:51,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 07:01:54,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:01:58,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:01:59,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:01:59,189 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=285906.6666666667, ans=0.125 2023-09-29 07:02:00,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:02:01,003 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:02:03,922 INFO [train.py:1039] (2/4) Epoch 9, batch 400, loss[loss=0.2032, simple_loss=0.2787, pruned_loss=0.06387, over 24027.00 frames. ], tot_loss[loss=0.2145, simple_loss=0.2814, pruned_loss=0.07383, over 4092869.05 frames. ], batch size: 86, lr: 1.18e-02, grad_scale: 32.0 2023-09-29 07:02:04,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:02:07,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:02:08,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:02:11,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 07:02:11,630 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:02:11,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:02:13,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:02:13,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:02:14,220 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=285973.3333333333, ans=0.2 2023-09-29 07:02:16,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:02:18,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:02:20,108 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 07:02:21,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 07:02:21,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:02:23,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 07:02:25,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:02:28,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:02:28,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:02:28,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 07:02:28,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:02:28,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:02:28,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:02:28,810 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=286040.0, ans=0.125 2023-09-29 07:02:29,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:02:31,553 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 07:02:31,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 07:02:36,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:02:38,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:02:38,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 07:02:39,868 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 07:02:44,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:02:46,522 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:02:52,795 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 07:02:53,273 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.36 vs. limit=12.0 2023-09-29 07:02:58,481 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 07:02:59,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 07:03:01,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:03:02,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:03:02,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 07:03:06,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:03:08,573 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.26 vs. limit=15.0 2023-09-29 07:03:09,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 07:03:11,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:03:14,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:03:16,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 07:03:19,136 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 07:03:19,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 07:03:20,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 07:03:20,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:03:22,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 07:03:22,972 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=286240.0, ans=0.125 2023-09-29 07:03:25,681 INFO [train.py:1039] (2/4) Epoch 9, batch 450, loss[loss=0.2133, simple_loss=0.2923, pruned_loss=0.06712, over 24624.00 frames. ], tot_loss[loss=0.2148, simple_loss=0.2815, pruned_loss=0.07407, over 4222162.42 frames. ], batch size: 68, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:03:25,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:03:25,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:03:25,962 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 07:03:29,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 07:03:29,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:03:31,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:03:32,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:03:32,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 07:03:32,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:03:34,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 07:03:36,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 07:03:44,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:03:46,374 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:03:47,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 07:03:48,613 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=11.54 vs. limit=15.0 2023-09-29 07:03:49,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 07:03:51,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:03:53,270 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer_ff2.min_abs, batch_count=286373.3333333333, ans=0.1 2023-09-29 07:03:53,430 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=286373.3333333333, ans=0.125 2023-09-29 07:03:54,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:03:56,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:03:59,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:03:59,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:04:00,259 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=286440.0, ans=0.0 2023-09-29 07:04:02,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 07:04:03,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 07:04:05,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 07:04:05,911 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:04:07,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:04:07,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:04:10,818 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 07:04:10,834 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 07:04:12,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:04:13,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:04:15,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 07:04:20,355 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 07:04:20,419 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:04:21,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 07:04:23,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 07:04:26,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:04:29,785 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 07:04:29,846 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:04:31,300 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.979e+02 2.168e+02 2.458e+02 3.361e+02, threshold=4.337e+02, percent-clipped=0.0 2023-09-29 07:04:31,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 07:04:33,512 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=286573.3333333333, ans=0.0 2023-09-29 07:04:36,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:04:36,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 07:04:38,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 07:04:39,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:04:43,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:04:46,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:04:48,036 INFO [train.py:1039] (2/4) Epoch 9, batch 500, loss[loss=0.2488, simple_loss=0.2999, pruned_loss=0.09885, over 22719.00 frames. ], tot_loss[loss=0.2144, simple_loss=0.2818, pruned_loss=0.07351, over 4341373.92 frames. ], batch size: 322, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:04:48,181 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:04:48,219 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 07:04:51,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:04:52,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 07:04:52,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:04:52,970 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 07:04:55,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 07:04:55,030 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:04:59,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 07:05:03,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 07:05:04,706 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:05:07,819 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:05:07,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:05:07,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:05:14,696 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=286706.6666666667, ans=0.0 2023-09-29 07:05:17,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:05:19,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 07:05:19,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 07:05:20,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:05:20,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 07:05:20,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 07:05:23,525 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.11 vs. limit=22.5 2023-09-29 07:05:24,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:05:26,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:05:26,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:05:26,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:05:28,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 07:05:31,368 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 07:05:34,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:05:34,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:05:36,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:05:37,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:05:38,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 07:05:40,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 07:05:42,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:05:44,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:05:46,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:05:49,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:05:54,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:05:58,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 07:05:58,479 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:05:58,498 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:06:01,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 07:06:01,753 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 07:06:04,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:06:09,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 07:06:10,900 INFO [train.py:1039] (2/4) Epoch 9, batch 550, loss[loss=0.219, simple_loss=0.2911, pruned_loss=0.07345, over 23429.00 frames. ], tot_loss[loss=0.2162, simple_loss=0.2833, pruned_loss=0.07456, over 4414062.18 frames. ], batch size: 93, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:06:12,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 07:06:12,462 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:06:13,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 07:06:15,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:06:15,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:06:16,011 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=286973.3333333333, ans=0.125 2023-09-29 07:06:17,184 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:06:17,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:06:17,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:06:18,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:06:20,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:06:21,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 07:06:21,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:06:27,838 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:06:27,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:06:30,406 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=12.11 vs. limit=15.0 2023-09-29 07:06:31,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:06:31,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:06:36,748 WARNING [train.py:1197] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 07:06:38,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 07:06:39,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:06:43,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:06:44,815 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:06:46,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:06:48,638 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:06:48,647 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 07:06:51,516 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:06:53,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 07:06:57,378 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:06:57,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:06:57,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:06:59,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:07:00,684 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=287173.3333333333, ans=0.0 2023-09-29 07:07:01,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 07:07:02,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 07:07:03,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:07:03,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:07:04,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:07:04,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:07:06,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:07:10,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:07:11,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:07:11,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:07:13,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 07:07:14,834 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 2.039e+02 2.212e+02 2.496e+02 3.392e+02, threshold=4.424e+02, percent-clipped=0.0 2023-09-29 07:07:14,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 07:07:17,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:07:17,238 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 07:07:18,704 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:07:20,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 07:07:20,377 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 07:07:28,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 07:07:29,956 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=287306.6666666667, ans=0.125 2023-09-29 07:07:31,076 INFO [train.py:1039] (2/4) Epoch 9, batch 600, loss[loss=0.226, simple_loss=0.3022, pruned_loss=0.07492, over 23643.00 frames. ], tot_loss[loss=0.2161, simple_loss=0.2836, pruned_loss=0.07429, over 4483308.83 frames. ], batch size: 85, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:07:31,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 07:07:34,343 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:07:34,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 07:07:34,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:07:41,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:07:42,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 07:07:45,004 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 07:07:47,867 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 07:07:48,066 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=287373.3333333333, ans=0.1 2023-09-29 07:07:50,157 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.72 vs. limit=15.0 2023-09-29 07:07:50,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:07:52,976 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:07:54,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 07:07:54,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:08:01,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 07:08:05,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:08:05,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:08:07,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:08:11,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:08:11,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:08:13,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:08:22,113 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:08:25,701 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:08:25,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:08:25,730 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:08:33,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 07:08:38,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 07:08:38,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:08:42,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 07:08:44,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:08:46,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 07:08:46,199 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:08:46,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:08:48,074 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=287573.3333333333, ans=0.125 2023-09-29 07:08:50,301 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=287573.3333333333, ans=0.09899494936611666 2023-09-29 07:08:52,981 INFO [train.py:1039] (2/4) Epoch 9, batch 650, loss[loss=0.2095, simple_loss=0.2494, pruned_loss=0.08476, over 19161.00 frames. ], tot_loss[loss=0.2143, simple_loss=0.2815, pruned_loss=0.07357, over 4524013.91 frames. ], batch size: 388, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:08:53,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 07:08:55,209 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 07:08:58,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:08:58,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:08:59,641 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.05 vs. limit=15.0 2023-09-29 07:09:00,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:09:04,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 07:09:05,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:09:11,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:09:11,735 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:09:14,864 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:09:19,361 WARNING [train.py:1197] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 07:09:20,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:09:21,050 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:09:22,761 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=287706.6666666667, ans=0.0 2023-09-29 07:09:24,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:09:25,637 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.40 vs. limit=15.0 2023-09-29 07:09:26,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 07:09:26,512 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=287773.3333333333, ans=0.125 2023-09-29 07:09:29,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:09:29,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:09:31,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 07:09:32,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:09:34,431 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:09:35,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 07:09:36,020 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 07:09:36,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:09:36,068 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:09:37,025 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.93 vs. limit=10.0 2023-09-29 07:09:39,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:09:42,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:09:42,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:09:42,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 07:09:44,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 07:09:45,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:09:45,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:09:47,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 07:09:47,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:09:47,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 07:09:49,115 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 07:09:50,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 07:09:50,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:09:50,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:09:50,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:09:50,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:09:53,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:09:58,843 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.700e+02 2.023e+02 2.249e+02 2.557e+02 3.525e+02, threshold=4.498e+02, percent-clipped=0.0 2023-09-29 07:10:01,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:10:01,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:10:03,265 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:10:05,072 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=287906.6666666667, ans=0.125 2023-09-29 07:10:06,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:10:07,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 07:10:07,788 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:10:15,912 INFO [train.py:1039] (2/4) Epoch 9, batch 700, loss[loss=0.2178, simple_loss=0.2985, pruned_loss=0.06859, over 24550.00 frames. ], tot_loss[loss=0.2142, simple_loss=0.2809, pruned_loss=0.07376, over 4562674.68 frames. ], batch size: 71, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:10:15,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 07:10:15,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:10:16,044 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:10:16,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:10:20,785 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 07:10:22,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 07:10:25,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 07:10:25,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:10:28,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:10:30,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 07:10:33,902 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:10:38,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:10:39,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:10:39,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 07:10:41,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:10:44,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:10:46,617 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=288040.0, ans=0.0 2023-09-29 07:10:47,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 07:10:47,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:10:49,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 07:10:52,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 07:10:55,601 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:10:57,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:10:58,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:11:03,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:11:05,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 07:11:10,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:11:10,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 07:11:10,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 07:11:13,079 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=288173.3333333333, ans=10.0 2023-09-29 07:11:15,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:11:15,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:11:19,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:11:25,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:11:25,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 07:11:29,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 07:11:29,892 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 07:11:30,651 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=13.76 vs. limit=15.0 2023-09-29 07:11:31,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:11:33,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:11:34,663 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:11:35,077 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=288240.0, ans=0.125 2023-09-29 07:11:36,886 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:11:36,894 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 07:11:38,243 INFO [train.py:1039] (2/4) Epoch 9, batch 750, loss[loss=0.2131, simple_loss=0.2551, pruned_loss=0.08551, over 19434.00 frames. ], tot_loss[loss=0.2128, simple_loss=0.2802, pruned_loss=0.07269, over 4607946.50 frames. ], batch size: 388, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:11:41,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 07:11:41,568 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 07:11:41,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 07:11:43,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 07:11:43,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 07:11:45,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:11:46,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 07:11:48,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:11:48,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:11:51,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:11:52,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:11:52,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 07:11:52,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:11:55,792 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:11:57,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 07:11:57,572 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=288373.3333333333, ans=0.0 2023-09-29 07:11:58,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:11:59,263 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:12:00,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:12:02,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:12:02,162 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 07:12:02,961 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.53 vs. limit=15.0 2023-09-29 07:12:03,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 07:12:03,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:12:05,482 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:12:09,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 07:12:09,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 07:12:09,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:12:12,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 07:12:12,476 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 07:12:13,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 07:12:14,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:12:14,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 07:12:16,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 07:12:24,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:12:24,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:12:24,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:12:24,766 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=288440.0, ans=0.125 2023-09-29 07:12:27,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:12:27,938 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=288506.6666666667, ans=0.0 2023-09-29 07:12:29,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:12:29,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 07:12:29,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:12:30,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 07:12:32,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:12:36,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:12:36,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 07:12:38,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:12:40,122 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:12:44,797 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.597e+02 1.964e+02 2.224e+02 2.470e+02 4.454e+02, threshold=4.447e+02, percent-clipped=0.0 2023-09-29 07:12:44,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:12:45,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:12:46,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:12:48,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 07:12:48,999 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.36 vs. limit=15.0 2023-09-29 07:12:52,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 07:12:53,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:12:53,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:12:58,690 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:12:58,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:13:01,766 INFO [train.py:1039] (2/4) Epoch 9, batch 800, loss[loss=0.2248, simple_loss=0.305, pruned_loss=0.07233, over 24429.00 frames. ], tot_loss[loss=0.2134, simple_loss=0.2809, pruned_loss=0.07297, over 4635306.64 frames. ], batch size: 69, lr: 1.18e-02, grad_scale: 32.0 2023-09-29 07:13:01,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:13:01,942 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 07:13:11,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:13:11,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:13:12,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:13:12,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:13:14,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:13:14,293 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:13:15,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:13:18,514 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=11.47 vs. limit=15.0 2023-09-29 07:13:19,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:13:21,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:13:24,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 07:13:24,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:13:25,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:13:25,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:13:25,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:13:25,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 07:13:27,982 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:13:28,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 07:13:31,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:13:31,434 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=288706.6666666667, ans=0.04949747468305833 2023-09-29 07:13:35,251 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:13:38,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:13:38,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:13:39,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:13:39,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:13:43,298 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=288773.3333333333, ans=0.2 2023-09-29 07:13:44,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:13:44,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:13:45,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 07:13:47,557 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 07:13:47,589 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 07:13:47,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 07:13:47,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:13:50,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:13:50,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:13:50,900 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=288840.0, ans=0.0 2023-09-29 07:13:55,995 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 07:13:56,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 07:13:57,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 07:13:59,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 07:13:59,697 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=288840.0, ans=0.2 2023-09-29 07:14:01,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:14:07,191 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:14:07,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 07:14:08,862 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:14:09,738 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=11.45 vs. limit=15.0 2023-09-29 07:14:13,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 07:14:21,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:14:22,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:14:24,354 INFO [train.py:1039] (2/4) Epoch 9, batch 850, loss[loss=0.233, simple_loss=0.2872, pruned_loss=0.08941, over 23597.00 frames. ], tot_loss[loss=0.2142, simple_loss=0.282, pruned_loss=0.07322, over 4653799.94 frames. ], batch size: 256, lr: 1.18e-02, grad_scale: 32.0 2023-09-29 07:14:24,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 07:14:24,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:14:24,642 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:14:26,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 07:14:27,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:14:29,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:14:32,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:14:34,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:14:35,818 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:14:37,384 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 07:14:37,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 07:14:37,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 07:14:39,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:14:39,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:14:41,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:14:43,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:14:43,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 07:14:48,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:14:48,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:14:48,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 07:14:51,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 07:14:53,567 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.92 vs. limit=15.0 2023-09-29 07:14:56,019 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:14:56,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 07:14:59,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 07:15:00,804 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 07:15:02,481 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 07:15:04,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:15:04,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:15:04,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 07:15:06,246 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:15:07,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:15:09,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 07:15:10,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:15:12,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:15:12,948 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:15:12,979 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 07:15:14,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:15:16,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 07:15:18,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 07:15:19,974 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=289173.3333333333, ans=0.125 2023-09-29 07:15:22,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:15:22,606 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:15:24,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:15:24,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:15:25,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:15:28,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:15:30,142 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 2.042e+02 2.331e+02 2.770e+02 4.715e+02, threshold=4.662e+02, percent-clipped=1.0 2023-09-29 07:15:30,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:15:31,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 07:15:31,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:15:33,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 07:15:40,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 07:15:40,627 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=289240.0, ans=0.125 2023-09-29 07:15:41,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:15:42,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 07:15:42,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:15:42,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:15:45,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 07:15:47,709 INFO [train.py:1039] (2/4) Epoch 9, batch 900, loss[loss=0.2901, simple_loss=0.3331, pruned_loss=0.1235, over 19524.00 frames. ], tot_loss[loss=0.2164, simple_loss=0.2839, pruned_loss=0.07447, over 4653403.09 frames. ], batch size: 388, lr: 1.18e-02, grad_scale: 32.0 2023-09-29 07:15:53,024 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:15:57,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:15:57,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 07:16:00,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:16:02,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 07:16:03,942 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 07:16:04,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:16:04,097 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:16:04,158 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:16:05,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:16:16,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:16:17,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:16:17,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 07:16:20,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:16:25,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 07:16:27,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:16:31,673 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.93 vs. limit=10.0 2023-09-29 07:16:32,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 07:16:32,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:16:32,449 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 07:16:33,912 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 07:16:41,487 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 07:16:41,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:16:41,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 07:16:49,932 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:16:49,948 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:16:51,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 07:16:51,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:16:55,230 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 07:16:57,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:16:57,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:16:59,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:17:00,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:17:02,727 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 07:17:02,791 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 07:17:05,693 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 07:17:05,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 07:17:07,381 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:17:10,224 INFO [train.py:1039] (2/4) Epoch 9, batch 950, loss[loss=0.2082, simple_loss=0.2614, pruned_loss=0.07745, over 23533.00 frames. ], tot_loss[loss=0.2182, simple_loss=0.2851, pruned_loss=0.07559, over 4643994.64 frames. ], batch size: 256, lr: 1.18e-02, grad_scale: 32.0 2023-09-29 07:17:11,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 07:17:16,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:17:20,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:17:21,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:17:21,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 07:17:24,760 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 07:17:28,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:17:29,936 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:17:31,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:17:31,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:17:32,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 07:17:33,626 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 07:17:35,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:17:37,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 07:17:37,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:17:41,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:17:41,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:17:41,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:17:43,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 07:17:43,699 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 07:17:45,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:17:46,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:17:52,788 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:17:52,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:17:56,020 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 07:17:58,909 WARNING [train.py:1197] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 07:17:58,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 07:18:00,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:18:00,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:18:00,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:18:07,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 07:18:07,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:18:09,520 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=289840.0, ans=0.0 2023-09-29 07:18:10,660 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:18:12,126 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:18:12,152 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 07:18:12,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:18:12,187 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:18:13,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 07:18:15,621 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=289906.6666666667, ans=0.125 2023-09-29 07:18:16,759 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.662e+02 1.953e+02 2.303e+02 2.846e+02 4.844e+02, threshold=4.606e+02, percent-clipped=1.0 2023-09-29 07:18:16,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:18:20,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:18:24,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:18:26,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 07:18:26,226 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 07:18:31,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:18:32,661 INFO [train.py:1039] (2/4) Epoch 9, batch 1000, loss[loss=0.2142, simple_loss=0.2791, pruned_loss=0.07471, over 23422.00 frames. ], tot_loss[loss=0.2162, simple_loss=0.2831, pruned_loss=0.07462, over 4650735.37 frames. ], batch size: 120, lr: 1.17e-02, grad_scale: 32.0 2023-09-29 07:18:36,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 07:18:37,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:18:43,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:18:45,155 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 07:18:45,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 07:18:49,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:18:49,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:18:51,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:18:53,368 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=290040.0, ans=0.125 2023-09-29 07:18:54,576 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 07:18:57,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 07:18:57,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 07:18:59,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:19:00,886 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 07:19:02,507 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 07:19:02,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 07:19:02,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:19:04,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:19:05,304 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=290106.6666666667, ans=0.125 2023-09-29 07:19:15,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:19:16,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:19:16,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:19:18,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:19:18,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 07:19:18,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:19:20,318 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:19:20,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:19:21,857 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 07:19:24,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 07:19:25,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 07:19:28,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 07:19:29,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:19:30,644 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.47 vs. limit=15.0 2023-09-29 07:19:33,330 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=290173.3333333333, ans=0.125 2023-09-29 07:19:34,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:19:34,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:19:34,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:19:38,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:19:39,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 07:19:42,558 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.05 vs. limit=15.0 2023-09-29 07:19:43,072 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:19:43,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 07:19:43,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 07:19:46,281 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:19:46,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:19:48,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:19:51,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 07:19:53,430 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:19:56,371 INFO [train.py:1039] (2/4) Epoch 9, batch 1050, loss[loss=0.2062, simple_loss=0.2851, pruned_loss=0.06362, over 24465.00 frames. ], tot_loss[loss=0.2145, simple_loss=0.2815, pruned_loss=0.07373, over 4665563.26 frames. ], batch size: 66, lr: 1.17e-02, grad_scale: 16.0 2023-09-29 07:19:56,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:19:58,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:20:01,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 07:20:01,209 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:20:01,360 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=290306.6666666667, ans=0.0 2023-09-29 07:20:02,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:20:05,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:20:05,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:20:08,197 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.94 vs. limit=15.0 2023-09-29 07:20:09,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:20:10,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:20:10,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 07:20:12,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:20:13,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 07:20:14,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:20:14,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 07:20:17,789 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:20:17,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 07:20:17,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 07:20:24,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:20:24,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 07:20:26,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:20:27,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 07:20:27,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 07:20:29,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:20:31,266 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=290440.0, ans=0.125 2023-09-29 07:20:32,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 07:20:36,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 07:20:36,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:20:41,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 07:20:43,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 07:20:43,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:20:43,241 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:20:45,557 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.85 vs. limit=15.0 2023-09-29 07:20:48,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:20:53,819 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 07:20:54,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 07:20:56,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 07:20:56,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:20:56,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:20:57,737 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 07:21:00,099 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=290506.6666666667, ans=0.125 2023-09-29 07:21:01,883 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.85 vs. limit=15.0 2023-09-29 07:21:02,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:21:04,054 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 2.054e+02 2.289e+02 2.734e+02 4.286e+02, threshold=4.577e+02, percent-clipped=0.0 2023-09-29 07:21:04,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:21:04,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:21:05,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:21:05,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:21:10,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:21:10,493 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 07:21:12,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:21:12,168 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 07:21:12,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 07:21:12,438 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_ff3.min_abs, batch_count=290573.3333333333, ans=0.2 2023-09-29 07:21:13,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:21:16,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:21:18,251 INFO [train.py:1039] (2/4) Epoch 9, batch 1100, loss[loss=0.2247, simple_loss=0.2835, pruned_loss=0.08295, over 23751.00 frames. ], tot_loss[loss=0.2135, simple_loss=0.2808, pruned_loss=0.07313, over 4684148.57 frames. ], batch size: 164, lr: 1.17e-02, grad_scale: 16.0 2023-09-29 07:21:18,677 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=290640.0, ans=0.1 2023-09-29 07:21:23,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:21:29,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 07:21:32,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:21:32,287 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:21:32,649 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=290640.0, ans=0.2 2023-09-29 07:21:33,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 07:21:33,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:21:36,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 07:21:40,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:21:43,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:21:43,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 07:21:45,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 07:21:45,272 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:21:45,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:21:48,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:21:50,143 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 07:21:54,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:21:57,290 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=290773.3333333333, ans=0.2 2023-09-29 07:21:58,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 07:21:59,969 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 07:22:00,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:22:04,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:22:05,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 07:22:06,907 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:22:07,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 07:22:08,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:22:08,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:22:08,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:22:10,580 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:22:10,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 07:22:11,060 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=290840.0, ans=0.125 2023-09-29 07:22:11,087 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=290840.0, ans=0.125 2023-09-29 07:22:17,004 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:22:17,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 07:22:19,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:22:24,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 07:22:27,593 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 07:22:27,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 07:22:29,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:22:32,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:22:32,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:22:34,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 07:22:35,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:22:37,199 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:22:37,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 07:22:40,056 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:22:40,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 07:22:41,379 INFO [train.py:1039] (2/4) Epoch 9, batch 1150, loss[loss=0.2829, simple_loss=0.3252, pruned_loss=0.1203, over 19386.00 frames. ], tot_loss[loss=0.2137, simple_loss=0.2813, pruned_loss=0.07304, over 4694318.03 frames. ], batch size: 388, lr: 1.17e-02, grad_scale: 16.0 2023-09-29 07:22:41,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:22:41,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 07:22:41,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:22:41,809 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=290973.3333333333, ans=0.125 2023-09-29 07:22:42,521 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.99 vs. limit=15.0 2023-09-29 07:22:48,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:22:49,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:22:52,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:22:52,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:22:52,919 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 07:22:52,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:22:56,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 07:22:56,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:22:57,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 07:23:00,037 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=10.95 vs. limit=15.0 2023-09-29 07:23:03,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 07:23:05,507 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:23:10,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:23:10,726 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:23:10,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 07:23:10,799 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 07:23:10,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:23:16,765 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=291106.6666666667, ans=0.125 2023-09-29 07:23:17,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 07:23:18,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:23:20,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:23:23,210 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=291106.6666666667, ans=0.0 2023-09-29 07:23:30,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:23:36,942 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:23:38,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 07:23:38,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:23:38,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:23:45,440 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 07:23:47,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:23:48,998 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.654e+02 2.090e+02 2.381e+02 2.869e+02 4.983e+02, threshold=4.763e+02, percent-clipped=2.0 2023-09-29 07:23:53,465 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=291240.0, ans=0.1 2023-09-29 07:23:54,736 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 07:23:57,817 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:23:59,240 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 07:23:59,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 07:24:00,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 07:24:02,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:24:03,993 INFO [train.py:1039] (2/4) Epoch 9, batch 1200, loss[loss=0.2158, simple_loss=0.2817, pruned_loss=0.07495, over 23294.00 frames. ], tot_loss[loss=0.2137, simple_loss=0.2816, pruned_loss=0.07288, over 4705192.65 frames. ], batch size: 105, lr: 1.17e-02, grad_scale: 32.0 2023-09-29 07:24:05,891 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=291306.6666666667, ans=0.125 2023-09-29 07:24:07,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:24:07,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:24:10,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:24:10,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:24:10,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:24:11,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:24:15,354 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 07:24:15,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:24:16,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:24:20,152 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 07:24:24,323 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 07:24:26,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:24:29,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:24:31,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:24:31,861 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.46 vs. limit=22.5 2023-09-29 07:24:32,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:24:32,861 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 07:24:34,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:24:37,618 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=291440.0, ans=0.0 2023-09-29 07:24:43,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 07:24:43,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:24:43,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 07:24:44,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:24:50,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 07:24:53,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 07:24:54,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:24:54,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:24:58,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:24:58,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 07:25:00,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:25:00,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:25:02,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:25:02,204 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 07:25:02,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 07:25:02,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 07:25:02,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 07:25:05,441 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:25:05,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:25:10,106 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 07:25:13,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:25:13,376 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=291573.3333333333, ans=0.125 2023-09-29 07:25:16,535 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=291573.3333333333, ans=0.0 2023-09-29 07:25:17,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 07:25:19,753 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=291573.3333333333, ans=0.05 2023-09-29 07:25:20,912 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 07:25:21,232 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=291573.3333333333, ans=0.125 2023-09-29 07:25:22,527 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:25:26,015 INFO [train.py:1039] (2/4) Epoch 9, batch 1250, loss[loss=0.1902, simple_loss=0.2743, pruned_loss=0.05309, over 24312.00 frames. ], tot_loss[loss=0.2141, simple_loss=0.2821, pruned_loss=0.07301, over 4710935.36 frames. ], batch size: 74, lr: 1.17e-02, grad_scale: 32.0 2023-09-29 07:25:26,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 07:25:27,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:25:28,011 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=291640.0, ans=0.125 2023-09-29 07:25:29,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:25:32,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 07:25:37,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:25:38,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:25:39,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 07:25:41,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:25:42,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 07:25:47,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 07:25:47,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:25:49,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 07:25:49,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:25:51,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 07:25:53,395 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=291706.6666666667, ans=0.125 2023-09-29 07:25:55,267 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.28 vs. limit=22.5 2023-09-29 07:25:56,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 07:25:56,139 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 07:25:56,148 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:25:57,749 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:25:57,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:26:03,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:26:03,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 07:26:10,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 07:26:10,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 07:26:13,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:26:14,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 07:26:15,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:26:15,567 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 07:26:15,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:26:15,612 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:26:20,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:26:20,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:26:21,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:26:24,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 07:26:24,835 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 07:26:24,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 07:26:25,472 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.70 vs. limit=15.0 2023-09-29 07:26:27,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:26:29,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 07:26:29,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:26:31,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 07:26:31,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:26:33,088 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.648e+02 1.977e+02 2.179e+02 2.388e+02 3.416e+02, threshold=4.359e+02, percent-clipped=0.0 2023-09-29 07:26:33,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 07:26:33,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 07:26:33,624 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:26:34,756 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 07:26:34,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 07:26:34,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:26:35,222 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=291906.6666666667, ans=0.1 2023-09-29 07:26:36,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 07:26:39,954 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:26:42,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:26:44,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:26:45,739 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 07:26:48,732 INFO [train.py:1039] (2/4) Epoch 9, batch 1300, loss[loss=0.2178, simple_loss=0.2984, pruned_loss=0.06855, over 23933.00 frames. ], tot_loss[loss=0.2148, simple_loss=0.2828, pruned_loss=0.07341, over 4708994.19 frames. ], batch size: 86, lr: 1.17e-02, grad_scale: 32.0 2023-09-29 07:26:48,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:26:48,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 07:26:53,407 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:26:55,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 07:26:56,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:26:58,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:26:59,910 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:26:59,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 07:27:05,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:27:06,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 07:27:08,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 07:27:13,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 07:27:16,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:27:16,982 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:27:19,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:27:22,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:27:23,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 07:27:23,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 07:27:23,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 07:27:31,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:27:31,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 07:27:31,516 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=292106.6666666667, ans=0.125 2023-09-29 07:27:32,712 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 07:27:34,243 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 07:27:35,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:27:37,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:27:38,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 07:27:41,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:27:41,092 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 07:27:41,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:27:44,610 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:27:44,614 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:27:48,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 07:27:49,815 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 07:27:51,319 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 07:27:55,410 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=292240.0, ans=0.125 2023-09-29 07:27:56,484 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:27:59,597 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 07:28:01,133 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:28:02,804 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=292240.0, ans=0.07 2023-09-29 07:28:10,009 INFO [train.py:1039] (2/4) Epoch 9, batch 1350, loss[loss=0.1853, simple_loss=0.2571, pruned_loss=0.05678, over 24576.00 frames. ], tot_loss[loss=0.214, simple_loss=0.2817, pruned_loss=0.07315, over 4708640.41 frames. ], batch size: 60, lr: 1.17e-02, grad_scale: 32.0 2023-09-29 07:28:10,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 07:28:13,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:28:14,880 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.56 vs. limit=12.0 2023-09-29 07:28:16,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:28:21,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:28:21,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:28:24,313 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.08 vs. limit=22.5 2023-09-29 07:28:25,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:28:25,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:28:29,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:28:31,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 07:28:33,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 07:28:33,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:28:36,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 07:28:37,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:28:39,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:28:39,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 07:28:41,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 07:28:42,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 07:28:44,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:28:44,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 07:28:46,818 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=292440.0, ans=0.0 2023-09-29 07:28:56,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:29:04,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:29:04,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:29:06,863 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 07:29:10,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:29:10,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 07:29:10,426 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=292506.6666666667, ans=0.125 2023-09-29 07:29:11,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 07:29:13,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:29:14,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:29:18,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 07:29:19,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:29:21,320 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.985e+02 2.227e+02 2.562e+02 4.004e+02, threshold=4.454e+02, percent-clipped=0.0 2023-09-29 07:29:25,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 07:29:28,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 07:29:33,068 INFO [train.py:1039] (2/4) Epoch 9, batch 1400, loss[loss=0.188, simple_loss=0.2687, pruned_loss=0.05362, over 24655.00 frames. ], tot_loss[loss=0.2135, simple_loss=0.2807, pruned_loss=0.07316, over 4715612.55 frames. ], batch size: 65, lr: 1.17e-02, grad_scale: 8.0 2023-09-29 07:29:33,607 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=292640.0, ans=0.0 2023-09-29 07:29:34,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 07:29:36,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:29:39,535 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:29:39,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:29:40,049 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=292640.0, ans=0.125 2023-09-29 07:29:46,121 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 07:29:46,325 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 07:29:46,515 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=292640.0, ans=0.1 2023-09-29 07:29:49,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=292706.6666666667, ans=0.07 2023-09-29 07:29:56,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:29:58,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:30:00,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:30:01,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 07:30:04,815 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:30:07,217 WARNING [train.py:1197] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 07:30:08,962 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.min_positive, batch_count=292773.3333333333, ans=0.025 2023-09-29 07:30:08,968 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=292773.3333333333, ans=0.125 2023-09-29 07:30:12,041 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=292773.3333333333, ans=0.0 2023-09-29 07:30:16,158 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.35 vs. limit=22.5 2023-09-29 07:30:16,877 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:30:18,266 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:30:21,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 07:30:22,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 07:30:23,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:30:24,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:30:24,599 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:30:26,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:30:26,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:30:26,902 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:30:28,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 07:30:28,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:30:32,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:30:35,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:30:36,105 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.07 vs. limit=15.0 2023-09-29 07:30:37,088 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=292840.0, ans=0.0 2023-09-29 07:30:42,013 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 07:30:43,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 07:30:43,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:30:45,343 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=292906.6666666667, ans=0.125 2023-09-29 07:30:46,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 07:30:48,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:30:50,086 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:30:53,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 07:30:56,704 INFO [train.py:1039] (2/4) Epoch 9, batch 1450, loss[loss=0.21, simple_loss=0.2897, pruned_loss=0.06515, over 24638.00 frames. ], tot_loss[loss=0.2125, simple_loss=0.2791, pruned_loss=0.07301, over 4702873.63 frames. ], batch size: 68, lr: 1.17e-02, grad_scale: 8.0 2023-09-29 07:30:58,246 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:30:58,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:30:58,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 07:31:02,053 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten.whitening_limit, batch_count=292973.3333333333, ans=15.0 2023-09-29 07:31:03,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:31:05,033 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 07:31:05,280 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=292973.3333333333, ans=0.125 2023-09-29 07:31:08,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:31:08,097 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 07:31:09,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 07:31:09,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 07:31:11,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:31:11,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:31:11,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 07:31:13,470 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:31:14,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 07:31:16,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 07:31:16,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:31:16,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:31:19,592 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:31:22,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:31:25,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:31:25,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:31:29,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:31:29,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:31:32,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:31:32,366 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:31:32,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:31:32,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:31:36,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 07:31:39,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:31:44,375 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 07:31:45,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:31:48,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:31:49,434 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:31:49,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 07:31:54,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:31:55,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 07:31:55,963 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=293173.3333333333, ans=0.125 2023-09-29 07:31:57,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 07:31:57,340 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:32:00,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:32:00,530 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:32:02,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 07:32:05,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 07:32:05,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 07:32:07,319 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 1.945e+02 2.236e+02 2.452e+02 4.458e+02, threshold=4.473e+02, percent-clipped=1.0 2023-09-29 07:32:07,540 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:32:09,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 07:32:11,506 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=293240.0, ans=0.2 2023-09-29 07:32:19,972 INFO [train.py:1039] (2/4) Epoch 9, batch 1500, loss[loss=0.2869, simple_loss=0.3292, pruned_loss=0.1223, over 19589.00 frames. ], tot_loss[loss=0.2137, simple_loss=0.2802, pruned_loss=0.0736, over 4704341.49 frames. ], batch size: 388, lr: 1.17e-02, grad_scale: 8.0 2023-09-29 07:32:23,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 07:32:23,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 07:32:23,150 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:32:23,460 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=293306.6666666667, ans=0.0 2023-09-29 07:32:24,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:32:24,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:32:24,975 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=293306.6666666667, ans=0.125 2023-09-29 07:32:30,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:32:30,635 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 07:32:32,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 07:32:33,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 07:32:33,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:32:33,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:32:36,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:32:38,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:32:43,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:32:44,645 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 07:32:44,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 07:32:46,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:32:46,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:32:47,056 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.49 vs. limit=15.0 2023-09-29 07:32:47,342 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.90 vs. limit=15.0 2023-09-29 07:32:49,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 07:32:54,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 07:32:57,940 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:32:59,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 07:33:03,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 07:33:06,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:33:06,294 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:33:07,675 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:33:07,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 07:33:07,928 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:33:09,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:33:09,568 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 07:33:10,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:33:14,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:33:14,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 07:33:19,698 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:33:22,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 07:33:25,377 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.24 vs. limit=15.0 2023-09-29 07:33:27,642 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 07:33:28,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:33:28,372 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 07:33:28,868 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.02 vs. limit=15.0 2023-09-29 07:33:29,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:33:30,651 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.31 vs. limit=22.5 2023-09-29 07:33:31,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:33:31,434 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 07:33:31,568 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 07:33:35,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 07:33:38,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:33:41,277 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=293573.3333333333, ans=10.0 2023-09-29 07:33:42,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:33:42,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:33:42,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:33:42,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:33:44,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:33:45,483 INFO [train.py:1039] (2/4) Epoch 9, batch 1550, loss[loss=0.2041, simple_loss=0.2885, pruned_loss=0.0598, over 24450.00 frames. ], tot_loss[loss=0.2133, simple_loss=0.2806, pruned_loss=0.07299, over 4713528.07 frames. ], batch size: 69, lr: 1.17e-02, grad_scale: 8.0 2023-09-29 07:33:47,081 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 07:33:47,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 07:33:48,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:33:48,803 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 07:33:48,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 07:33:50,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:33:52,476 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:33:52,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:33:52,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:33:54,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:33:54,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:33:56,483 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.70 vs. limit=15.0 2023-09-29 07:33:57,954 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 07:33:59,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:33:59,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:34:00,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 07:34:02,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:34:02,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 07:34:04,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:34:04,536 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 07:34:06,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 07:34:06,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 07:34:07,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:34:09,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:34:12,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:34:13,570 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.22 vs. limit=12.0 2023-09-29 07:34:15,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 07:34:15,765 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 07:34:22,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:34:26,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:34:28,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:34:28,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:34:28,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 07:34:33,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 07:34:35,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:34:38,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:34:40,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:34:40,374 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=293840.0, ans=0.1 2023-09-29 07:34:41,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:34:41,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 07:34:41,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:34:43,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:34:43,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:34:45,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 07:34:45,398 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 07:34:48,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:34:48,893 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=293840.0, ans=0.0 2023-09-29 07:34:55,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 07:34:56,634 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.590e+02 1.971e+02 2.191e+02 2.523e+02 4.378e+02, threshold=4.383e+02, percent-clipped=0.0 2023-09-29 07:34:58,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:35:00,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:35:01,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 07:35:01,879 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=293906.6666666667, ans=0.0 2023-09-29 07:35:03,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:35:04,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:35:04,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 07:35:04,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:35:06,220 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.48 vs. limit=6.0 2023-09-29 07:35:06,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:35:07,257 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.76 vs. limit=15.0 2023-09-29 07:35:08,742 INFO [train.py:1039] (2/4) Epoch 9, batch 1600, loss[loss=0.2184, simple_loss=0.2962, pruned_loss=0.07028, over 23944.00 frames. ], tot_loss[loss=0.2151, simple_loss=0.2819, pruned_loss=0.07412, over 4706619.05 frames. ], batch size: 80, lr: 1.17e-02, grad_scale: 16.0 2023-09-29 07:35:10,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:35:12,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 07:35:13,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 07:35:15,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 07:35:16,845 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:35:19,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 07:35:21,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:35:23,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:35:28,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:35:31,217 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=294040.0, ans=0.2 2023-09-29 07:35:32,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 07:35:35,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:35:36,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 07:35:37,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:35:37,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 07:35:39,310 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_positive, batch_count=294106.6666666667, ans=0.05 2023-09-29 07:35:40,812 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=294106.6666666667, ans=0.125 2023-09-29 07:35:40,912 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=294106.6666666667, ans=0.125 2023-09-29 07:35:44,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 07:35:49,582 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.53 vs. limit=15.0 2023-09-29 07:35:52,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:35:52,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 07:35:52,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:35:53,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:35:53,850 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:35:57,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 07:36:01,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 07:36:04,826 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:36:04,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:36:04,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:36:06,464 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:36:06,835 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=294173.3333333333, ans=0.0 2023-09-29 07:36:08,650 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:36:10,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:36:11,707 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 07:36:17,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:36:18,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:36:20,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 07:36:20,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:36:22,199 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 07:36:28,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:36:30,239 INFO [train.py:1039] (2/4) Epoch 9, batch 1650, loss[loss=0.2858, simple_loss=0.3288, pruned_loss=0.1214, over 19407.00 frames. ], tot_loss[loss=0.214, simple_loss=0.2812, pruned_loss=0.07346, over 4715385.56 frames. ], batch size: 388, lr: 1.17e-02, grad_scale: 8.0 2023-09-29 07:36:31,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:36:31,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:36:31,969 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 07:36:31,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 07:36:31,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 07:36:33,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 07:36:35,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:36:36,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:36:38,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:36:38,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 07:36:39,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:36:41,937 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 07:36:44,734 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:36:44,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:36:44,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:36:44,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 07:36:44,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 07:36:46,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 07:36:52,383 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 07:36:54,162 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_ff3.min_abs, batch_count=294373.3333333333, ans=0.2 2023-09-29 07:36:55,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 07:37:07,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 07:37:09,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:37:10,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 07:37:14,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:37:15,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:37:17,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:37:17,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:37:18,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:37:18,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:37:21,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:37:22,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:37:24,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:37:24,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:37:26,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:37:26,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:37:31,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:37:32,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 07:37:34,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:37:34,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 07:37:35,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 07:37:35,935 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 07:37:35,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:37:37,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:37:37,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:37:38,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:37:38,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 07:37:42,535 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 2.030e+02 2.311e+02 2.647e+02 4.475e+02, threshold=4.622e+02, percent-clipped=1.0 2023-09-29 07:37:42,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:37:44,284 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:37:44,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:37:46,529 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.85 vs. limit=15.0 2023-09-29 07:37:47,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 07:37:51,821 INFO [train.py:1039] (2/4) Epoch 9, batch 1700, loss[loss=0.1913, simple_loss=0.2638, pruned_loss=0.05946, over 24665.00 frames. ], tot_loss[loss=0.2136, simple_loss=0.2807, pruned_loss=0.0732, over 4723808.47 frames. ], batch size: 65, lr: 1.17e-02, grad_scale: 8.0 2023-09-29 07:37:51,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:37:51,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:37:52,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 07:37:53,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:37:53,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:37:53,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:37:55,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:37:55,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:37:55,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 07:37:59,946 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 07:38:04,278 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.90 vs. limit=15.0 2023-09-29 07:38:06,570 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=294640.0, ans=0.125 2023-09-29 07:38:09,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:38:12,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:38:19,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 07:38:19,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 07:38:19,485 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:38:19,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:38:22,631 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 07:38:25,568 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:38:25,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:38:25,787 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=294773.3333333333, ans=0.125 2023-09-29 07:38:27,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 07:38:29,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 07:38:30,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 07:38:32,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 07:38:34,667 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:38:35,569 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=294773.3333333333, ans=0.07 2023-09-29 07:38:36,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 07:38:38,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:38:44,638 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=294840.0, ans=0.0 2023-09-29 07:38:45,036 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.06 vs. limit=15.0 2023-09-29 07:38:47,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:38:47,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:38:49,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 07:38:52,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 07:38:52,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 07:38:52,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:38:54,394 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:38:54,395 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 07:38:55,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:38:55,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:38:55,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:38:55,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:39:00,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:39:00,334 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:39:01,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:39:01,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:39:03,960 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:39:05,640 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:39:07,243 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 07:39:07,459 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=294906.6666666667, ans=0.2 2023-09-29 07:39:08,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:39:12,703 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:39:12,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 07:39:15,872 INFO [train.py:1039] (2/4) Epoch 9, batch 1750, loss[loss=0.1928, simple_loss=0.2772, pruned_loss=0.05421, over 24664.00 frames. ], tot_loss[loss=0.2131, simple_loss=0.28, pruned_loss=0.07311, over 4724515.95 frames. ], batch size: 73, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:39:19,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:39:22,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:39:22,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 07:39:22,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 07:39:22,587 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=294973.3333333333, ans=0.0 2023-09-29 07:39:23,657 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:39:27,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:39:28,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:39:31,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 07:39:34,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:39:36,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 07:39:36,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:39:38,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:39:40,932 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=295040.0, ans=0.0 2023-09-29 07:39:42,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 07:39:42,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 07:39:45,816 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:39:45,857 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 07:39:46,244 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=295040.0, ans=0.125 2023-09-29 07:39:46,278 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=295040.0, ans=0.0 2023-09-29 07:39:53,651 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:39:55,627 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=295106.6666666667, ans=0.125 2023-09-29 07:39:56,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:39:56,804 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:40:01,207 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:40:01,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:40:03,488 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:40:06,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:40:08,049 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:40:09,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:40:09,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 07:40:13,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:40:15,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 07:40:17,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:40:18,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:40:20,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:40:23,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 07:40:23,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 07:40:25,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:40:25,310 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=295240.0, ans=0.1 2023-09-29 07:40:26,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:40:27,933 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 2.058e+02 2.373e+02 2.670e+02 4.900e+02, threshold=4.746e+02, percent-clipped=2.0 2023-09-29 07:40:31,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:40:32,999 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=295240.0, ans=0.07 2023-09-29 07:40:33,017 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=295240.0, ans=0.125 2023-09-29 07:40:34,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:40:35,768 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:40:35,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 07:40:35,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:40:37,699 INFO [train.py:1039] (2/4) Epoch 9, batch 1800, loss[loss=0.2123, simple_loss=0.2906, pruned_loss=0.06697, over 23988.00 frames. ], tot_loss[loss=0.2121, simple_loss=0.2793, pruned_loss=0.07244, over 4715270.80 frames. ], batch size: 86, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:40:37,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 07:40:37,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:40:37,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:40:37,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:40:39,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:40:42,482 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:40:43,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:40:44,307 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=295306.6666666667, ans=0.0 2023-09-29 07:40:46,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 07:40:47,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:40:51,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 07:40:53,017 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:40:56,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:40:59,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:40:59,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:41:00,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:41:02,497 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:41:02,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 07:41:02,639 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:41:05,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:41:10,460 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 07:41:12,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 07:41:12,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 07:41:12,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:41:15,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:41:15,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:41:16,052 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=295440.0, ans=0.125 2023-09-29 07:41:17,743 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:41:24,680 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 07:41:26,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:41:28,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:41:30,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 07:41:30,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 07:41:30,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 07:41:31,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:41:31,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:41:33,694 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=295506.6666666667, ans=0.125 2023-09-29 07:41:37,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 07:41:44,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:41:44,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 07:41:46,326 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:41:46,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:41:47,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 07:41:47,873 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 07:41:49,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:41:49,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:41:53,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 07:41:53,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:41:56,935 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:41:56,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 07:41:57,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:41:58,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:42:00,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 07:42:02,031 INFO [train.py:1039] (2/4) Epoch 9, batch 1850, loss[loss=0.217, simple_loss=0.2818, pruned_loss=0.07613, over 23736.00 frames. ], tot_loss[loss=0.212, simple_loss=0.2796, pruned_loss=0.0722, over 4726106.81 frames. ], batch size: 212, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:42:02,260 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:42:03,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:42:06,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:42:08,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:42:14,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:42:16,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 07:42:20,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 07:42:21,649 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.42 vs. limit=12.0 2023-09-29 07:42:23,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 07:42:28,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:42:28,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 07:42:28,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 07:42:33,365 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=295773.3333333333, ans=0.0 2023-09-29 07:42:39,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:42:41,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 07:42:44,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:42:44,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:42:49,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 07:42:49,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:42:49,256 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 07:42:49,935 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.87 vs. limit=15.0 2023-09-29 07:42:50,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:42:52,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:42:55,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:42:57,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 07:42:57,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:42:59,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 07:42:59,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:43:01,043 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:43:02,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:43:06,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 07:43:07,076 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:43:09,209 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:43:10,649 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=295906.6666666667, ans=0.125 2023-09-29 07:43:13,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:43:13,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:43:13,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 07:43:13,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 07:43:14,723 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 2.027e+02 2.265e+02 2.527e+02 4.357e+02, threshold=4.531e+02, percent-clipped=0.0 2023-09-29 07:43:15,069 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 07:43:15,286 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=295906.6666666667, ans=0.09899494936611666 2023-09-29 07:43:15,346 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=295906.6666666667, ans=0.125 2023-09-29 07:43:16,581 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 07:43:18,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 07:43:19,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:43:19,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:43:19,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:43:19,676 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 07:43:19,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:43:21,060 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:43:21,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 07:43:22,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 07:43:24,098 INFO [train.py:1039] (2/4) Epoch 9, batch 1900, loss[loss=0.2152, simple_loss=0.3003, pruned_loss=0.06502, over 24546.00 frames. ], tot_loss[loss=0.2128, simple_loss=0.2806, pruned_loss=0.07247, over 4715927.20 frames. ], batch size: 71, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:43:24,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:43:24,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 07:43:25,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:43:25,852 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 07:43:25,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:43:27,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:43:32,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:43:35,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:43:37,534 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 07:43:37,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 07:43:39,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:43:40,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:43:40,753 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 07:43:41,440 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 07:43:45,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 07:43:46,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:43:50,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 07:43:53,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 07:43:57,431 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=296106.6666666667, ans=0.125 2023-09-29 07:44:05,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 07:44:08,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 07:44:08,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:44:09,719 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 07:44:09,737 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 07:44:09,786 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 07:44:11,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 07:44:11,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:44:15,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 07:44:20,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:44:21,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:44:21,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 07:44:23,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:44:26,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 07:44:28,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:44:34,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:44:34,443 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:44:34,464 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:44:35,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:44:37,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 07:44:37,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 07:44:41,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:44:44,095 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:44:44,100 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:44:46,173 INFO [train.py:1039] (2/4) Epoch 9, batch 1950, loss[loss=0.2316, simple_loss=0.2863, pruned_loss=0.08841, over 23759.00 frames. ], tot_loss[loss=0.2136, simple_loss=0.2811, pruned_loss=0.07306, over 4711456.12 frames. ], batch size: 212, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:44:47,756 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:44:47,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:44:47,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:44:49,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:44:50,949 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 07:44:55,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:44:56,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:44:56,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 07:44:58,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 07:44:58,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 07:44:58,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:45:00,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:45:04,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:45:04,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:45:04,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:45:06,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:45:10,884 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 07:45:10,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 07:45:10,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:45:11,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:45:11,222 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=296373.3333333333, ans=0.125 2023-09-29 07:45:15,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:45:17,908 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=296440.0, ans=0.125 2023-09-29 07:45:19,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:45:19,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:45:19,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 07:45:19,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 07:45:19,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 07:45:19,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:45:21,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:45:24,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:45:25,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:45:30,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 07:45:33,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:45:33,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:45:33,953 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.70 vs. limit=6.0 2023-09-29 07:45:34,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 07:45:34,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:45:39,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:45:39,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 07:45:40,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 07:45:42,719 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=296506.6666666667, ans=0.125 2023-09-29 07:45:49,116 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:45:50,615 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:45:53,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:45:55,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:45:59,266 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.471e+02 1.955e+02 2.207e+02 2.649e+02 3.533e+02, threshold=4.414e+02, percent-clipped=0.0 2023-09-29 07:45:59,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:45:59,479 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:46:00,930 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 07:46:00,938 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 07:46:01,252 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=296573.3333333333, ans=0.0 2023-09-29 07:46:03,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:46:04,580 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 07:46:06,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:46:09,269 INFO [train.py:1039] (2/4) Epoch 9, batch 2000, loss[loss=0.2304, simple_loss=0.2994, pruned_loss=0.08064, over 23706.00 frames. ], tot_loss[loss=0.2149, simple_loss=0.2825, pruned_loss=0.07366, over 4709715.89 frames. ], batch size: 85, lr: 1.16e-02, grad_scale: 16.0 2023-09-29 07:46:09,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 07:46:10,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:46:10,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:46:11,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:46:12,690 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:46:17,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 07:46:17,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:46:23,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:46:23,956 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=296706.6666666667, ans=0.07 2023-09-29 07:46:24,850 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.11 vs. limit=8.0 2023-09-29 07:46:25,204 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 07:46:26,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 07:46:26,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:46:30,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:46:31,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 07:46:35,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:46:36,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:46:36,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:46:39,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 07:46:39,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 07:46:39,470 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=296706.6666666667, ans=0.0 2023-09-29 07:46:42,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 07:46:42,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:46:45,250 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:46:46,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 07:46:46,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:46:48,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:46:49,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:46:49,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 07:46:50,030 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=296773.3333333333, ans=0.0 2023-09-29 07:46:53,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 07:46:53,036 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:46:53,048 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:46:57,010 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=296840.0, ans=0.125 2023-09-29 07:46:59,891 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=296840.0, ans=0.125 2023-09-29 07:47:01,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:47:02,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:47:02,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 07:47:02,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:47:04,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:47:04,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:47:06,271 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 07:47:06,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:47:06,651 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=296840.0, ans=0.125 2023-09-29 07:47:07,820 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:47:09,623 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=296840.0, ans=0.125 2023-09-29 07:47:11,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:47:12,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 07:47:15,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 07:47:16,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:47:21,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:47:21,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:47:24,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:47:25,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:47:25,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:47:27,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 07:47:27,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:47:29,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:47:29,424 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=296973.3333333333, ans=0.1 2023-09-29 07:47:31,100 INFO [train.py:1039] (2/4) Epoch 9, batch 2050, loss[loss=0.2172, simple_loss=0.2947, pruned_loss=0.06987, over 24285.00 frames. ], tot_loss[loss=0.2135, simple_loss=0.2816, pruned_loss=0.07275, over 4716426.70 frames. ], batch size: 74, lr: 1.16e-02, grad_scale: 16.0 2023-09-29 07:47:31,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:47:34,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:47:34,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:47:41,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:47:45,185 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=296973.3333333333, ans=0.0 2023-09-29 07:47:46,078 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:47:46,168 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:47:48,316 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:47:49,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 07:47:49,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:47:50,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:47:51,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 07:47:54,213 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.05 vs. limit=15.0 2023-09-29 07:47:56,549 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer_ff3.min_abs, batch_count=297040.0, ans=0.2 2023-09-29 07:48:00,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:48:00,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:48:03,912 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 07:48:06,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:48:07,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 07:48:07,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:48:09,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:48:11,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:48:13,222 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 07:48:13,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:48:13,626 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=297106.6666666667, ans=0.0 2023-09-29 07:48:14,791 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:48:16,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:48:16,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:48:20,053 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=297173.3333333333, ans=0.2 2023-09-29 07:48:21,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:48:24,258 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 07:48:25,889 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 07:48:26,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:48:31,244 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.93 vs. limit=15.0 2023-09-29 07:48:32,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 07:48:36,696 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:48:36,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 07:48:42,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:48:44,025 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.448e+02 2.106e+02 2.256e+02 2.757e+02 3.895e+02, threshold=4.512e+02, percent-clipped=0.0 2023-09-29 07:48:44,235 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:48:47,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:48:48,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 07:48:53,180 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 07:48:53,180 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:48:54,316 INFO [train.py:1039] (2/4) Epoch 9, batch 2100, loss[loss=0.2216, simple_loss=0.3004, pruned_loss=0.07144, over 23999.00 frames. ], tot_loss[loss=0.2119, simple_loss=0.2801, pruned_loss=0.07188, over 4719680.39 frames. ], batch size: 80, lr: 1.16e-02, grad_scale: 16.0 2023-09-29 07:48:54,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:48:54,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:48:56,024 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:48:56,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 07:48:57,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 07:48:59,010 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 07:49:02,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:49:02,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:49:05,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:49:06,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:49:06,739 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 07:49:06,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 07:49:08,353 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 07:49:08,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 07:49:08,682 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=297373.3333333333, ans=0.1 2023-09-29 07:49:09,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:49:09,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:49:10,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 07:49:11,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 07:49:18,660 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 07:49:18,664 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:49:23,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:49:23,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:49:28,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:49:28,686 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=297440.0, ans=0.07 2023-09-29 07:49:29,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 07:49:30,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:49:30,640 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 07:49:32,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 07:49:32,128 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:49:32,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 07:49:33,598 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 07:49:33,677 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 07:49:35,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 07:49:36,933 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:49:41,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 07:49:41,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 07:49:42,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:49:44,544 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:49:44,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 07:49:44,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:49:44,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:49:44,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:49:46,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 07:49:48,274 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 07:49:48,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 07:49:53,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:49:56,400 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:49:56,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 07:50:03,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:50:06,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:50:06,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:50:06,842 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:50:06,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 07:50:08,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 07:50:08,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:50:10,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 07:50:10,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:50:11,581 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:50:13,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 07:50:14,781 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 07:50:14,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:50:16,062 INFO [train.py:1039] (2/4) Epoch 9, batch 2150, loss[loss=0.2193, simple_loss=0.2531, pruned_loss=0.09271, over 19041.00 frames. ], tot_loss[loss=0.2111, simple_loss=0.2791, pruned_loss=0.07162, over 4717010.38 frames. ], batch size: 388, lr: 1.16e-02, grad_scale: 16.0 2023-09-29 07:50:19,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:50:19,103 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:50:19,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:50:19,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:50:25,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 07:50:26,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:50:26,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:50:28,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:50:28,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:50:28,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:50:33,471 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:50:33,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:50:33,576 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:50:38,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:50:38,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 07:50:42,724 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.10 vs. limit=22.5 2023-09-29 07:50:44,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:50:46,394 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:50:48,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:50:48,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:50:48,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:50:48,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 07:50:49,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:50:49,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:50:51,051 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:50:52,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 07:50:54,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:50:56,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:50:57,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:50:57,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 07:50:59,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:51:01,536 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:51:01,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 07:51:03,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:51:03,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 07:51:03,286 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 07:51:06,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:51:06,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:51:06,474 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=297840.0, ans=0.125 2023-09-29 07:51:09,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:51:09,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 07:51:11,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:51:12,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:51:12,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 07:51:15,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 07:51:15,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 07:51:16,544 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 07:51:17,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:51:17,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:51:19,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 07:51:19,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:51:19,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 07:51:19,461 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 07:51:19,462 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 07:51:20,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 07:51:22,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:51:22,548 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:51:22,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:51:24,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:51:25,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 07:51:27,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:51:27,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:51:28,410 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.622e+02 2.066e+02 2.283e+02 2.527e+02 4.333e+02, threshold=4.566e+02, percent-clipped=0.0 2023-09-29 07:51:37,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:51:37,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 07:51:39,048 INFO [train.py:1039] (2/4) Epoch 9, batch 2200, loss[loss=0.1985, simple_loss=0.2733, pruned_loss=0.06187, over 24479.00 frames. ], tot_loss[loss=0.2112, simple_loss=0.2796, pruned_loss=0.0714, over 4725741.21 frames. ], batch size: 66, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:51:40,789 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:51:46,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:51:47,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:51:47,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:51:49,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 07:51:50,005 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:51:50,076 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:51:51,461 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=297973.3333333333, ans=0.125 2023-09-29 07:51:52,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:51:53,074 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=297973.3333333333, ans=0.0 2023-09-29 07:51:54,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:51:54,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 07:51:56,585 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=13.80 vs. limit=15.0 2023-09-29 07:51:58,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 07:51:59,316 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=298040.0, ans=0.0 2023-09-29 07:52:00,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 07:52:07,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 07:52:09,412 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.29 vs. limit=15.0 2023-09-29 07:52:10,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:52:11,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:52:13,083 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:52:15,475 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:52:16,963 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 07:52:20,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 07:52:21,829 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:52:23,210 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 07:52:26,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 07:52:28,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:52:30,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:52:31,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:52:32,139 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=298173.3333333333, ans=0.125 2023-09-29 07:52:33,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 07:52:34,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:52:36,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 07:52:38,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:52:38,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 07:52:39,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:52:42,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:52:42,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:52:42,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:52:42,939 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:52:44,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 07:52:44,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:52:47,628 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 07:52:53,254 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 07:52:53,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:52:56,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 07:52:57,817 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 07:53:00,821 INFO [train.py:1039] (2/4) Epoch 9, batch 2250, loss[loss=0.2067, simple_loss=0.2782, pruned_loss=0.06761, over 24455.00 frames. ], tot_loss[loss=0.2115, simple_loss=0.2801, pruned_loss=0.07145, over 4731510.54 frames. ], batch size: 66, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:53:00,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:53:01,660 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 07:53:02,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 07:53:03,062 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 07:53:04,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:53:06,034 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 07:53:06,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:53:07,799 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 07:53:08,047 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=298306.6666666667, ans=0.2 2023-09-29 07:53:09,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:53:09,601 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=298306.6666666667, ans=0.1 2023-09-29 07:53:12,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:53:18,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:53:20,515 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 07:53:24,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:53:24,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:53:25,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:53:27,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 07:53:27,856 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:53:29,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:53:30,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 07:53:32,301 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:53:32,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:53:33,873 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:53:37,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:53:40,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 07:53:40,742 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 07:53:40,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 07:53:42,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:53:45,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:53:51,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:53:53,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:53:54,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:53:54,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:53:58,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:54:00,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:54:05,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:54:05,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 07:54:12,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 07:54:12,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 07:54:13,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:54:15,288 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 1.931e+02 2.128e+02 2.517e+02 4.313e+02, threshold=4.256e+02, percent-clipped=0.0 2023-09-29 07:54:18,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 07:54:20,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 07:54:20,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 07:54:21,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:54:21,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:54:23,833 INFO [train.py:1039] (2/4) Epoch 9, batch 2300, loss[loss=0.1832, simple_loss=0.2607, pruned_loss=0.05288, over 24335.00 frames. ], tot_loss[loss=0.2116, simple_loss=0.28, pruned_loss=0.0716, over 4723926.38 frames. ], batch size: 61, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:54:25,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 07:54:28,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:54:30,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:54:35,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:54:37,868 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:54:40,811 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 07:54:42,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:54:49,035 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:54:49,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 07:54:49,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:54:50,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:54:50,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 07:54:50,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:54:53,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:54:53,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:54:57,016 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:55:00,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 07:55:02,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:55:07,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:55:07,785 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:55:10,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:55:10,221 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=298773.3333333333, ans=0.0 2023-09-29 07:55:14,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:55:17,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:55:19,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 07:55:19,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:55:19,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 07:55:24,146 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 07:55:24,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:55:24,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:55:24,259 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:55:25,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:55:27,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 07:55:27,297 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:55:27,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 07:55:27,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:55:27,411 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:55:27,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 07:55:34,380 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:55:40,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:55:44,059 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:55:44,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:55:44,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 07:55:47,457 INFO [train.py:1039] (2/4) Epoch 9, batch 2350, loss[loss=0.2073, simple_loss=0.2717, pruned_loss=0.07143, over 24353.00 frames. ], tot_loss[loss=0.2126, simple_loss=0.2813, pruned_loss=0.07195, over 4729126.05 frames. ], batch size: 61, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:55:47,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 07:55:47,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:55:47,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 07:55:47,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 07:55:55,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:55:55,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 07:56:02,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 07:56:07,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:56:10,064 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:56:10,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:56:10,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:56:11,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:56:11,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 07:56:15,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:56:20,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 07:56:22,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:56:24,837 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.23 vs. limit=12.0 2023-09-29 07:56:25,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:56:25,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:56:28,650 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:56:30,733 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 07:56:30,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:56:33,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:56:33,692 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:56:33,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:56:37,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:56:40,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 07:56:40,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:56:43,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:56:43,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:56:45,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 07:56:47,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:56:49,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 07:56:50,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 07:56:54,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 07:56:55,353 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=299240.0, ans=0.1 2023-09-29 07:56:58,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 07:57:00,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:57:00,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 07:57:00,315 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 07:57:00,344 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 07:57:01,788 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.694e+02 2.074e+02 2.290e+02 2.554e+02 3.364e+02, threshold=4.579e+02, percent-clipped=0.0 2023-09-29 07:57:04,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 07:57:05,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:57:07,402 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=299240.0, ans=0.1 2023-09-29 07:57:10,737 INFO [train.py:1039] (2/4) Epoch 9, batch 2400, loss[loss=0.2228, simple_loss=0.29, pruned_loss=0.07783, over 23334.00 frames. ], tot_loss[loss=0.2125, simple_loss=0.2808, pruned_loss=0.07214, over 4721793.22 frames. ], batch size: 93, lr: 1.16e-02, grad_scale: 16.0 2023-09-29 07:57:10,879 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:57:13,979 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:57:15,748 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=299306.6666666667, ans=0.2 2023-09-29 07:57:16,918 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:57:17,000 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 07:57:17,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 07:57:26,659 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 07:57:26,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:57:28,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 07:57:28,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:57:28,587 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=299373.3333333333, ans=0.125 2023-09-29 07:57:29,876 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:57:31,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 07:57:33,752 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=299373.3333333333, ans=0.125 2023-09-29 07:57:37,552 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=299373.3333333333, ans=0.1 2023-09-29 07:57:38,428 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:57:38,662 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 07:57:43,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 07:57:47,122 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 07:57:50,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:57:50,517 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=299440.0, ans=0.125 2023-09-29 07:57:51,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:57:57,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:57:58,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 07:57:58,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:58:03,162 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:58:05,324 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.73 vs. limit=12.0 2023-09-29 07:58:06,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:58:12,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:58:13,939 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:58:13,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 07:58:13,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:58:14,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:58:15,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:58:15,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:58:18,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:58:20,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:58:20,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 07:58:21,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 07:58:23,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:58:23,856 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.50 vs. limit=15.0 2023-09-29 07:58:24,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:58:24,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 07:58:26,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 07:58:26,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 07:58:26,166 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 07:58:26,307 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 07:58:26,536 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=299573.3333333333, ans=0.1 2023-09-29 07:58:27,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:58:29,994 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:58:30,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:58:31,570 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 07:58:31,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:58:31,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 07:58:33,156 INFO [train.py:1039] (2/4) Epoch 9, batch 2450, loss[loss=0.2188, simple_loss=0.2773, pruned_loss=0.0801, over 23308.00 frames. ], tot_loss[loss=0.2117, simple_loss=0.28, pruned_loss=0.07172, over 4717351.75 frames. ], batch size: 119, lr: 1.16e-02, grad_scale: 16.0 2023-09-29 07:58:36,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 07:58:36,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:58:36,718 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=299640.0, ans=0.125 2023-09-29 07:58:42,859 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:58:42,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:58:44,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 07:58:50,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:58:50,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:58:54,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:58:54,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:58:54,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:58:54,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 07:58:58,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:58:59,799 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 07:59:01,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:59:04,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 07:59:06,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:59:06,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:59:07,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:59:10,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 07:59:12,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:59:20,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:59:21,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:59:21,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:59:21,675 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:59:21,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:59:23,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:59:24,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 07:59:26,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:59:27,736 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:59:28,201 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=299840.0, ans=0.2 2023-09-29 07:59:30,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:59:30,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:59:31,068 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=299840.0, ans=0.125 2023-09-29 07:59:34,125 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=299840.0, ans=0.125 2023-09-29 07:59:36,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:59:36,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 07:59:37,579 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:59:39,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:59:39,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 07:59:40,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:59:42,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:59:46,463 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 2.171e+02 2.453e+02 2.838e+02 4.289e+02, threshold=4.906e+02, percent-clipped=0.0 2023-09-29 07:59:46,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:59:48,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:59:49,150 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:59:50,299 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:59:53,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 07:59:54,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:59:56,024 INFO [train.py:1039] (2/4) Epoch 9, batch 2500, loss[loss=0.2016, simple_loss=0.2817, pruned_loss=0.06071, over 24461.00 frames. ], tot_loss[loss=0.2109, simple_loss=0.2794, pruned_loss=0.07117, over 4725118.53 frames. ], batch size: 63, lr: 1.16e-02, grad_scale: 16.0 2023-09-29 08:00:00,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:00:12,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:00:12,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:00:14,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:00:14,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 08:00:21,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 08:00:21,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:00:23,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 08:00:23,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 08:00:23,468 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 08:00:23,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:00:25,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:00:25,732 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 08:00:25,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:00:25,920 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=300040.0, ans=0.125 2023-09-29 08:00:28,528 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 08:00:28,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:00:31,021 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=300106.6666666667, ans=0.1 2023-09-29 08:00:33,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:00:33,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:00:36,835 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 08:00:38,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 08:00:38,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:00:41,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:00:44,238 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:00:49,377 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:00:49,739 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=300173.3333333333, ans=0.2 2023-09-29 08:00:52,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:00:57,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 08:01:01,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 08:01:01,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:01:01,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 08:01:01,755 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=300240.0, ans=0.0 2023-09-29 08:01:04,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:01:04,448 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 08:01:04,625 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 08:01:04,626 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 08:01:04,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 08:01:08,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:01:09,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 08:01:09,725 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 08:01:10,041 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=300240.0, ans=0.125 2023-09-29 08:01:11,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:01:11,264 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 08:01:13,301 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=300240.0, ans=0.125 2023-09-29 08:01:14,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 08:01:16,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:01:18,154 INFO [train.py:1039] (2/4) Epoch 9, batch 2550, loss[loss=0.2836, simple_loss=0.3093, pruned_loss=0.1289, over 19073.00 frames. ], tot_loss[loss=0.2116, simple_loss=0.2798, pruned_loss=0.07174, over 4713238.33 frames. ], batch size: 388, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:01:19,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:01:19,815 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:01:20,102 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=300306.6666666667, ans=0.125 2023-09-29 08:01:22,691 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:01:22,816 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 08:01:22,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:01:27,369 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 08:01:27,564 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:01:30,985 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:01:34,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:01:34,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 08:01:35,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 08:01:36,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:01:37,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:01:40,894 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:01:40,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 08:01:42,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 08:01:42,347 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:01:42,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 08:01:44,254 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=300373.3333333333, ans=0.125 2023-09-29 08:01:56,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:02:02,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:02:04,067 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:02:04,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:02:04,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 08:02:06,493 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.61 vs. limit=15.0 2023-09-29 08:02:11,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:02:13,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 08:02:15,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:02:15,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:02:15,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 08:02:16,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 08:02:19,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:02:19,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:02:24,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:02:26,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 08:02:26,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:02:26,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:02:26,506 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 08:02:29,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 08:02:29,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:02:30,790 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.416e+02 1.909e+02 2.105e+02 2.404e+02 4.394e+02, threshold=4.210e+02, percent-clipped=0.0 2023-09-29 08:02:35,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:02:37,192 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:02:37,978 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten.whitening_limit, batch_count=300640.0, ans=22.5 2023-09-29 08:02:39,445 INFO [train.py:1039] (2/4) Epoch 9, batch 2600, loss[loss=0.1785, simple_loss=0.2582, pruned_loss=0.04941, over 24466.00 frames. ], tot_loss[loss=0.2113, simple_loss=0.2795, pruned_loss=0.07157, over 4716821.63 frames. ], batch size: 63, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:02:39,643 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 08:02:42,738 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 08:02:42,765 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:02:42,827 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 08:02:42,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 08:02:44,237 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 08:02:47,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:02:47,222 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 08:02:48,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 08:02:49,253 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=13.74 vs. limit=15.0 2023-09-29 08:02:50,086 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 08:02:50,201 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 08:02:53,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:02:53,255 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=300640.0, ans=0.2 2023-09-29 08:02:54,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 08:02:56,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 08:02:58,152 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 08:02:58,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 08:03:01,293 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 08:03:01,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 08:03:07,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:03:07,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:03:07,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:03:07,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 08:03:09,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:03:14,762 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=300773.3333333333, ans=0.07 2023-09-29 08:03:17,415 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 08:03:24,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:03:24,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:03:26,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 08:03:27,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:03:27,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:03:27,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 08:03:27,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=300840.0, ans=0.125 2023-09-29 08:03:29,829 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.65 vs. limit=15.0 2023-09-29 08:03:30,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:03:30,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:03:34,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:03:39,113 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 08:03:39,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:03:40,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:03:40,884 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=300840.0, ans=0.125 2023-09-29 08:03:45,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:03:47,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:03:47,221 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 08:03:47,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:03:49,011 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:03:50,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:03:57,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 08:03:58,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:04:00,376 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 08:04:01,773 INFO [train.py:1039] (2/4) Epoch 9, batch 2650, loss[loss=0.2099, simple_loss=0.2725, pruned_loss=0.07366, over 23731.00 frames. ], tot_loss[loss=0.212, simple_loss=0.2799, pruned_loss=0.07207, over 4708415.04 frames. ], batch size: 179, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:04:03,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 08:04:03,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:04:05,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 08:04:05,239 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 08:04:05,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:04:06,900 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=300973.3333333333, ans=0.125 2023-09-29 08:04:08,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:04:11,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 08:04:13,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:04:16,159 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:04:16,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 08:04:16,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:04:17,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:04:20,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 08:04:23,093 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 08:04:26,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:04:27,833 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 08:04:27,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:04:29,942 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 08:04:35,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:04:35,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 08:04:35,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:04:35,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:04:39,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 08:04:39,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 08:04:43,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:04:47,957 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 08:04:47,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:04:49,432 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:04:49,649 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer_ff2.min_abs, batch_count=301173.3333333333, ans=0.1 2023-09-29 08:04:50,920 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 08:04:50,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:04:52,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:04:53,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:04:55,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:04:55,620 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:04:55,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 08:04:55,859 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=301173.3333333333, ans=0.125 2023-09-29 08:04:57,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:04:59,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:04:59,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:05:00,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:05:02,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:05:02,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 08:05:06,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:05:07,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:05:07,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:05:07,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 08:05:08,059 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=301240.0, ans=0.1 2023-09-29 08:05:13,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:05:13,950 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:05:15,910 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.708e+02 2.153e+02 2.553e+02 3.125e+02 4.988e+02, threshold=5.107e+02, percent-clipped=5.0 2023-09-29 08:05:17,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:05:17,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:05:19,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 08:05:20,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:05:22,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:05:22,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 08:05:23,522 INFO [train.py:1039] (2/4) Epoch 9, batch 2700, loss[loss=0.217, simple_loss=0.2981, pruned_loss=0.06791, over 24589.00 frames. ], tot_loss[loss=0.2128, simple_loss=0.2815, pruned_loss=0.07209, over 4719517.13 frames. ], batch size: 71, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:05:26,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:05:28,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 08:05:28,540 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 08:05:30,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:05:30,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:05:30,137 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:05:31,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:05:31,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:05:31,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:05:31,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 08:05:31,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 08:05:33,921 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:05:36,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:05:38,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 08:05:38,288 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:05:45,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:05:45,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 08:05:46,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:05:52,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:05:52,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:05:55,288 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=301440.0, ans=0.2 2023-09-29 08:05:58,100 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:05:58,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:05:58,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:05:58,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:06:02,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:06:05,159 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.99 vs. limit=22.5 2023-09-29 08:06:05,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:06:06,892 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 08:06:06,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:06:10,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:06:10,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:06:19,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:06:21,025 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:06:25,441 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 08:06:25,444 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:06:27,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:06:29,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:06:30,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:06:31,113 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=301573.3333333333, ans=0.1 2023-09-29 08:06:32,293 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:06:33,733 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:06:33,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:06:35,462 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=301573.3333333333, ans=0.125 2023-09-29 08:06:36,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:06:38,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:06:38,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:06:40,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 08:06:42,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:06:43,779 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:06:43,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 08:06:43,953 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=301640.0, ans=0.0 2023-09-29 08:06:45,106 INFO [train.py:1039] (2/4) Epoch 9, batch 2750, loss[loss=0.2488, simple_loss=0.2884, pruned_loss=0.1046, over 19613.00 frames. ], tot_loss[loss=0.2124, simple_loss=0.2807, pruned_loss=0.07206, over 4718787.77 frames. ], batch size: 388, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:06:45,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 08:06:47,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:06:50,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:06:50,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:06:54,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:06:54,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:06:54,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:06:57,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:06:57,565 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=301640.0, ans=0.0 2023-09-29 08:06:58,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 08:07:00,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:07:00,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:07:00,145 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 08:07:00,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:07:00,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:07:04,682 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.63 vs. limit=15.0 2023-09-29 08:07:05,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 08:07:08,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:07:08,568 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:07:10,026 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:07:10,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 08:07:10,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:07:11,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:07:11,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:07:12,072 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer_ff2.min_abs, batch_count=301706.6666666667, ans=0.1 2023-09-29 08:07:13,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:07:18,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 08:07:18,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 08:07:18,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:07:20,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:07:21,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 08:07:30,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:07:30,283 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=301773.3333333333, ans=0.125 2023-09-29 08:07:31,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 08:07:33,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:07:38,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:07:38,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 08:07:38,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:07:42,591 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.81 vs. limit=6.0 2023-09-29 08:07:43,333 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:07:43,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:07:43,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 08:07:48,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:07:50,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 08:07:56,658 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 08:07:57,321 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=19.03 vs. limit=22.5 2023-09-29 08:07:59,502 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.652e+02 1.994e+02 2.310e+02 2.732e+02 5.086e+02, threshold=4.620e+02, percent-clipped=0.0 2023-09-29 08:08:01,645 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:08:01,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 08:08:03,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:08:03,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:08:04,829 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 08:08:04,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:08:06,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 08:08:06,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:08:08,104 INFO [train.py:1039] (2/4) Epoch 9, batch 2800, loss[loss=0.1988, simple_loss=0.2657, pruned_loss=0.06595, over 18877.00 frames. ], tot_loss[loss=0.2109, simple_loss=0.2786, pruned_loss=0.07154, over 4713553.20 frames. ], batch size: 41, lr: 1.15e-02, grad_scale: 32.0 2023-09-29 08:08:08,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:08:10,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 08:08:10,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:08:11,798 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:08:13,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:08:14,764 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 08:08:14,765 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 08:08:15,078 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=301973.3333333333, ans=0.2 2023-09-29 08:08:15,508 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.72 vs. limit=22.5 2023-09-29 08:08:17,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:08:21,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 08:08:21,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:08:24,940 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 08:08:26,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:08:26,399 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 08:08:29,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 08:08:30,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 08:08:31,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:08:31,659 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:08:31,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:08:36,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:08:36,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:08:36,982 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 08:08:38,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:08:47,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:08:49,522 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:08:49,790 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=302106.6666666667, ans=0.0 2023-09-29 08:08:51,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:08:52,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:08:54,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:08:59,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:08:59,312 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 08:09:00,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:09:01,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:09:01,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:09:04,481 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:09:05,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:09:10,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:09:12,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:09:12,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:09:12,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 08:09:12,616 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 08:09:13,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:09:15,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:09:15,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 08:09:15,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:09:17,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:09:17,590 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:09:19,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 08:09:21,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:09:21,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:09:22,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:09:22,629 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.33 vs. limit=12.0 2023-09-29 08:09:23,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 08:09:30,442 INFO [train.py:1039] (2/4) Epoch 9, batch 2850, loss[loss=0.1835, simple_loss=0.2567, pruned_loss=0.0552, over 24577.00 frames. ], tot_loss[loss=0.2101, simple_loss=0.278, pruned_loss=0.07108, over 4710045.82 frames. ], batch size: 60, lr: 1.15e-02, grad_scale: 32.0 2023-09-29 08:09:30,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:09:30,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 08:09:32,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:09:33,550 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:09:37,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:09:37,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:09:37,411 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=302306.6666666667, ans=0.125 2023-09-29 08:09:37,433 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=302306.6666666667, ans=0.0 2023-09-29 08:09:38,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:09:40,351 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:09:41,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:09:44,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 08:09:44,959 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 08:09:50,280 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 08:09:50,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:09:53,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 08:09:55,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:09:56,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 08:09:59,038 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 08:09:59,893 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.37 vs. limit=15.0 2023-09-29 08:10:00,495 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:10:02,513 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=302440.0, ans=0.125 2023-09-29 08:10:12,350 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=302440.0, ans=0.0 2023-09-29 08:10:13,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:10:14,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:10:14,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:10:16,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 08:10:16,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 08:10:16,489 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:10:18,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 08:10:18,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 08:10:19,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 08:10:21,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:10:21,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:10:21,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:10:21,350 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=302506.6666666667, ans=0.0 2023-09-29 08:10:24,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:10:24,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:10:26,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:10:28,174 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:10:29,871 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:10:30,019 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=302506.6666666667, ans=0.125 2023-09-29 08:10:31,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:10:34,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:10:36,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:10:41,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:10:43,751 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.651e+02 1.988e+02 2.184e+02 2.463e+02 3.940e+02, threshold=4.369e+02, percent-clipped=0.0 2023-09-29 08:10:43,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 08:10:43,904 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 08:10:45,594 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 08:10:47,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:10:47,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 08:10:49,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:10:49,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:10:49,267 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:10:50,638 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:10:50,639 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 08:10:50,727 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 08:10:50,732 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:10:51,989 INFO [train.py:1039] (2/4) Epoch 9, batch 2900, loss[loss=0.2277, simple_loss=0.2965, pruned_loss=0.07947, over 23291.00 frames. ], tot_loss[loss=0.2102, simple_loss=0.2786, pruned_loss=0.07092, over 4715372.76 frames. ], batch size: 93, lr: 1.15e-02, grad_scale: 32.0 2023-09-29 08:10:52,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:10:56,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 08:10:57,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:10:59,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:10:59,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 08:11:04,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:11:04,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 08:11:06,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 08:11:06,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:11:06,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 08:11:07,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:11:10,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:11:14,380 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:11:14,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:11:16,178 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=302706.6666666667, ans=0.2 2023-09-29 08:11:17,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 08:11:18,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 08:11:20,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 08:11:20,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:11:23,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 08:11:25,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 08:11:28,618 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:11:28,622 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 08:11:28,661 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:11:29,521 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.47 vs. limit=15.0 2023-09-29 08:11:30,505 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=302773.3333333333, ans=0.125 2023-09-29 08:11:31,553 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:11:31,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 08:11:33,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:11:35,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:11:39,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:11:40,854 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=302840.0, ans=0.2 2023-09-29 08:11:41,979 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:11:42,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 08:11:44,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 08:11:44,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:11:48,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:11:51,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 08:11:53,192 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:11:57,947 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:12:07,582 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=7.76 vs. limit=15.0 2023-09-29 08:12:08,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:12:08,237 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 08:12:09,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 08:12:14,609 INFO [train.py:1039] (2/4) Epoch 9, batch 2950, loss[loss=0.192, simple_loss=0.2673, pruned_loss=0.05838, over 24440.00 frames. ], tot_loss[loss=0.2108, simple_loss=0.2796, pruned_loss=0.07096, over 4712458.47 frames. ], batch size: 63, lr: 1.15e-02, grad_scale: 32.0 2023-09-29 08:12:14,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:12:14,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 08:12:14,782 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:12:15,030 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=302973.3333333333, ans=0.125 2023-09-29 08:12:16,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 08:12:21,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:12:23,196 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 08:12:24,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:12:24,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:12:26,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:12:27,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:12:29,203 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 08:12:30,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 08:12:30,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 08:12:30,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:12:37,559 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:12:39,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:12:41,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:12:41,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:12:46,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:12:46,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:12:47,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:12:49,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:12:49,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:12:52,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 08:12:57,941 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 08:12:57,984 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 08:12:59,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 08:13:00,860 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 08:13:02,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 08:13:02,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:13:03,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:13:03,899 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 08:13:03,914 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 08:13:04,099 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=303173.3333333333, ans=0.2 2023-09-29 08:13:09,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 08:13:09,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:13:10,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:13:14,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:13:15,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:13:16,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:13:16,441 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 08:13:16,507 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:13:16,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 08:13:23,596 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:13:23,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:13:24,040 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=303240.0, ans=0.0 2023-09-29 08:13:25,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 08:13:25,133 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:13:26,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 08:13:28,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:13:29,728 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.974e+02 2.174e+02 2.569e+02 4.331e+02, threshold=4.348e+02, percent-clipped=0.0 2023-09-29 08:13:30,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:13:31,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 08:13:33,486 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:13:33,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 08:13:35,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:13:36,349 INFO [train.py:1039] (2/4) Epoch 9, batch 3000, loss[loss=0.1919, simple_loss=0.2789, pruned_loss=0.05247, over 24308.00 frames. ], tot_loss[loss=0.2114, simple_loss=0.2801, pruned_loss=0.07137, over 4701973.59 frames. ], batch size: 74, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:13:36,350 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-29 08:13:49,671 INFO [train.py:1071] (2/4) Epoch 9, validation: loss=0.2838, simple_loss=0.2753, pruned_loss=0.1462, over 1125622.00 frames. 2023-09-29 08:13:49,672 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21046MB 2023-09-29 08:13:49,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:13:49,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 08:13:49,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:13:49,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:13:52,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:13:53,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:13:53,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 08:13:55,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:13:58,310 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:13:59,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 08:14:00,320 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=303306.6666666667, ans=0.2 2023-09-29 08:14:01,529 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 08:14:01,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 08:14:02,057 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.53 vs. limit=6.0 2023-09-29 08:14:05,255 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:14:06,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:14:06,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 08:14:08,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:14:14,452 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 08:14:25,564 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:14:30,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 08:14:31,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:14:33,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 08:14:33,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:14:33,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:14:36,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:14:37,751 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 08:14:40,123 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 08:14:40,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:14:41,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 08:14:43,291 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 08:14:43,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:14:44,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:14:44,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:14:49,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 08:14:49,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:14:49,758 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:14:52,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:14:55,119 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 08:14:57,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:14:57,417 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=303573.3333333333, ans=0.0 2023-09-29 08:14:58,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:14:58,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:15:03,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:15:03,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:15:04,561 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 08:15:04,631 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 08:15:06,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:15:06,095 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 08:15:06,167 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:15:06,436 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=303573.3333333333, ans=0.025 2023-09-29 08:15:07,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 08:15:10,757 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 08:15:10,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 08:15:10,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 08:15:12,223 INFO [train.py:1039] (2/4) Epoch 9, batch 3050, loss[loss=0.2178, simple_loss=0.2967, pruned_loss=0.06949, over 24633.00 frames. ], tot_loss[loss=0.2123, simple_loss=0.2809, pruned_loss=0.07181, over 4707529.47 frames. ], batch size: 68, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:15:12,388 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 08:15:12,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 08:15:13,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:15:15,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:15:15,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 08:15:15,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:15:16,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:15:19,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 08:15:21,319 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:15:24,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:15:24,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 08:15:29,641 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:15:31,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 08:15:38,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 08:15:38,992 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 08:15:39,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:15:41,022 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=303706.6666666667, ans=0.1 2023-09-29 08:15:42,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:15:45,563 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:15:45,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:15:45,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:15:49,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:15:50,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 08:15:50,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:15:50,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:15:50,969 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:15:52,464 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:15:52,670 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=303773.3333333333, ans=0.125 2023-09-29 08:15:52,717 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=303773.3333333333, ans=0.125 2023-09-29 08:15:56,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:15:59,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:16:00,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 08:16:00,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:16:00,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 08:16:04,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:16:06,043 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:16:06,158 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:16:07,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:16:12,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:16:13,575 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:16:18,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:16:19,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:16:19,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:16:21,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:16:21,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 08:16:21,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:16:23,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 08:16:25,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:16:26,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:16:26,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 08:16:28,061 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.681e+02 1.995e+02 2.261e+02 2.647e+02 3.760e+02, threshold=4.522e+02, percent-clipped=0.0 2023-09-29 08:16:28,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:16:28,513 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=303906.6666666667, ans=0.1 2023-09-29 08:16:35,199 INFO [train.py:1039] (2/4) Epoch 9, batch 3100, loss[loss=0.2108, simple_loss=0.2569, pruned_loss=0.08228, over 23462.00 frames. ], tot_loss[loss=0.2119, simple_loss=0.2802, pruned_loss=0.07178, over 4714205.21 frames. ], batch size: 285, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:16:35,361 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:16:36,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:16:40,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 08:16:42,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 08:16:42,969 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=303973.3333333333, ans=0.125 2023-09-29 08:16:44,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 08:16:45,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 08:16:47,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:16:50,262 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:16:50,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:16:53,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 08:16:58,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:17:03,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 08:17:08,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 08:17:10,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:17:10,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:17:10,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:17:11,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 08:17:14,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:17:14,090 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 08:17:14,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:17:15,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:17:16,043 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=304106.6666666667, ans=0.125 2023-09-29 08:17:17,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 08:17:18,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:17:23,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:17:23,161 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=304173.3333333333, ans=0.125 2023-09-29 08:17:24,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 08:17:24,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 08:17:26,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:17:26,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:17:26,498 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=304173.3333333333, ans=0.0 2023-09-29 08:17:28,855 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=4.83 vs. limit=10.0 2023-09-29 08:17:29,297 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:17:29,315 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:17:29,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:17:31,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 08:17:31,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:17:33,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:17:33,209 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:17:33,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:17:33,222 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 08:17:38,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:17:39,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 08:17:42,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:17:44,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 08:17:45,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:17:45,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:17:47,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 08:17:55,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 08:17:56,800 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=304306.6666666667, ans=0.125 2023-09-29 08:17:57,808 INFO [train.py:1039] (2/4) Epoch 9, batch 3150, loss[loss=0.2248, simple_loss=0.2793, pruned_loss=0.08515, over 24053.00 frames. ], tot_loss[loss=0.2109, simple_loss=0.2787, pruned_loss=0.0716, over 4703854.03 frames. ], batch size: 196, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:17:57,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:17:59,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:18:01,130 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:18:01,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:18:02,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 08:18:04,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:18:04,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 08:18:07,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 08:18:08,253 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.70 vs. limit=12.0 2023-09-29 08:18:09,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:18:11,211 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 08:18:14,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 08:18:14,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:18:14,418 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 08:18:15,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 08:18:18,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 08:18:19,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 08:18:19,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 08:18:19,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:18:19,489 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:18:20,987 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:18:23,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 08:18:26,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:18:26,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:18:27,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:18:29,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 08:18:32,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 08:18:33,935 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:18:36,823 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 08:18:38,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:18:38,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 08:18:40,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 08:18:42,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:18:42,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 08:18:42,682 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 08:18:42,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:18:42,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 08:18:43,268 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.73 vs. limit=22.5 2023-09-29 08:18:44,143 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=304440.0, ans=0.2 2023-09-29 08:18:45,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 08:18:45,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 08:18:45,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 08:18:46,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 08:18:47,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:18:48,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:18:49,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:18:51,410 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 08:18:51,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:18:51,722 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=304506.6666666667, ans=0.1 2023-09-29 08:18:53,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 08:18:53,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:18:55,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 08:18:57,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 08:18:57,305 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:18:58,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:19:00,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 08:19:01,890 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 08:19:01,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:19:02,310 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=304573.3333333333, ans=0.0 2023-09-29 08:19:03,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:19:05,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:19:06,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:19:13,064 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.470e+02 2.121e+02 2.392e+02 3.280e+02 6.565e+02, threshold=4.784e+02, percent-clipped=9.0 2023-09-29 08:19:13,272 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 08:19:13,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:19:16,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 08:19:19,797 INFO [train.py:1039] (2/4) Epoch 9, batch 3200, loss[loss=0.2266, simple_loss=0.2979, pruned_loss=0.07768, over 24576.00 frames. ], tot_loss[loss=0.2099, simple_loss=0.2776, pruned_loss=0.07113, over 4705320.20 frames. ], batch size: 71, lr: 1.15e-02, grad_scale: 32.0 2023-09-29 08:19:21,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:19:21,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 08:19:26,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:19:26,863 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=304640.0, ans=0.125 2023-09-29 08:19:28,518 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:19:28,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 08:19:30,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:19:32,597 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=304640.0, ans=0.125 2023-09-29 08:19:36,888 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 08:19:38,649 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:19:43,545 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=304706.6666666667, ans=0.1 2023-09-29 08:19:48,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:19:51,039 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=304706.6666666667, ans=0.0 2023-09-29 08:19:58,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 08:19:58,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:20:01,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 08:20:02,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 08:20:07,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:20:07,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 08:20:08,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:20:13,278 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 08:20:13,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 08:20:16,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 08:20:20,054 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 08:20:23,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:20:28,170 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:20:28,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 08:20:28,317 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=304906.6666666667, ans=0.0 2023-09-29 08:20:29,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:20:29,746 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 08:20:29,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 08:20:31,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=304906.6666666667, ans=0.2 2023-09-29 08:20:32,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:20:36,645 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 08:20:36,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 08:20:38,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 08:20:39,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 08:20:41,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:20:43,055 INFO [train.py:1039] (2/4) Epoch 9, batch 3250, loss[loss=0.21, simple_loss=0.2905, pruned_loss=0.06481, over 24426.00 frames. ], tot_loss[loss=0.2107, simple_loss=0.278, pruned_loss=0.07172, over 4703465.92 frames. ], batch size: 69, lr: 1.15e-02, grad_scale: 32.0 2023-09-29 08:20:43,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 08:20:44,844 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 08:20:44,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:20:44,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:20:45,079 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 08:20:49,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:20:52,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:21:01,866 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.90 vs. limit=22.5 2023-09-29 08:21:02,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:21:02,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 08:21:02,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:21:04,035 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:21:04,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:21:05,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:21:05,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 08:21:09,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:21:09,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:21:11,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:21:11,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:21:11,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:21:11,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:21:11,632 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.00 vs. limit=22.5 2023-09-29 08:21:15,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:21:15,866 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=305106.6666666667, ans=0.125 2023-09-29 08:21:17,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:21:18,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:21:18,822 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:21:20,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:21:21,006 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:21:21,021 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:21:26,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 08:21:26,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:21:26,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:21:28,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:21:28,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 08:21:34,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:21:39,512 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=305173.3333333333, ans=0.015 2023-09-29 08:21:39,604 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=305173.3333333333, ans=0.0 2023-09-29 08:21:41,784 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.00 vs. limit=15.0 2023-09-29 08:21:46,092 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:21:46,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:21:46,163 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 08:21:46,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:21:46,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 08:21:47,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:21:49,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 08:21:49,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 08:21:50,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:21:50,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:21:52,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:21:52,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 08:21:54,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:21:54,257 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=305240.0, ans=0.2 2023-09-29 08:21:57,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:21:58,892 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 2.015e+02 2.327e+02 2.716e+02 4.299e+02, threshold=4.655e+02, percent-clipped=0.0 2023-09-29 08:21:59,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:22:01,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 08:22:01,205 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:22:02,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:22:02,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 08:22:06,379 INFO [train.py:1039] (2/4) Epoch 9, batch 3300, loss[loss=0.2181, simple_loss=0.2838, pruned_loss=0.07624, over 23412.00 frames. ], tot_loss[loss=0.2113, simple_loss=0.279, pruned_loss=0.07182, over 4703409.75 frames. ], batch size: 106, lr: 1.15e-02, grad_scale: 32.0 2023-09-29 08:22:06,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:22:06,586 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 08:22:09,544 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 08:22:11,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 08:22:11,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:22:17,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:22:18,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:22:18,155 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:22:19,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 08:22:21,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 08:22:21,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:22:22,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:22:27,760 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 08:22:29,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:22:29,096 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:22:30,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:22:32,106 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 08:22:33,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:22:33,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 08:22:35,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:22:35,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:22:35,964 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 08:22:39,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:22:39,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 08:22:42,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:22:42,733 WARNING [train.py:1197] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 08:22:44,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 08:22:44,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:22:44,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:22:47,405 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 08:22:48,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 08:22:48,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:22:52,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 08:22:54,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 08:22:57,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 08:22:58,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:23:00,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:23:01,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:23:01,910 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:23:01,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 08:23:03,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:23:03,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:23:05,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:23:07,097 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 08:23:09,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 08:23:10,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 08:23:10,849 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:23:10,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:23:13,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:23:13,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:23:15,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 08:23:17,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:23:17,286 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 08:23:18,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:23:20,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 08:23:23,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 08:23:23,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:23:23,560 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:23:25,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:23:25,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 08:23:28,433 INFO [train.py:1039] (2/4) Epoch 9, batch 3350, loss[loss=0.21, simple_loss=0.2788, pruned_loss=0.07063, over 23496.00 frames. ], tot_loss[loss=0.2115, simple_loss=0.2793, pruned_loss=0.07185, over 4715547.65 frames. ], batch size: 134, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:23:28,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:23:30,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:23:30,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:23:33,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:23:34,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:23:36,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:23:38,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:23:38,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:23:40,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:23:42,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:23:44,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 08:23:45,646 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 08:23:45,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:23:49,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 08:23:49,201 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 08:23:49,340 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 08:23:49,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:23:52,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:23:52,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 08:23:52,941 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.78 vs. limit=15.0 2023-09-29 08:23:53,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:23:53,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:23:56,881 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:23:59,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:23:59,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:23:59,573 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=305706.6666666667, ans=0.5 2023-09-29 08:24:00,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:24:05,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:24:06,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:24:07,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:24:10,345 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=305773.3333333333, ans=0.125 2023-09-29 08:24:11,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:24:13,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:24:16,532 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:24:16,556 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:24:19,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:24:21,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 08:24:23,232 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 08:24:23,286 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 08:24:23,357 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:24:23,687 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=305840.0, ans=0.125 2023-09-29 08:24:24,880 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 08:24:25,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:24:27,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:24:33,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:24:34,843 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 08:24:34,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 08:24:37,017 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 08:24:37,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:24:43,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:24:43,619 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=305906.6666666667, ans=0.125 2023-09-29 08:24:44,734 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.686e+02 2.029e+02 2.244e+02 2.615e+02 3.935e+02, threshold=4.489e+02, percent-clipped=0.0 2023-09-29 08:24:44,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 08:24:45,237 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=305906.6666666667, ans=0.0 2023-09-29 08:24:46,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 08:24:46,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:24:49,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:24:50,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 08:24:50,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:24:51,466 INFO [train.py:1039] (2/4) Epoch 9, batch 3400, loss[loss=0.2341, simple_loss=0.3049, pruned_loss=0.0816, over 24335.00 frames. ], tot_loss[loss=0.213, simple_loss=0.2812, pruned_loss=0.07247, over 4707065.37 frames. ], batch size: 77, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:24:51,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 08:24:53,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:24:53,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:24:53,922 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:24:55,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:24:56,842 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 08:25:00,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 08:25:00,807 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 08:25:00,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:25:04,338 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=305973.3333333333, ans=0.125 2023-09-29 08:25:05,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:25:05,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 08:25:06,878 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:25:08,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 08:25:13,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:25:16,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 08:25:17,056 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=306040.0, ans=0.125 2023-09-29 08:25:22,761 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:25:24,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:25:25,028 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:25:25,240 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=306106.6666666667, ans=0.125 2023-09-29 08:25:26,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 08:25:34,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:25:36,880 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=306106.6666666667, ans=0.09899494936611666 2023-09-29 08:25:38,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 08:25:46,795 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:25:46,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:25:46,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 08:25:46,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:25:47,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:25:48,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:25:48,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:25:51,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:25:54,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 08:25:54,825 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:26:01,584 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:26:03,265 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 08:26:05,604 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=306240.0, ans=0.0 2023-09-29 08:26:09,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 08:26:13,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 08:26:15,218 INFO [train.py:1039] (2/4) Epoch 9, batch 3450, loss[loss=0.2057, simple_loss=0.2699, pruned_loss=0.07072, over 23413.00 frames. ], tot_loss[loss=0.2129, simple_loss=0.2815, pruned_loss=0.07221, over 4718288.52 frames. ], batch size: 134, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:26:16,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 08:26:18,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:26:19,096 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:26:19,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 08:26:20,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:26:23,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 08:26:24,323 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=306306.6666666667, ans=0.125 2023-09-29 08:26:27,231 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=306306.6666666667, ans=0.09899494936611666 2023-09-29 08:26:30,313 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 08:26:32,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:26:33,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:26:33,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:26:33,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:26:38,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:26:45,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 08:26:50,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 08:26:50,556 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 08:26:50,615 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:26:52,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:26:57,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 08:26:58,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 08:27:02,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:27:02,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:27:03,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 08:27:05,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:27:07,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 08:27:07,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:27:07,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:27:07,959 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=306506.6666666667, ans=0.125 2023-09-29 08:27:10,211 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=6.91 vs. limit=10.0 2023-09-29 08:27:10,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:27:13,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 08:27:18,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:27:22,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:27:22,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:27:23,531 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.07 vs. limit=15.0 2023-09-29 08:27:27,868 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:27:32,306 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.752e+02 2.104e+02 2.323e+02 2.853e+02 3.879e+02, threshold=4.645e+02, percent-clipped=0.0 2023-09-29 08:27:32,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:27:32,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:27:32,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:27:34,185 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:27:38,559 INFO [train.py:1039] (2/4) Epoch 9, batch 3500, loss[loss=0.2115, simple_loss=0.2711, pruned_loss=0.07596, over 23881.00 frames. ], tot_loss[loss=0.2114, simple_loss=0.2789, pruned_loss=0.07194, over 4692101.75 frames. ], batch size: 195, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:27:38,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:27:40,521 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=306640.0, ans=0.07 2023-09-29 08:27:43,894 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:27:44,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 08:27:47,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 08:27:52,277 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 08:27:52,605 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=306640.0, ans=0.125 2023-09-29 08:27:52,744 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=306640.0, ans=0.2 2023-09-29 08:27:53,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:27:53,897 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 08:27:57,228 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=306706.6666666667, ans=0.125 2023-09-29 08:27:58,615 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:28:00,061 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:28:02,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:28:02,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:28:03,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 08:28:03,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:28:03,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:28:05,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 08:28:08,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:28:09,875 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 08:28:11,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:28:14,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:28:16,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 08:28:16,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:28:20,770 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:28:21,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:28:21,704 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:28:23,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:28:25,184 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:28:25,398 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 08:28:28,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 08:28:28,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 08:28:28,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:28:29,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:28:29,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:28:30,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 08:28:35,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 08:28:35,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:28:38,671 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=306840.0, ans=0.125 2023-09-29 08:28:40,029 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:28:41,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 08:28:41,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 08:28:41,578 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:28:43,236 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=306906.6666666667, ans=0.04949747468305833 2023-09-29 08:28:44,771 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=306906.6666666667, ans=0.1 2023-09-29 08:28:45,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:28:45,990 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:28:47,553 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:28:49,221 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 08:28:50,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:28:52,821 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:28:54,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 08:28:56,947 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 08:29:00,274 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.30 vs. limit=15.0 2023-09-29 08:29:00,986 INFO [train.py:1039] (2/4) Epoch 9, batch 3550, loss[loss=0.2042, simple_loss=0.2902, pruned_loss=0.0591, over 24044.00 frames. ], tot_loss[loss=0.2104, simple_loss=0.2784, pruned_loss=0.07115, over 4711462.30 frames. ], batch size: 80, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:29:01,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:29:02,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:29:02,651 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:29:04,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:29:05,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:29:13,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:29:14,777 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=306973.3333333333, ans=0.1 2023-09-29 08:29:15,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 08:29:17,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:29:18,235 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.87 vs. limit=10.0 2023-09-29 08:29:19,085 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:29:20,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:29:22,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:29:22,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 08:29:25,430 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:29:25,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:29:26,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:29:27,474 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 08:29:27,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 08:29:34,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:29:34,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:29:35,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:29:35,901 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:29:35,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:29:36,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 08:29:37,351 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:29:39,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:29:39,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 08:29:47,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:29:47,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:29:47,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:29:50,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 08:29:50,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:29:51,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 08:29:53,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:29:56,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:29:56,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:29:56,766 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=307173.3333333333, ans=0.2 2023-09-29 08:30:01,807 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 08:30:01,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:30:08,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:30:10,128 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 08:30:10,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:30:15,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:30:15,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 08:30:17,032 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.963e+02 2.242e+02 2.644e+02 4.260e+02, threshold=4.484e+02, percent-clipped=0.0 2023-09-29 08:30:21,977 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 08:30:22,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:30:22,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:30:23,424 INFO [train.py:1039] (2/4) Epoch 9, batch 3600, loss[loss=0.2229, simple_loss=0.2804, pruned_loss=0.08271, over 22827.00 frames. ], tot_loss[loss=0.2099, simple_loss=0.2776, pruned_loss=0.07108, over 4699308.76 frames. ], batch size: 322, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:30:24,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:30:25,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:30:26,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:30:29,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:30:31,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:30:32,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:30:33,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=307306.6666666667, ans=0.0 2023-09-29 08:30:34,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:30:36,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:30:36,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 08:30:40,065 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 08:30:40,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:30:43,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:30:46,585 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:30:46,948 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=307373.3333333333, ans=0.125 2023-09-29 08:30:50,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 08:30:50,225 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:30:50,255 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 08:30:50,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=307373.3333333333, ans=0.1 2023-09-29 08:30:50,529 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=307373.3333333333, ans=0.125 2023-09-29 08:30:51,733 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:30:54,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:30:56,272 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 08:30:57,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:30:59,367 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:30:59,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:31:00,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 08:31:02,670 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=307440.0, ans=0.2 2023-09-29 08:31:08,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:31:10,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 08:31:11,156 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=307506.6666666667, ans=0.125 2023-09-29 08:31:12,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 08:31:16,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:31:21,396 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=307506.6666666667, ans=0.1 2023-09-29 08:31:22,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:31:25,768 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:31:31,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 08:31:31,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:31:31,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 08:31:33,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 08:31:35,215 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 08:31:35,387 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=307573.3333333333, ans=0.2 2023-09-29 08:31:36,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:31:38,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:31:38,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 08:31:39,967 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:31:40,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:31:40,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:31:41,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 08:31:41,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 08:31:45,461 INFO [train.py:1039] (2/4) Epoch 9, batch 3650, loss[loss=0.184, simple_loss=0.2531, pruned_loss=0.05749, over 21600.00 frames. ], tot_loss[loss=0.2098, simple_loss=0.2778, pruned_loss=0.07093, over 4689387.18 frames. ], batch size: 47, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:31:45,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:31:47,126 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 08:31:49,775 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=307640.0, ans=0.0 2023-09-29 08:31:51,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 08:31:52,523 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:31:56,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 08:31:57,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 08:32:00,841 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:32:02,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:32:02,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 08:32:06,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 08:32:06,926 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=307706.6666666667, ans=10.0 2023-09-29 08:32:08,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:32:09,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 08:32:09,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 08:32:09,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:32:11,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 08:32:11,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 08:32:11,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:32:11,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:32:14,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:32:17,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 08:32:17,767 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 08:32:19,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:32:21,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 08:32:24,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:32:25,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:32:30,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:32:33,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:32:33,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 08:32:35,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:32:35,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:32:36,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:32:41,573 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:32:43,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:32:43,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:32:44,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 08:32:46,302 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:32:46,396 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:32:54,749 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 08:33:00,360 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:33:00,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:33:01,736 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 2.073e+02 2.361e+02 2.805e+02 4.754e+02, threshold=4.723e+02, percent-clipped=2.0 2023-09-29 08:33:01,850 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 08:33:01,945 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:33:03,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:33:05,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:33:07,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 08:33:07,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:33:08,644 INFO [train.py:1039] (2/4) Epoch 9, batch 3700, loss[loss=0.2518, simple_loss=0.3037, pruned_loss=0.09998, over 22824.00 frames. ], tot_loss[loss=0.2095, simple_loss=0.2778, pruned_loss=0.07058, over 4698218.84 frames. ], batch size: 322, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:33:10,211 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 08:33:11,726 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:33:11,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:33:12,103 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:33:12,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 08:33:12,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:33:13,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 08:33:13,744 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 08:33:13,994 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=307973.3333333333, ans=0.125 2023-09-29 08:33:17,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 08:33:18,633 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=307973.3333333333, ans=0.0 2023-09-29 08:33:19,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:33:19,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:33:21,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:33:21,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:33:22,954 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 08:33:24,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:33:26,635 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 08:33:27,549 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=308040.0, ans=0.125 2023-09-29 08:33:29,029 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=308040.0, ans=0.125 2023-09-29 08:33:35,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:33:35,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 08:33:38,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 08:33:38,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 08:33:40,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:33:43,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:33:44,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 08:33:46,197 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:33:47,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:33:50,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:33:50,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 08:33:52,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 08:33:55,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:33:55,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 08:33:57,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:33:57,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 08:34:03,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:34:04,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:34:07,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:34:09,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 08:34:12,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:34:12,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 08:34:13,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:34:13,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:34:16,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:34:18,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 08:34:19,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 08:34:19,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:34:19,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:34:22,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:34:24,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:34:25,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:34:27,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:34:27,671 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=308240.0, ans=0.1 2023-09-29 08:34:28,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:34:30,338 INFO [train.py:1039] (2/4) Epoch 9, batch 3750, loss[loss=0.2014, simple_loss=0.2788, pruned_loss=0.06203, over 24681.00 frames. ], tot_loss[loss=0.2118, simple_loss=0.28, pruned_loss=0.07177, over 4690703.83 frames. ], batch size: 65, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:34:30,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 08:34:30,791 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=308306.6666666667, ans=0.125 2023-09-29 08:34:32,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 08:34:33,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 08:34:36,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 08:34:36,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:34:38,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:34:38,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:34:39,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:34:42,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:34:48,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:34:49,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 08:34:49,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:34:54,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:34:54,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 08:34:57,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:34:59,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:34:59,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:35:02,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 08:35:06,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 08:35:06,385 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=308440.0, ans=0.125 2023-09-29 08:35:07,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:35:08,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:35:11,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:35:17,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:35:18,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 08:35:23,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 08:35:27,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:35:30,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:35:30,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:35:33,035 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.72 vs. limit=10.0 2023-09-29 08:35:33,875 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=308506.6666666667, ans=0.125 2023-09-29 08:35:35,159 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 08:35:39,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 08:35:41,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 08:35:43,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 08:35:45,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:35:46,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 08:35:48,988 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.742e+02 2.167e+02 2.525e+02 3.168e+02 5.587e+02, threshold=5.051e+02, percent-clipped=3.0 2023-09-29 08:35:53,542 INFO [train.py:1039] (2/4) Epoch 9, batch 3800, loss[loss=0.2121, simple_loss=0.2999, pruned_loss=0.06219, over 24438.00 frames. ], tot_loss[loss=0.2126, simple_loss=0.28, pruned_loss=0.0726, over 4680687.58 frames. ], batch size: 69, lr: 1.14e-02, grad_scale: 16.0 2023-09-29 08:35:57,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:36:01,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:36:02,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 08:36:02,676 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 08:36:04,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:36:07,095 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:36:08,668 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 08:36:09,744 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.58 vs. limit=8.0 2023-09-29 08:36:10,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 08:36:10,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:36:11,779 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:36:13,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:36:14,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:36:14,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:36:15,583 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=308706.6666666667, ans=0.125 2023-09-29 08:36:16,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 08:36:19,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 08:36:21,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:36:24,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:36:27,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:36:27,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 08:36:30,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 08:36:30,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:36:34,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:36:34,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:36:39,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 08:36:39,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 08:36:41,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:36:47,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:36:51,397 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=308840.0, ans=0.125 2023-09-29 08:36:53,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:36:55,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 08:36:57,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 08:36:59,125 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:37:00,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:37:02,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:37:03,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 08:37:07,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 08:37:07,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 08:37:09,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:37:09,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:37:15,127 INFO [train.py:1039] (2/4) Epoch 9, batch 3850, loss[loss=0.2098, simple_loss=0.2912, pruned_loss=0.06418, over 24300.00 frames. ], tot_loss[loss=0.2115, simple_loss=0.2788, pruned_loss=0.07207, over 4686019.83 frames. ], batch size: 74, lr: 1.14e-02, grad_scale: 16.0 2023-09-29 08:37:15,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:37:16,842 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 08:37:21,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:37:22,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 08:37:24,357 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.06 vs. limit=15.0 2023-09-29 08:37:25,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 08:37:25,176 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:37:27,118 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=308973.3333333333, ans=0.0 2023-09-29 08:37:28,366 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 08:37:32,087 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:37:33,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 08:37:33,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 08:37:40,858 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:37:43,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:37:47,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:37:47,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:37:51,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:37:51,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:37:51,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:37:53,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:37:53,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:37:54,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:37:54,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:37:56,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:37:58,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 08:37:58,480 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 08:37:59,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:38:01,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:38:04,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:38:04,610 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=309173.3333333333, ans=0.125 2023-09-29 08:38:06,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:38:06,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 08:38:08,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 08:38:09,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:38:11,341 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 08:38:15,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 08:38:19,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:38:20,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:38:22,381 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=309240.0, ans=0.0 2023-09-29 08:38:25,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:38:25,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 08:38:28,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 08:38:30,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:38:31,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:38:33,503 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.661e+02 1.974e+02 2.231e+02 2.579e+02 4.458e+02, threshold=4.461e+02, percent-clipped=0.0 2023-09-29 08:38:33,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 08:38:33,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:38:35,205 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:38:35,329 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:38:35,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:38:35,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 08:38:38,206 INFO [train.py:1039] (2/4) Epoch 9, batch 3900, loss[loss=0.1948, simple_loss=0.2287, pruned_loss=0.08049, over 19371.00 frames. ], tot_loss[loss=0.2097, simple_loss=0.2772, pruned_loss=0.07111, over 4677082.12 frames. ], batch size: 388, lr: 1.14e-02, grad_scale: 16.0 2023-09-29 08:38:38,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:38:38,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 08:38:38,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:38:38,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:38:40,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:38:40,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:38:42,333 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:38:43,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:38:43,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:38:43,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:38:44,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 08:38:45,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:38:48,319 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:38:49,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 08:38:49,883 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=309306.6666666667, ans=0.035 2023-09-29 08:38:51,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:38:51,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:38:56,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 08:38:56,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:38:57,986 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 08:38:59,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 08:38:59,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:39:00,027 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=309373.3333333333, ans=0.125 2023-09-29 08:39:01,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 08:39:02,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:39:02,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 08:39:04,915 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.39 vs. limit=15.0 2023-09-29 08:39:06,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 08:39:06,604 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=309373.3333333333, ans=0.1 2023-09-29 08:39:09,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:39:10,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:39:10,876 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 08:39:12,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:39:17,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:39:19,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:39:19,919 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.20 vs. limit=15.0 2023-09-29 08:39:22,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:39:22,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:39:24,387 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:39:26,243 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=309440.0, ans=0.125 2023-09-29 08:39:30,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:39:30,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:39:34,483 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.67 vs. limit=6.0 2023-09-29 08:39:36,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 08:39:38,496 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:39:50,417 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:39:52,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:39:52,147 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 08:39:52,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 08:39:52,216 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:39:55,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 08:39:57,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:39:59,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 08:40:02,359 INFO [train.py:1039] (2/4) Epoch 9, batch 3950, loss[loss=0.2021, simple_loss=0.2851, pruned_loss=0.05952, over 24638.00 frames. ], tot_loss[loss=0.2096, simple_loss=0.277, pruned_loss=0.07113, over 4666938.82 frames. ], batch size: 68, lr: 1.14e-02, grad_scale: 16.0 2023-09-29 08:40:05,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:40:07,118 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 08:40:07,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:40:10,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:40:10,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:40:15,753 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 08:40:17,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 08:40:17,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 08:40:18,761 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 08:40:18,799 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:40:22,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:40:23,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 08:40:23,745 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:40:26,875 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 08:40:28,686 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=309706.6666666667, ans=0.95 2023-09-29 08:40:30,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:40:31,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 08:40:31,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 08:40:31,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:40:33,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:40:33,708 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=309773.3333333333, ans=0.125 2023-09-29 08:40:34,144 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.08 vs. limit=15.0 2023-09-29 08:40:38,737 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=309773.3333333333, ans=0.0 2023-09-29 08:40:40,162 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=309773.3333333333, ans=0.1 2023-09-29 08:40:44,907 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=309773.3333333333, ans=0.0 2023-09-29 08:40:46,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:40:46,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:40:52,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 08:40:59,771 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 08:40:59,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 08:40:59,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:41:01,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:41:11,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:41:11,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 08:41:11,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:41:11,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:41:13,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 08:41:16,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:41:17,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:41:19,317 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.683e+02 2.102e+02 2.264e+02 2.656e+02 4.963e+02, threshold=4.527e+02, percent-clipped=1.0 2023-09-29 08:41:22,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 08:41:23,811 INFO [train.py:1039] (2/4) Epoch 9, batch 4000, loss[loss=0.223, simple_loss=0.2989, pruned_loss=0.07357, over 24060.00 frames. ], tot_loss[loss=0.2097, simple_loss=0.278, pruned_loss=0.07072, over 4688047.56 frames. ], batch size: 80, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:41:24,869 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=8.23 vs. limit=15.0 2023-09-29 08:41:32,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:41:32,872 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.91 vs. limit=15.0 2023-09-29 08:41:41,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:41:45,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:41:47,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:41:47,910 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:41:47,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 08:41:49,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 08:41:49,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 08:41:49,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 08:41:49,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 08:41:51,484 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=310040.0, ans=0.0 2023-09-29 08:41:52,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:41:55,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:41:55,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:41:55,975 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:41:57,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:41:57,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 08:41:59,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 08:41:59,403 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 08:41:59,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:42:00,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:42:04,695 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 08:42:04,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 08:42:04,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:42:08,032 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=310106.6666666667, ans=0.2 2023-09-29 08:42:13,544 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 08:42:13,620 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:42:15,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:42:16,733 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 08:42:18,375 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:42:19,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 08:42:19,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:42:19,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:42:21,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:42:23,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:42:24,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 08:42:25,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:42:25,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 08:42:25,412 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=310173.3333333333, ans=0.2 2023-09-29 08:42:25,504 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=310173.3333333333, ans=0.125 2023-09-29 08:42:26,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:42:28,126 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 08:42:32,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 08:42:37,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 08:42:38,669 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=16.45 vs. limit=15.0 2023-09-29 08:42:39,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:42:40,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:42:40,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:42:42,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:42:47,925 INFO [train.py:1039] (2/4) Epoch 9, batch 4050, loss[loss=0.2062, simple_loss=0.2835, pruned_loss=0.06443, over 24459.00 frames. ], tot_loss[loss=0.2112, simple_loss=0.279, pruned_loss=0.07176, over 4683647.73 frames. ], batch size: 63, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:42:48,146 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:42:49,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 08:42:51,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 08:42:52,815 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 08:42:52,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:42:54,277 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:42:54,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 08:42:54,666 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=310306.6666666667, ans=0.1 2023-09-29 08:42:55,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:43:01,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:43:02,790 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=310373.3333333333, ans=0.125 2023-09-29 08:43:04,052 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:43:05,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 08:43:07,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:43:08,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:43:11,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:43:15,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 08:43:17,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 08:43:19,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 08:43:21,192 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 08:43:22,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:43:29,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 08:43:30,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:43:34,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:43:37,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:43:38,744 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:43:40,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:43:40,280 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=310506.6666666667, ans=0.0 2023-09-29 08:43:41,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:43:43,470 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=310506.6666666667, ans=0.1 2023-09-29 08:43:46,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 08:43:46,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 08:43:47,796 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:43:50,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 08:43:56,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:43:58,352 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=310573.3333333333, ans=0.125 2023-09-29 08:44:02,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 08:44:04,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:44:04,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 08:44:05,845 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.662e+02 2.021e+02 2.194e+02 2.667e+02 4.003e+02, threshold=4.389e+02, percent-clipped=0.0 2023-09-29 08:44:06,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 08:44:06,137 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 08:44:06,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:44:09,638 INFO [train.py:1039] (2/4) Epoch 9, batch 4100, loss[loss=0.21, simple_loss=0.2874, pruned_loss=0.0663, over 23978.00 frames. ], tot_loss[loss=0.2113, simple_loss=0.2796, pruned_loss=0.07149, over 4697996.22 frames. ], batch size: 80, lr: 1.14e-02, grad_scale: 16.0 2023-09-29 08:44:09,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:44:09,902 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:44:09,926 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:44:17,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 08:44:19,305 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 08:44:19,560 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=310640.0, ans=0.125 2023-09-29 08:44:22,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 08:44:23,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 08:44:23,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:44:25,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:44:25,226 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:44:25,247 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 08:44:26,744 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 08:44:30,355 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:44:31,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:44:31,866 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:44:33,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:44:35,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 08:44:36,802 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:44:36,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:44:36,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 08:44:38,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:44:38,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:44:38,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:44:38,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:44:38,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 08:44:41,551 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:44:45,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 08:44:46,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:44:46,906 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=310773.3333333333, ans=0.125 2023-09-29 08:44:49,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:44:49,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 08:44:53,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:44:53,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:44:53,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 08:44:56,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 08:44:58,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 08:44:58,566 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:45:00,103 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 08:45:01,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:45:01,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:45:01,851 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=310840.0, ans=0.125 2023-09-29 08:45:04,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:45:09,773 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:45:12,100 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.51 vs. limit=22.5 2023-09-29 08:45:14,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:45:14,385 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:45:17,681 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=310906.6666666667, ans=0.0 2023-09-29 08:45:22,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:45:22,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:45:28,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:45:31,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:45:33,057 INFO [train.py:1039] (2/4) Epoch 9, batch 4150, loss[loss=0.2304, simple_loss=0.3016, pruned_loss=0.07957, over 24319.00 frames. ], tot_loss[loss=0.2114, simple_loss=0.279, pruned_loss=0.07186, over 4693425.07 frames. ], batch size: 77, lr: 1.14e-02, grad_scale: 16.0 2023-09-29 08:45:33,277 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:45:34,741 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 08:45:34,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:45:34,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:45:38,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 08:45:38,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:45:40,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 08:45:40,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 08:45:41,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 08:45:41,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:45:43,472 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=310973.3333333333, ans=0.125 2023-09-29 08:45:46,469 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=310973.3333333333, ans=0.1 2023-09-29 08:45:47,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:45:47,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:45:48,142 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=311040.0, ans=0.125 2023-09-29 08:45:49,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=311040.0, ans=0.07 2023-09-29 08:45:52,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:45:53,084 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:45:54,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 08:45:57,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 08:45:57,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:45:58,950 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 08:46:04,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:46:08,702 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:46:10,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 08:46:12,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 08:46:12,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:46:12,829 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=311106.6666666667, ans=0.125 2023-09-29 08:46:13,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 08:46:13,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:46:13,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:46:17,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:46:17,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:46:21,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 08:46:26,253 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 08:46:26,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:46:27,903 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 08:46:28,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:46:30,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 08:46:31,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 08:46:34,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:46:36,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:46:38,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 08:46:38,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:46:38,295 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 08:46:39,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 08:46:41,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 08:46:43,675 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:46:43,681 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 08:46:43,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 08:46:43,839 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 08:46:43,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:46:44,150 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=311240.0, ans=0.2 2023-09-29 08:46:45,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 08:46:46,746 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:46:46,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:46:46,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 08:46:48,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 08:46:50,196 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 08:46:51,306 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.629e+02 2.017e+02 2.325e+02 2.759e+02 4.576e+02, threshold=4.650e+02, percent-clipped=1.0 2023-09-29 08:46:54,451 INFO [train.py:1039] (2/4) Epoch 9, batch 4200, loss[loss=0.2276, simple_loss=0.2825, pruned_loss=0.08637, over 23857.00 frames. ], tot_loss[loss=0.2103, simple_loss=0.2776, pruned_loss=0.07153, over 4694389.67 frames. ], batch size: 195, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 08:46:54,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:46:56,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 08:46:57,771 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 08:46:59,401 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:47:00,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 08:47:02,827 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:47:02,830 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:47:05,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 08:47:09,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 08:47:10,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:47:12,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:47:16,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:47:18,582 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=311373.3333333333, ans=0.1 2023-09-29 08:47:21,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 08:47:21,364 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:47:21,404 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:47:22,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 08:47:23,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:47:24,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:47:24,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:47:24,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 08:47:26,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 08:47:27,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 08:47:28,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:47:29,964 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=311440.0, ans=0.125 2023-09-29 08:47:32,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 08:47:32,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:47:36,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:47:37,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:47:40,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:47:40,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 08:47:40,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:47:42,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:47:47,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 08:47:47,248 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=311506.6666666667, ans=0.125 2023-09-29 08:47:50,035 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:47:53,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:47:58,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 08:48:00,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:48:04,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 08:48:05,396 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.whiten.whitening_limit, batch_count=311573.3333333333, ans=12.0 2023-09-29 08:48:06,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:48:07,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 08:48:14,659 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 08:48:18,103 INFO [train.py:1039] (2/4) Epoch 9, batch 4250, loss[loss=0.2149, simple_loss=0.2845, pruned_loss=0.07264, over 24346.00 frames. ], tot_loss[loss=0.2093, simple_loss=0.2766, pruned_loss=0.07096, over 4677003.64 frames. ], batch size: 61, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 08:48:20,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:48:20,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:48:22,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:48:28,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 08:48:28,937 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 08:48:28,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:48:31,410 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.91 vs. limit=12.0 2023-09-29 08:48:33,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:48:36,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:48:38,509 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=311706.6666666667, ans=0.2 2023-09-29 08:48:38,584 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=311706.6666666667, ans=0.0 2023-09-29 08:48:41,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:48:41,306 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:48:42,083 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=13.99 vs. limit=15.0 2023-09-29 08:48:44,325 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:48:44,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:48:45,103 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.31 vs. limit=15.0 2023-09-29 08:48:45,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:48:46,034 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:48:48,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:48:49,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:48:51,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:48:54,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 08:48:57,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 08:48:57,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:48:59,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:48:59,420 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=311773.3333333333, ans=0.125 2023-09-29 08:49:00,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:49:00,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 08:49:00,665 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:49:00,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:49:05,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 08:49:05,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:49:07,134 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=311840.0, ans=0.125 2023-09-29 08:49:11,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:49:12,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:49:13,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 08:49:13,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:49:14,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 08:49:16,106 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:49:19,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 08:49:20,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:49:20,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:49:20,841 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=311840.0, ans=0.0 2023-09-29 08:49:22,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 08:49:24,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 08:49:24,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:49:25,421 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 08:49:28,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:49:32,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:49:33,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:49:33,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:49:35,367 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:49:36,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:49:38,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:49:38,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 08:49:39,724 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 2.120e+02 2.435e+02 2.958e+02 4.592e+02, threshold=4.869e+02, percent-clipped=0.0 2023-09-29 08:49:40,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:49:40,679 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.27 vs. limit=12.0 2023-09-29 08:49:41,327 INFO [train.py:1039] (2/4) Epoch 9, batch 4300, loss[loss=0.1971, simple_loss=0.2663, pruned_loss=0.06393, over 23240.00 frames. ], tot_loss[loss=0.2102, simple_loss=0.2772, pruned_loss=0.07156, over 4675356.04 frames. ], batch size: 93, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 08:49:46,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:49:46,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:49:51,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:49:59,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:49:59,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 08:50:01,158 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:50:05,360 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:50:05,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 08:50:05,406 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 08:50:08,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 08:50:08,868 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=312040.0, ans=0.125 2023-09-29 08:50:10,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 08:50:13,212 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 08:50:14,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:50:14,630 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 08:50:16,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 08:50:17,910 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:50:20,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:50:20,978 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:50:22,486 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:50:24,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:50:25,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:50:25,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 08:50:25,707 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 08:50:28,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:50:29,279 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=312173.3333333333, ans=0.2 2023-09-29 08:50:30,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:50:30,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 08:50:30,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:50:30,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:50:31,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 08:50:31,961 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 08:50:32,710 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 08:50:34,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:50:36,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 08:50:36,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 08:50:40,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:50:42,271 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 08:50:42,652 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=312173.3333333333, ans=0.2 2023-09-29 08:50:43,857 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:50:46,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:50:46,815 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:50:48,357 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 08:50:49,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 08:50:49,820 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:50:49,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:50:49,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:50:51,428 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:50:54,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:50:57,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:50:57,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:50:58,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:51:02,100 INFO [train.py:1039] (2/4) Epoch 9, batch 4350, loss[loss=0.2235, simple_loss=0.2946, pruned_loss=0.07623, over 24078.00 frames. ], tot_loss[loss=0.2111, simple_loss=0.2783, pruned_loss=0.07194, over 4688275.30 frames. ], batch size: 80, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 08:51:03,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 08:51:03,748 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 08:51:09,235 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:51:12,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:51:12,992 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=312306.6666666667, ans=0.1 2023-09-29 08:51:15,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:51:15,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:51:20,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 08:51:25,161 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:51:26,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:51:26,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:51:29,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:51:30,358 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=312373.3333333333, ans=0.04949747468305833 2023-09-29 08:51:31,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:51:33,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 08:51:39,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 08:51:39,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:51:39,925 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=312440.0, ans=0.0 2023-09-29 08:51:41,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:51:45,016 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=312440.0, ans=0.125 2023-09-29 08:51:46,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:51:47,265 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.59 vs. limit=15.0 2023-09-29 08:51:49,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 08:51:51,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:51:54,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 08:51:59,119 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 08:52:00,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:52:00,688 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:52:02,256 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 08:52:02,363 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 08:52:02,371 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:52:03,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:52:03,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:52:05,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:52:07,010 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:52:07,074 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:52:08,752 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 08:52:08,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:52:08,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:52:10,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:52:10,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 08:52:11,780 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 08:52:11,787 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 08:52:11,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 08:52:15,787 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=2.914e-02 2023-09-29 08:52:17,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:52:17,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 08:52:17,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:52:19,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:52:20,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 08:52:20,955 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=312573.3333333333, ans=0.1 2023-09-29 08:52:22,659 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.612e+02 1.989e+02 2.158e+02 2.516e+02 4.089e+02, threshold=4.315e+02, percent-clipped=0.0 2023-09-29 08:52:22,926 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 08:52:22,937 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:52:24,244 INFO [train.py:1039] (2/4) Epoch 9, batch 4400, loss[loss=0.2136, simple_loss=0.295, pruned_loss=0.06612, over 24637.00 frames. ], tot_loss[loss=0.2117, simple_loss=0.2789, pruned_loss=0.07231, over 4686139.37 frames. ], batch size: 68, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 08:52:26,179 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=312640.0, ans=0.125 2023-09-29 08:52:29,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:52:29,034 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:52:30,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:52:33,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 08:52:33,773 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 08:52:33,836 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 08:52:33,868 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 08:52:35,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 08:52:35,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:52:38,521 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 08:52:41,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:52:42,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:52:44,169 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 08:52:47,938 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:52:47,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 08:52:48,031 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 08:52:51,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 08:52:53,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 08:52:53,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 08:52:53,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:52:53,352 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:52:53,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:52:54,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:52:57,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 08:52:57,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 08:52:57,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:53:02,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:53:02,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:53:02,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:53:03,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:53:03,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 08:53:05,422 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 08:53:05,629 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=312773.3333333333, ans=0.0 2023-09-29 08:53:08,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:53:15,040 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:53:17,991 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 08:53:21,295 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:53:25,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:53:27,063 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 08:53:29,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 08:53:29,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:53:30,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 08:53:30,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:53:30,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 08:53:33,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 08:53:38,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 08:53:39,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 08:53:39,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:53:39,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 08:53:40,659 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:53:43,682 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:53:45,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 08:53:46,749 INFO [train.py:1039] (2/4) Epoch 9, batch 4450, loss[loss=0.2271, simple_loss=0.2927, pruned_loss=0.08072, over 23503.00 frames. ], tot_loss[loss=0.2123, simple_loss=0.2795, pruned_loss=0.07258, over 4696761.52 frames. ], batch size: 93, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 08:53:50,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:53:53,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:53:53,277 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:53:54,077 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.19 vs. limit=6.0 2023-09-29 08:53:59,962 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:53:59,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:54:06,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:54:08,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:54:11,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:54:11,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:54:13,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 08:54:13,341 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:54:13,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:54:14,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:54:14,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 08:54:16,418 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 08:54:21,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:54:22,539 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:54:25,513 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:54:25,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:54:25,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:54:30,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 08:54:32,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 08:54:32,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 08:54:32,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:54:35,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:54:39,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 08:54:42,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 08:54:45,291 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:54:45,601 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=313173.3333333333, ans=0.125 2023-09-29 08:54:46,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 08:54:46,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:54:46,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:54:46,953 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:54:46,964 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:54:49,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:54:50,833 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=313240.0, ans=0.0 2023-09-29 08:54:52,099 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 08:54:52,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 08:54:55,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 08:54:55,286 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:54:56,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:54:58,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:54:58,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 08:55:00,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 08:55:03,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 08:55:05,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:55:06,704 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.657e+02 2.016e+02 2.496e+02 2.860e+02 5.111e+02, threshold=4.992e+02, percent-clipped=1.0 2023-09-29 08:55:08,160 INFO [train.py:1039] (2/4) Epoch 9, batch 4500, loss[loss=0.1985, simple_loss=0.2539, pruned_loss=0.07151, over 23341.00 frames. ], tot_loss[loss=0.2132, simple_loss=0.2807, pruned_loss=0.07292, over 4696060.96 frames. ], batch size: 285, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 08:55:09,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:55:12,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 08:55:12,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 08:55:13,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:55:20,740 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:55:20,813 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:55:20,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 08:55:22,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:55:22,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:55:23,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:55:28,610 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=313373.3333333333, ans=0.125 2023-09-29 08:55:34,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:55:36,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:55:36,713 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=313373.3333333333, ans=0.0 2023-09-29 08:55:38,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:55:40,312 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:55:41,716 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 08:55:49,851 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 08:55:53,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 08:55:59,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 08:56:02,655 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:56:02,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 08:56:02,828 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:56:04,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:56:05,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:56:05,968 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:56:08,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:56:09,028 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 08:56:09,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 08:56:09,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:56:16,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:56:16,105 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:56:17,766 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:56:20,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 08:56:20,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:56:22,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 08:56:25,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 08:56:25,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 08:56:30,943 INFO [train.py:1039] (2/4) Epoch 9, batch 4550, loss[loss=0.1989, simple_loss=0.2409, pruned_loss=0.07843, over 19341.00 frames. ], tot_loss[loss=0.2124, simple_loss=0.2796, pruned_loss=0.07264, over 4698891.40 frames. ], batch size: 388, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 08:56:31,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 08:56:34,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 08:56:35,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:56:36,091 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=313640.0, ans=0.125 2023-09-29 08:56:37,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:56:38,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:56:41,077 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.02 vs. limit=22.5 2023-09-29 08:56:41,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:56:45,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:56:47,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:56:48,683 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:56:50,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:56:50,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:56:52,485 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:56:52,548 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:56:59,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:57:00,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 08:57:00,700 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 08:57:02,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:57:04,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 08:57:06,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 08:57:06,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:57:10,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 08:57:12,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 08:57:14,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:57:16,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:57:16,074 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:57:19,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 08:57:22,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:57:24,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:57:25,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:57:26,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:57:27,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 08:57:29,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 08:57:29,034 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:57:31,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 08:57:34,469 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 08:57:34,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:57:36,170 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:57:36,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:57:36,501 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=313906.6666666667, ans=0.0 2023-09-29 08:57:37,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:57:37,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:57:39,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 08:57:39,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 08:57:42,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:57:42,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 08:57:44,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 08:57:44,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:57:44,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 08:57:47,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:57:47,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:57:51,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:57:51,415 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=313906.6666666667, ans=0.2 2023-09-29 08:57:52,527 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.671e+02 2.108e+02 2.512e+02 3.014e+02 4.343e+02, threshold=5.024e+02, percent-clipped=0.0 2023-09-29 08:57:52,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:57:52,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 08:57:54,158 INFO [train.py:1039] (2/4) Epoch 9, batch 4600, loss[loss=0.2248, simple_loss=0.2882, pruned_loss=0.08073, over 23809.00 frames. ], tot_loss[loss=0.2106, simple_loss=0.2771, pruned_loss=0.072, over 4694760.76 frames. ], batch size: 164, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 08:57:54,207 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:57:55,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:57:57,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:57:59,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:58:01,539 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=313973.3333333333, ans=0.125 2023-09-29 08:58:02,668 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:58:02,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 08:58:04,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:58:06,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 08:58:07,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:58:11,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 08:58:11,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:58:15,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:58:21,457 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=314040.0, ans=0.125 2023-09-29 08:58:22,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 08:58:24,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:58:26,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:58:29,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:58:29,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:58:29,549 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=314106.6666666667, ans=0.0 2023-09-29 08:58:31,470 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.82 vs. limit=15.0 2023-09-29 08:58:35,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 08:58:35,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 08:58:37,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:58:41,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:58:42,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:58:44,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:58:47,455 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 08:58:49,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 08:58:54,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:58:55,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:58:56,198 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 08:58:58,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:58:58,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 08:58:58,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:58:58,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 08:58:59,510 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:58:59,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:59:01,252 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:59:02,711 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:59:03,536 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.45 vs. limit=15.0 2023-09-29 08:59:04,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:59:04,446 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=314240.0, ans=0.0 2023-09-29 08:59:05,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 08:59:05,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 08:59:05,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 08:59:07,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:59:07,894 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=6.37 vs. limit=15.0 2023-09-29 08:59:09,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:59:09,716 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 08:59:11,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:59:11,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:59:11,973 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=314240.0, ans=0.125 2023-09-29 08:59:17,962 INFO [train.py:1039] (2/4) Epoch 9, batch 4650, loss[loss=0.2075, simple_loss=0.2851, pruned_loss=0.06495, over 24014.00 frames. ], tot_loss[loss=0.2091, simple_loss=0.2763, pruned_loss=0.0709, over 4705495.94 frames. ], batch size: 80, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 08:59:21,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 08:59:26,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:59:26,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:59:26,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:59:26,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:59:27,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:59:29,336 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:59:31,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 08:59:36,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:59:37,810 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 08:59:37,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:59:39,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 08:59:39,348 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:59:39,452 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 08:59:40,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 08:59:40,879 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:59:40,986 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:59:44,662 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 08:59:46,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:59:46,707 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 08:59:49,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:59:51,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 08:59:54,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:59:54,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:59:55,812 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 08:59:57,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:59:57,450 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=314440.0, ans=0.125 2023-09-29 09:00:02,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:00:05,514 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:00:06,316 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.99 vs. limit=22.5 2023-09-29 09:00:10,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:00:13,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:00:13,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:00:13,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 09:00:17,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 09:00:18,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 09:00:19,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 09:00:19,116 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 09:00:20,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:00:27,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:00:27,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:00:27,521 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 09:00:27,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:00:29,371 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=314573.3333333333, ans=0.1 2023-09-29 09:00:30,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:00:30,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 09:00:30,807 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:00:34,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:00:34,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:00:34,657 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=314573.3333333333, ans=0.2 2023-09-29 09:00:35,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:00:38,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:00:38,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:00:38,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 09:00:40,263 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.585e+02 2.149e+02 2.483e+02 2.991e+02 4.624e+02, threshold=4.965e+02, percent-clipped=0.0 2023-09-29 09:00:40,308 INFO [train.py:1039] (2/4) Epoch 9, batch 4700, loss[loss=0.2191, simple_loss=0.2813, pruned_loss=0.07843, over 23468.00 frames. ], tot_loss[loss=0.209, simple_loss=0.2767, pruned_loss=0.07063, over 4717901.41 frames. ], batch size: 134, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 09:00:40,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 09:00:41,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 09:00:42,113 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 09:00:50,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:00:52,017 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:00:52,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:00:53,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:00:55,980 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=314706.6666666667, ans=0.1 2023-09-29 09:00:57,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 09:01:01,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 09:01:03,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 09:01:05,984 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:01:06,139 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:01:06,817 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.05 vs. limit=22.5 2023-09-29 09:01:08,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:01:11,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:01:13,181 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=314773.3333333333, ans=0.0 2023-09-29 09:01:16,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 09:01:17,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 09:01:21,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:01:26,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 09:01:28,813 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:01:30,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:01:32,201 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=314840.0, ans=0.125 2023-09-29 09:01:35,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 09:01:35,184 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:01:35,502 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=314840.0, ans=0.125 2023-09-29 09:01:41,047 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:01:41,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 09:01:41,580 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:01:42,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:01:42,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:01:45,229 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=314906.6666666667, ans=0.125 2023-09-29 09:01:46,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:01:46,578 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:01:47,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 09:01:48,035 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 09:01:50,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:01:53,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:01:53,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:01:53,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 09:01:55,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:01:58,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 09:02:00,388 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=314906.6666666667, ans=0.125 2023-09-29 09:02:01,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:02:03,560 INFO [train.py:1039] (2/4) Epoch 9, batch 4750, loss[loss=0.2028, simple_loss=0.2747, pruned_loss=0.0654, over 23310.00 frames. ], tot_loss[loss=0.2111, simple_loss=0.2786, pruned_loss=0.07175, over 4695164.39 frames. ], batch size: 93, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 09:02:03,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:02:03,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=314973.3333333333, ans=0.125 2023-09-29 09:02:08,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:02:09,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:02:10,032 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=314973.3333333333, ans=0.0 2023-09-29 09:02:11,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 09:02:11,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:02:14,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 09:02:16,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:02:16,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:02:17,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:02:24,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 09:02:26,638 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=315040.0, ans=0.125 2023-09-29 09:02:27,991 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:02:30,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 09:02:31,260 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.02 vs. limit=10.0 2023-09-29 09:02:31,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:02:36,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:02:36,542 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:02:36,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:02:38,208 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 09:02:38,212 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 09:02:43,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 09:02:46,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:02:48,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:02:48,953 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=315106.6666666667, ans=0.125 2023-09-29 09:02:50,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 09:02:50,163 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 09:02:50,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:02:54,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:02:58,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:03:00,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 09:03:01,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 09:03:01,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:03:01,607 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:03:03,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:03:03,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 09:03:05,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 09:03:07,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 09:03:10,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:03:11,770 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:03:11,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 09:03:11,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:03:13,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:03:15,474 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:03:15,731 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=315240.0, ans=0.125 2023-09-29 09:03:17,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:03:17,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 09:03:17,479 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=315240.0, ans=0.0 2023-09-29 09:03:20,273 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:03:20,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 09:03:21,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 09:03:23,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 09:03:26,101 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 2.000e+02 2.225e+02 2.502e+02 3.899e+02, threshold=4.449e+02, percent-clipped=0.0 2023-09-29 09:03:26,146 INFO [train.py:1039] (2/4) Epoch 9, batch 4800, loss[loss=0.2012, simple_loss=0.2714, pruned_loss=0.06551, over 21837.00 frames. ], tot_loss[loss=0.2105, simple_loss=0.2786, pruned_loss=0.07121, over 4718132.00 frames. ], batch size: 48, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 09:03:26,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 09:03:26,331 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:03:26,476 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=315306.6666666667, ans=0.125 2023-09-29 09:03:27,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 09:03:28,199 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=315306.6666666667, ans=0.125 2023-09-29 09:03:29,556 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=315306.6666666667, ans=0.125 2023-09-29 09:03:35,945 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:03:36,046 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:03:37,802 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=315306.6666666667, ans=0.0 2023-09-29 09:03:38,369 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.96 vs. limit=22.5 2023-09-29 09:03:42,007 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:03:43,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:03:43,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:03:45,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 09:03:45,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:03:45,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:03:48,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 09:03:53,219 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:03:56,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:03:56,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:03:56,353 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=315373.3333333333, ans=0.1 2023-09-29 09:03:57,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:03:57,872 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 09:03:57,894 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:03:59,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:04:03,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:04:05,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:04:06,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:04:06,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 09:04:08,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 09:04:10,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:04:10,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 09:04:10,510 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=315440.0, ans=0.125 2023-09-29 09:04:12,222 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 09:04:12,353 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:04:12,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:04:13,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:04:13,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:04:13,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 09:04:15,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:04:17,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:04:19,205 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:04:24,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:04:25,829 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:04:29,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 09:04:29,075 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:04:30,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:04:30,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 09:04:32,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:04:33,871 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=315573.3333333333, ans=0.1 2023-09-29 09:04:35,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:04:36,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:04:36,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:04:36,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:04:36,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 09:04:37,591 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.01 vs. limit=6.0 2023-09-29 09:04:38,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 09:04:42,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:04:42,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:04:42,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:04:44,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 09:04:48,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 09:04:48,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:04:48,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:04:49,615 INFO [train.py:1039] (2/4) Epoch 9, batch 4850, loss[loss=0.219, simple_loss=0.2962, pruned_loss=0.07093, over 24394.00 frames. ], tot_loss[loss=0.2114, simple_loss=0.2795, pruned_loss=0.07161, over 4719579.15 frames. ], batch size: 77, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 09:04:49,777 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:04:49,779 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:04:52,714 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:04:56,022 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=315640.0, ans=0.125 2023-09-29 09:05:00,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 09:05:03,719 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:05:05,557 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=315706.6666666667, ans=0.2 2023-09-29 09:05:08,305 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:05:08,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 09:05:08,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:05:13,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:05:13,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 09:05:14,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:05:14,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 09:05:20,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:05:22,109 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 09:05:22,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 09:05:22,267 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 09:05:22,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 09:05:22,479 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=315773.3333333333, ans=0.125 2023-09-29 09:05:26,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:05:26,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:05:31,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:05:31,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 09:05:32,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 09:05:32,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 09:05:38,079 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.79 vs. limit=10.0 2023-09-29 09:05:40,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:05:41,942 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 09:05:42,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:05:42,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:05:43,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 09:05:45,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 09:05:45,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:05:46,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 09:05:48,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:05:50,056 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:05:50,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 09:05:59,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:06:04,562 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:06:04,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:06:09,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 09:06:09,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:06:12,329 INFO [train.py:1039] (2/4) Epoch 9, batch 4900, loss[loss=0.2124, simple_loss=0.2512, pruned_loss=0.08683, over 19139.00 frames. ], tot_loss[loss=0.2105, simple_loss=0.2784, pruned_loss=0.07132, over 4717526.97 frames. ], batch size: 388, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 09:06:13,850 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.814e+02 2.067e+02 2.446e+02 3.189e+02 7.103e+02, threshold=4.893e+02, percent-clipped=2.0 2023-09-29 09:06:14,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:06:15,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:06:15,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:06:19,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 09:06:22,940 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:06:24,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 09:06:28,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 09:06:31,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 09:06:31,541 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 09:06:33,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:06:33,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:06:33,089 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:06:33,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:06:34,548 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 09:06:36,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 09:06:37,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 09:06:39,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:06:41,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 09:06:44,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:06:44,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:06:46,078 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:06:46,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 09:06:47,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:06:49,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:06:50,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 09:06:50,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 09:06:51,122 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=316106.6666666667, ans=0.125 2023-09-29 09:06:54,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 09:06:54,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 09:06:56,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:06:56,261 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=316106.6666666667, ans=0.125 2023-09-29 09:06:57,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:06:57,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:06:57,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 09:06:57,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:06:59,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 09:07:04,587 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:07:06,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 09:07:07,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:07:10,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 09:07:11,263 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=316173.3333333333, ans=0.0 2023-09-29 09:07:12,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:07:13,007 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 09:07:14,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 09:07:19,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:07:22,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 09:07:23,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 09:07:23,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 09:07:23,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 09:07:25,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:07:30,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:07:30,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:07:30,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:07:30,635 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 09:07:32,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 09:07:32,376 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=316240.0, ans=0.125 2023-09-29 09:07:35,207 INFO [train.py:1039] (2/4) Epoch 9, batch 4950, loss[loss=0.2002, simple_loss=0.2616, pruned_loss=0.06941, over 23545.00 frames. ], tot_loss[loss=0.2094, simple_loss=0.2771, pruned_loss=0.0708, over 4719497.86 frames. ], batch size: 256, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 09:07:37,317 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:07:37,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 09:07:39,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 09:07:41,059 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 09:07:41,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 09:07:42,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 09:07:42,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:07:42,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:07:44,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:07:44,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:07:47,605 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:07:49,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:07:49,213 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:07:50,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:07:50,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:07:52,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:07:55,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 09:08:00,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:08:01,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 09:08:03,402 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:08:05,576 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:08:07,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:08:07,284 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 09:08:08,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 09:08:10,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:08:13,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:08:13,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 09:08:16,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:08:16,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:08:18,891 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 09:08:20,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:08:25,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 09:08:26,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:08:28,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:08:28,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:08:30,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 09:08:30,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:08:31,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 09:08:32,250 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=316506.6666666667, ans=0.125 2023-09-29 09:08:34,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:08:35,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:08:35,113 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:08:36,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:08:38,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 09:08:40,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:08:41,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:08:41,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 09:08:42,123 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=316573.3333333333, ans=0.1 2023-09-29 09:08:43,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:08:43,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 09:08:49,053 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:08:53,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 09:08:53,171 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 09:08:56,454 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=316573.3333333333, ans=0.125 2023-09-29 09:08:59,028 INFO [train.py:1039] (2/4) Epoch 9, batch 5000, loss[loss=0.2165, simple_loss=0.2757, pruned_loss=0.07867, over 23646.00 frames. ], tot_loss[loss=0.2091, simple_loss=0.2769, pruned_loss=0.07062, over 4719282.58 frames. ], batch size: 232, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 09:09:00,623 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.640e+02 2.030e+02 2.416e+02 2.948e+02 4.844e+02, threshold=4.831e+02, percent-clipped=0.0 2023-09-29 09:09:00,828 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:09:00,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:09:02,419 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 09:09:03,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 09:09:05,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:09:06,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 09:09:07,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:09:07,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 09:09:08,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 09:09:08,624 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:09:08,702 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:09:10,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 09:09:10,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:09:10,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:09:13,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 09:09:13,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 09:09:15,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:09:15,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 09:09:15,331 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 09:09:16,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:09:16,821 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 09:09:16,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 09:09:16,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 09:09:20,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 09:09:20,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:09:20,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:09:22,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 09:09:22,104 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:09:24,269 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:09:25,841 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:09:28,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 09:09:28,829 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 09:09:30,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:09:31,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:09:37,792 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 09:09:39,566 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:09:41,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:09:41,102 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:09:47,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 09:09:47,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:09:47,126 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:09:49,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:09:50,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 09:09:50,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:09:54,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:09:54,394 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=316840.0, ans=0.125 2023-09-29 09:09:55,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:09:56,368 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=316840.0, ans=0.2 2023-09-29 09:10:02,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 09:10:07,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:10:16,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:10:19,283 INFO [train.py:1039] (2/4) Epoch 9, batch 5050, loss[loss=0.2057, simple_loss=0.2909, pruned_loss=0.06024, over 24299.00 frames. ], tot_loss[loss=0.2094, simple_loss=0.2776, pruned_loss=0.07067, over 4731933.64 frames. ], batch size: 74, lr: 1.12e-02, grad_scale: 8.0 2023-09-29 09:10:19,350 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:10:19,362 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:10:19,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:10:19,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 09:10:19,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:10:19,561 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:10:24,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:10:24,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 09:10:27,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:10:30,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:10:31,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 09:10:32,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 09:10:33,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:10:34,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:10:37,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 09:10:39,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 09:10:41,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 09:10:49,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 09:10:50,559 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 09:10:50,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:10:52,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 09:10:52,127 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 09:10:52,328 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=317106.6666666667, ans=0.125 2023-09-29 09:10:53,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:10:53,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:10:55,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:10:55,262 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 09:10:55,396 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 09:10:56,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:10:57,447 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=317106.6666666667, ans=0.125 2023-09-29 09:10:59,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:11:01,029 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=317106.6666666667, ans=0.0 2023-09-29 09:11:02,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:11:02,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 09:11:03,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:11:05,471 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=317106.6666666667, ans=0.0 2023-09-29 09:11:06,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 09:11:07,791 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=317173.3333333333, ans=0.2 2023-09-29 09:11:09,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:11:09,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:11:11,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:11:12,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:11:14,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:11:17,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:11:19,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:11:19,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:11:19,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:11:20,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 09:11:20,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:11:21,068 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=317173.3333333333, ans=0.0 2023-09-29 09:11:23,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 09:11:27,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:11:27,184 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 09:11:27,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 09:11:28,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:11:30,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:11:31,669 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 09:11:35,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:11:35,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 09:11:35,503 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:11:40,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:11:40,432 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=317306.6666666667, ans=0.0 2023-09-29 09:11:41,503 INFO [train.py:1039] (2/4) Epoch 9, batch 5100, loss[loss=0.2209, simple_loss=0.2936, pruned_loss=0.07409, over 23439.00 frames. ], tot_loss[loss=0.2102, simple_loss=0.2787, pruned_loss=0.07089, over 4728449.14 frames. ], batch size: 93, lr: 1.12e-02, grad_scale: 8.0 2023-09-29 09:11:41,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:11:41,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 09:11:43,046 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.653e+02 2.032e+02 2.319e+02 2.659e+02 3.992e+02, threshold=4.638e+02, percent-clipped=0.0 2023-09-29 09:11:43,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 09:11:45,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:11:45,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:11:46,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:11:49,297 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 09:11:52,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:11:55,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 09:11:57,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 09:11:57,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:11:59,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:12:02,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:12:02,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 09:12:02,377 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 09:12:06,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:12:08,422 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:12:12,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:12:15,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 09:12:15,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:12:17,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:12:17,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 09:12:19,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:12:22,469 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:12:22,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 09:12:24,051 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 09:12:24,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:12:24,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 09:12:24,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 09:12:28,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:12:38,539 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:12:41,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 09:12:43,116 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 09:12:43,129 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 09:12:45,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 09:12:45,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:12:47,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 09:12:51,960 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 09:12:54,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 09:12:56,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 09:12:57,975 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:12:59,150 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 09:13:00,848 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 09:13:00,900 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 09:13:05,220 INFO [train.py:1039] (2/4) Epoch 9, batch 5150, loss[loss=0.2212, simple_loss=0.2967, pruned_loss=0.07288, over 24349.00 frames. ], tot_loss[loss=0.2118, simple_loss=0.2803, pruned_loss=0.07162, over 4727542.52 frames. ], batch size: 77, lr: 1.12e-02, grad_scale: 8.0 2023-09-29 09:13:06,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:13:06,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:13:06,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:13:08,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:13:08,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 09:13:10,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:13:10,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 09:13:10,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 09:13:12,217 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 09:13:13,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:13:13,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 09:13:13,880 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:13:14,123 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=317640.0, ans=0.0 2023-09-29 09:13:15,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 09:13:17,419 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:13:19,024 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:13:19,310 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=317640.0, ans=0.0 2023-09-29 09:13:22,474 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=317706.6666666667, ans=0.1 2023-09-29 09:13:23,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 09:13:23,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 09:13:25,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:13:26,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:13:29,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 09:13:29,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:13:29,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:13:29,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:13:29,580 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 09:13:31,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 09:13:32,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:13:32,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 09:13:34,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 09:13:37,311 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 09:13:37,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:13:43,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 09:13:45,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 09:13:48,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:13:54,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:13:56,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:14:01,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:14:01,176 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:14:04,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 09:14:07,485 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:14:08,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:14:09,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 09:14:13,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:14:13,906 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=317906.6666666667, ans=0.125 2023-09-29 09:14:15,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:14:16,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 09:14:20,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:14:20,550 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=317906.6666666667, ans=0.0 2023-09-29 09:14:23,733 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 09:14:25,451 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:14:25,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:14:25,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 09:14:25,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 09:14:26,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:14:26,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:14:27,264 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=317973.3333333333, ans=0.1 2023-09-29 09:14:28,340 INFO [train.py:1039] (2/4) Epoch 9, batch 5200, loss[loss=0.2076, simple_loss=0.2893, pruned_loss=0.06293, over 24563.00 frames. ], tot_loss[loss=0.2113, simple_loss=0.28, pruned_loss=0.07134, over 4729014.34 frames. ], batch size: 71, lr: 1.12e-02, grad_scale: 16.0 2023-09-29 09:14:29,847 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.704e+02 2.093e+02 2.397e+02 2.811e+02 4.237e+02, threshold=4.795e+02, percent-clipped=0.0 2023-09-29 09:14:31,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:14:32,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 09:14:36,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:14:40,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 09:14:40,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:14:40,670 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=317973.3333333333, ans=0.125 2023-09-29 09:14:41,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:14:44,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:14:45,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:14:45,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:14:48,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 09:14:49,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 09:14:51,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:14:53,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 09:14:55,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 09:14:55,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 09:14:56,661 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 09:14:56,753 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 09:14:59,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 09:15:01,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:15:01,162 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 09:15:01,172 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:15:02,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:15:02,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:15:04,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 09:15:05,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:15:08,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:15:11,555 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 09:15:11,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 09:15:11,946 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=318106.6666666667, ans=0.1 2023-09-29 09:15:12,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 09:15:17,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 09:15:17,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 09:15:25,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:15:25,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:15:27,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 09:15:27,908 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:15:27,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 09:15:27,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:15:29,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 09:15:33,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:15:33,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:15:38,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:15:40,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:15:40,005 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:15:42,428 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.78 vs. limit=15.0 2023-09-29 09:15:47,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:15:47,274 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 09:15:47,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:15:48,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:15:49,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:15:49,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 09:15:50,531 INFO [train.py:1039] (2/4) Epoch 9, batch 5250, loss[loss=0.2327, simple_loss=0.294, pruned_loss=0.08568, over 23362.00 frames. ], tot_loss[loss=0.2109, simple_loss=0.2794, pruned_loss=0.07116, over 4728103.02 frames. ], batch size: 119, lr: 1.12e-02, grad_scale: 8.0 2023-09-29 09:15:50,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 09:15:52,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:15:55,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:15:55,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:15:57,665 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:16:04,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:16:04,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 09:16:07,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:16:08,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 09:16:10,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 09:16:11,923 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:16:12,058 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:16:14,262 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=318373.3333333333, ans=0.125 2023-09-29 09:16:14,263 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=318373.3333333333, ans=0.125 2023-09-29 09:16:34,544 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.04 vs. limit=15.0 2023-09-29 09:16:35,624 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=318506.6666666667, ans=0.1 2023-09-29 09:16:50,118 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.78 vs. limit=22.5 2023-09-29 09:17:03,953 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=318640.0, ans=0.0 2023-09-29 09:17:04,980 INFO [train.py:1039] (2/4) Epoch 9, batch 5300, loss[loss=0.1871, simple_loss=0.2667, pruned_loss=0.05376, over 24669.00 frames. ], tot_loss[loss=0.2093, simple_loss=0.2775, pruned_loss=0.07052, over 4708010.95 frames. ], batch size: 65, lr: 1.12e-02, grad_scale: 8.0 2023-09-29 09:17:07,704 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.994e+02 2.180e+02 2.520e+02 5.243e+02, threshold=4.360e+02, percent-clipped=1.0 2023-09-29 09:17:20,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:17:20,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 09:17:20,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 09:17:20,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:17:20,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:17:20,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:17:20,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:17:20,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:17:21,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:17:21,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:17:21,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 09:17:21,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:17:21,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 09:17:22,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 09:17:22,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 09:17:22,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 09:17:22,414 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 09:17:22,556 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 09:17:22,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:17:23,246 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:17:23,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:17:23,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:17:23,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:17:24,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:17:24,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:17:24,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:17:24,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:17:24,820 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:17:24,827 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:17:24,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:17:24,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:17:25,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 09:17:25,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:17:26,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:17:26,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 09:17:26,430 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 09:17:26,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:17:26,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:17:26,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 09:17:26,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 09:17:27,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 09:17:27,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:17:27,909 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:17:28,074 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 09:17:28,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 09:17:28,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 09:17:28,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:17:28,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 09:17:28,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 09:17:29,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 09:17:29,498 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 09:17:39,050 INFO [train.py:1039] (2/4) Epoch 10, batch 0, loss[loss=0.2132, simple_loss=0.2805, pruned_loss=0.07295, over 23487.00 frames. ], tot_loss[loss=0.2132, simple_loss=0.2805, pruned_loss=0.07295, over 23487.00 frames. ], batch size: 119, lr: 1.07e-02, grad_scale: 16.0 2023-09-29 09:17:39,051 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-29 09:17:52,998 INFO [train.py:1071] (2/4) Epoch 10, validation: loss=0.3048, simple_loss=0.281, pruned_loss=0.1643, over 1125622.00 frames. 2023-09-29 09:17:52,999 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21046MB 2023-09-29 09:17:56,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 09:17:56,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:17:58,452 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:18:03,796 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:18:03,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:18:05,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:18:06,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 09:18:08,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 09:18:09,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:18:09,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:18:11,960 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=318786.6666666667, ans=0.1 2023-09-29 09:18:13,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:18:13,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:18:13,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 09:18:13,296 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:18:14,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 09:18:17,077 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:18:20,597 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=318786.6666666667, ans=0.125 2023-09-29 09:18:22,207 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=318786.6666666667, ans=0.125 2023-09-29 09:18:23,302 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 09:18:23,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:18:25,561 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 09:18:30,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:18:30,104 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:18:33,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:18:36,922 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:18:42,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:18:47,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 09:18:51,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 09:18:53,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:18:53,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:18:53,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:18:53,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:18:56,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 09:19:00,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:19:00,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:19:01,897 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=318986.6666666667, ans=0.125 2023-09-29 09:19:06,928 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:19:08,846 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 09:19:10,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:19:12,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:19:13,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:19:13,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 09:19:15,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 09:19:15,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:19:16,487 INFO [train.py:1039] (2/4) Epoch 10, batch 50, loss[loss=0.2142, simple_loss=0.2937, pruned_loss=0.06732, over 24642.00 frames. ], tot_loss[loss=0.2163, simple_loss=0.2826, pruned_loss=0.07503, over 1048822.46 frames. ], batch size: 68, lr: 1.07e-02, grad_scale: 16.0 2023-09-29 09:19:18,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:19:18,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:19:22,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:19:25,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 09:19:25,917 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:19:32,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 09:19:34,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 09:19:37,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 09:19:40,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:19:41,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:19:41,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:19:43,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:19:43,573 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=319120.0, ans=0.1 2023-09-29 09:19:44,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 09:19:44,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 09:19:44,939 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:19:52,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:19:53,989 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:19:54,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 09:19:55,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 09:19:57,070 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:19:57,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 09:19:57,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 09:19:58,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:20:00,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 09:20:06,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:20:07,023 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:20:08,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:20:08,741 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=319253.3333333333, ans=0.05 2023-09-29 09:20:10,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:20:10,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 09:20:14,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 09:20:14,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 09:20:15,200 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:20:16,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:20:16,424 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 09:20:17,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:20:18,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:20:18,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 09:20:19,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 09:20:20,981 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 09:20:22,300 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.676e+02 2.176e+02 2.452e+02 2.821e+02 3.971e+02, threshold=4.904e+02, percent-clipped=0.0 2023-09-29 09:20:22,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:20:22,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 09:20:24,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 09:20:24,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 09:20:24,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:20:25,646 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:20:27,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 09:20:27,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:20:30,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:20:30,714 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=319320.0, ans=0.2 2023-09-29 09:20:34,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:20:36,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:20:38,818 INFO [train.py:1039] (2/4) Epoch 10, batch 100, loss[loss=0.213, simple_loss=0.2782, pruned_loss=0.07389, over 23319.00 frames. ], tot_loss[loss=0.2139, simple_loss=0.2818, pruned_loss=0.07295, over 1873152.81 frames. ], batch size: 105, lr: 1.07e-02, grad_scale: 16.0 2023-09-29 09:20:38,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 09:20:38,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:20:45,522 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:20:45,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:20:45,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:20:45,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:20:45,658 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:20:46,012 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=319386.6666666667, ans=0.125 2023-09-29 09:20:47,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 09:20:49,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 09:20:49,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:20:49,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:20:49,395 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:20:55,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 09:20:55,603 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:20:56,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:20:57,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:20:58,488 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 09:20:58,880 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=319453.3333333333, ans=0.2 2023-09-29 09:21:01,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 09:21:04,756 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 09:21:04,794 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 09:21:06,376 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:21:06,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 09:21:10,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 09:21:12,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:21:13,166 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=319520.0, ans=0.125 2023-09-29 09:21:14,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:21:21,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:21:21,262 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 09:21:23,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 09:21:27,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:21:29,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:21:30,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:21:33,943 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:21:37,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:21:37,408 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.max_abs, batch_count=319586.6666666667, ans=10.0 2023-09-29 09:21:38,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:21:39,209 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.39 vs. limit=22.5 2023-09-29 09:21:41,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:21:41,862 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=319653.3333333333, ans=0.125 2023-09-29 09:21:43,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:21:43,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:21:43,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:21:45,498 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:21:45,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 09:21:45,628 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 09:21:47,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:21:48,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 09:21:48,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:21:48,593 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:21:48,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 09:21:48,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 09:21:50,127 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 09:21:50,136 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:21:51,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:21:51,646 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:21:51,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:21:53,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:21:55,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:21:59,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:21:59,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:22:00,934 INFO [train.py:1039] (2/4) Epoch 10, batch 150, loss[loss=0.2071, simple_loss=0.2855, pruned_loss=0.06429, over 24453.00 frames. ], tot_loss[loss=0.2141, simple_loss=0.2821, pruned_loss=0.07307, over 2500271.98 frames. ], batch size: 69, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:22:01,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:22:02,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:22:02,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=319720.0, ans=0.1 2023-09-29 09:22:04,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:22:07,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:22:08,686 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:22:11,024 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.11 vs. limit=15.0 2023-09-29 09:22:11,881 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=319720.0, ans=0.1 2023-09-29 09:22:13,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 09:22:13,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 09:22:13,316 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 09:22:16,256 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:22:16,264 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 09:22:16,511 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=319786.6666666667, ans=0.125 2023-09-29 09:22:17,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:22:18,023 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=319786.6666666667, ans=0.125 2023-09-29 09:22:18,112 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=319786.6666666667, ans=0.0 2023-09-29 09:22:19,353 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:22:19,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:22:19,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:22:19,502 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:22:21,653 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-29 09:22:23,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:22:25,131 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:22:29,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:22:32,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:22:35,841 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 09:22:38,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:22:38,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:22:38,959 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:22:41,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:22:44,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:22:44,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:22:46,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:22:47,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 09:22:53,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:22:55,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:22:55,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:22:55,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 09:22:59,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:23:00,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 09:23:02,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:23:04,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 09:23:05,951 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:23:08,105 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.959e+02 2.278e+02 2.639e+02 3.877e+02, threshold=4.556e+02, percent-clipped=0.0 2023-09-29 09:23:08,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 09:23:08,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 09:23:08,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:23:08,390 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 09:23:17,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:23:22,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:23:23,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:23:25,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 09:23:26,911 INFO [train.py:1039] (2/4) Epoch 10, batch 200, loss[loss=0.2158, simple_loss=0.2815, pruned_loss=0.07501, over 23199.00 frames. ], tot_loss[loss=0.2133, simple_loss=0.2822, pruned_loss=0.07222, over 3004284.41 frames. ], batch size: 119, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:23:27,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:23:27,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:23:30,196 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=320053.3333333333, ans=0.125 2023-09-29 09:23:31,421 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 09:23:32,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 09:23:34,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:23:35,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:23:39,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:23:41,111 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:23:41,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:24:00,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:24:02,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:24:03,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:24:05,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:24:05,471 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=320186.6666666667, ans=0.04949747468305833 2023-09-29 09:24:06,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 09:24:06,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 09:24:08,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:24:08,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 09:24:09,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:24:09,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:24:11,573 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=320186.6666666667, ans=0.0 2023-09-29 09:24:12,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 09:24:12,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 09:24:12,774 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:24:17,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:24:25,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:24:30,673 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:24:32,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:24:38,376 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:24:41,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 09:24:41,861 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=320320.0, ans=0.125 2023-09-29 09:24:42,974 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:24:42,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:24:42,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:24:43,138 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 09:24:44,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 09:24:44,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:24:44,716 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 09:24:46,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:24:47,877 INFO [train.py:1039] (2/4) Epoch 10, batch 250, loss[loss=0.1999, simple_loss=0.2762, pruned_loss=0.06182, over 23938.00 frames. ], tot_loss[loss=0.2119, simple_loss=0.2807, pruned_loss=0.07151, over 3394061.85 frames. ], batch size: 86, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:24:50,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:24:51,530 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:24:51,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:24:53,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:24:53,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:24:53,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=320386.6666666667, ans=0.0 2023-09-29 09:24:55,498 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=320386.6666666667, ans=0.5 2023-09-29 09:24:57,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:25:00,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:25:04,608 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=320453.3333333333, ans=0.0 2023-09-29 09:25:07,689 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=320453.3333333333, ans=0.125 2023-09-29 09:25:10,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:25:11,190 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.77 vs. limit=22.5 2023-09-29 09:25:13,457 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:25:14,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:25:18,408 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=320453.3333333333, ans=0.125 2023-09-29 09:25:18,438 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=320453.3333333333, ans=0.0 2023-09-29 09:25:21,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 09:25:21,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 09:25:22,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:25:22,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:25:22,985 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=320520.0, ans=0.0 2023-09-29 09:25:24,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 09:25:24,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:25:26,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:25:29,678 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:25:33,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 09:25:33,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:25:35,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:25:35,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 09:25:36,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:25:37,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 09:25:38,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 09:25:38,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:25:40,017 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:25:40,292 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=320586.6666666667, ans=0.0 2023-09-29 09:25:41,612 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:25:41,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:25:46,050 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:25:46,366 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=320586.6666666667, ans=0.125 2023-09-29 09:25:50,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:25:53,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:25:55,084 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 2.019e+02 2.235e+02 2.582e+02 3.547e+02, threshold=4.469e+02, percent-clipped=0.0 2023-09-29 09:25:58,658 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=320653.3333333333, ans=0.0 2023-09-29 09:25:59,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:26:00,077 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=320653.3333333333, ans=0.125 2023-09-29 09:26:02,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:26:07,103 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 09:26:07,257 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:26:08,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 09:26:08,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 09:26:10,173 INFO [train.py:1039] (2/4) Epoch 10, batch 300, loss[loss=0.2344, simple_loss=0.289, pruned_loss=0.0899, over 23862.00 frames. ], tot_loss[loss=0.2111, simple_loss=0.2796, pruned_loss=0.07131, over 3686668.71 frames. ], batch size: 164, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:26:10,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 09:26:11,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:26:11,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 09:26:16,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:26:18,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:26:21,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:26:21,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 09:26:23,060 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:26:24,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 09:26:24,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 09:26:24,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:26:27,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 09:26:32,404 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 09:26:34,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 09:26:38,164 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 09:26:38,230 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:26:41,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:26:41,984 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=320853.3333333333, ans=0.125 2023-09-29 09:26:42,434 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.84 vs. limit=6.0 2023-09-29 09:26:43,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:26:43,133 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 09:26:43,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:26:45,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:26:46,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:26:48,458 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:26:51,985 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=320853.3333333333, ans=0.125 2023-09-29 09:26:52,994 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 09:26:53,001 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 09:26:53,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:26:56,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:26:57,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 09:26:57,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:27:01,050 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:27:04,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:27:04,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 09:27:07,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:27:07,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 09:27:10,161 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=320920.0, ans=0.035 2023-09-29 09:27:11,399 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:27:11,608 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=320920.0, ans=0.2 2023-09-29 09:27:14,270 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:27:14,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 09:27:14,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 09:27:15,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:27:17,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 09:27:19,573 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=320986.6666666667, ans=0.0 2023-09-29 09:27:20,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:27:21,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:27:22,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:27:22,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:27:22,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:27:27,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:27:27,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 09:27:30,374 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:27:31,836 INFO [train.py:1039] (2/4) Epoch 10, batch 350, loss[loss=0.2097, simple_loss=0.2448, pruned_loss=0.08735, over 18917.00 frames. ], tot_loss[loss=0.2089, simple_loss=0.2766, pruned_loss=0.07057, over 3898610.88 frames. ], batch size: 388, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:27:35,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:27:39,202 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=321053.3333333333, ans=0.0 2023-09-29 09:27:40,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:27:40,893 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.68 vs. limit=6.0 2023-09-29 09:27:41,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:27:45,485 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 09:27:47,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:27:47,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 09:27:49,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:27:50,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 09:27:51,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:27:55,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 09:27:57,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:27:58,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:27:59,954 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.47 vs. limit=15.0 2023-09-29 09:28:00,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:28:02,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:28:02,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:28:02,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:28:02,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:28:02,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 09:28:02,385 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=321120.0, ans=0.125 2023-09-29 09:28:05,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:28:05,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:28:12,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:28:12,962 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 09:28:15,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:28:15,186 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:28:15,341 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=321186.6666666667, ans=0.1 2023-09-29 09:28:21,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 09:28:21,872 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:28:27,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:28:27,035 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:28:27,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:28:29,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 09:28:32,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:28:32,378 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 09:28:35,262 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 09:28:35,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:28:39,694 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 2.071e+02 2.540e+02 3.083e+02 5.946e+02, threshold=5.081e+02, percent-clipped=4.0 2023-09-29 09:28:39,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:28:39,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 09:28:43,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:28:44,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:28:46,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:28:47,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:28:47,934 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:28:50,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:28:50,468 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=321320.0, ans=0.1 2023-09-29 09:28:53,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:28:55,241 INFO [train.py:1039] (2/4) Epoch 10, batch 400, loss[loss=0.2226, simple_loss=0.2899, pruned_loss=0.07766, over 23411.00 frames. ], tot_loss[loss=0.2081, simple_loss=0.2762, pruned_loss=0.07, over 4084585.48 frames. ], batch size: 106, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:28:56,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:28:58,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 09:28:58,484 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:28:58,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:29:02,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:29:02,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:29:05,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:29:06,827 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.39 vs. limit=6.0 2023-09-29 09:29:07,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:29:08,746 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 09:29:10,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 09:29:10,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:29:11,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 09:29:11,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:29:16,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:29:16,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:29:16,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 09:29:18,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:29:18,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:29:18,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:29:19,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:29:22,552 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 09:29:22,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 09:29:29,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:29:30,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:29:31,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 09:29:33,081 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 09:29:34,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:29:36,672 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:29:37,805 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:29:41,506 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.77 vs. limit=15.0 2023-09-29 09:29:45,118 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 09:29:45,814 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.66 vs. limit=15.0 2023-09-29 09:29:48,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 09:29:49,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 09:29:52,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:29:54,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:29:55,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 09:29:58,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:30:02,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 09:30:04,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:30:07,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:30:08,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 09:30:10,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 09:30:11,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 09:30:15,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 09:30:15,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:30:16,743 INFO [train.py:1039] (2/4) Epoch 10, batch 450, loss[loss=0.212, simple_loss=0.2726, pruned_loss=0.07568, over 23789.00 frames. ], tot_loss[loss=0.2095, simple_loss=0.2776, pruned_loss=0.07069, over 4233826.26 frames. ], batch size: 212, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:30:16,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 09:30:18,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 09:30:18,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:30:20,516 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 09:30:22,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 09:30:22,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:30:23,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:30:25,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:30:25,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 09:30:25,344 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=321720.0, ans=0.1 2023-09-29 09:30:26,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:30:26,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 09:30:29,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:30:30,223 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.77 vs. limit=15.0 2023-09-29 09:30:37,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:30:38,406 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:30:41,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 09:30:41,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 09:30:45,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 09:30:48,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:30:51,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:30:56,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:30:56,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:30:59,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 09:30:59,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 09:31:00,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 09:31:02,136 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:31:02,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:31:03,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 09:31:05,369 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 09:31:05,383 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 09:31:05,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:31:06,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:31:09,039 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 09:31:12,815 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 09:31:14,206 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:31:14,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 09:31:14,550 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=321920.0, ans=0.07 2023-09-29 09:31:15,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 09:31:16,007 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=321920.0, ans=0.125 2023-09-29 09:31:17,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:31:20,444 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 09:31:21,797 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 09:31:21,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 09:31:25,376 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 1.898e+02 2.147e+02 2.621e+02 3.477e+02, threshold=4.294e+02, percent-clipped=0.0 2023-09-29 09:31:25,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:31:27,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 09:31:29,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 09:31:30,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:31:35,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:31:36,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:31:38,102 INFO [train.py:1039] (2/4) Epoch 10, batch 500, loss[loss=0.2049, simple_loss=0.277, pruned_loss=0.06644, over 24416.00 frames. ], tot_loss[loss=0.2091, simple_loss=0.2777, pruned_loss=0.07024, over 4347518.91 frames. ], batch size: 58, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:31:39,626 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:31:39,675 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 09:31:43,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:31:45,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:31:46,848 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:31:46,872 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 09:31:48,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 09:31:48,415 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:31:51,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 09:31:54,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 09:31:57,412 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 09:32:00,902 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:32:00,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:32:02,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:32:06,561 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=322120.0, ans=0.0 2023-09-29 09:32:11,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:32:11,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 09:32:12,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 09:32:12,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:32:12,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 09:32:12,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 09:32:17,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:32:19,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:32:19,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:32:19,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:32:21,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 09:32:22,320 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.32 vs. limit=15.0 2023-09-29 09:32:24,637 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 09:32:26,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:32:27,393 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.59 vs. limit=15.0 2023-09-29 09:32:27,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:32:29,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:32:29,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:32:30,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 09:32:33,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 09:32:33,415 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=322253.3333333333, ans=0.125 2023-09-29 09:32:34,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:32:37,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:32:41,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:32:44,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:32:48,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:32:51,340 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=322320.0, ans=0.0 2023-09-29 09:32:52,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 09:32:52,547 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:32:52,565 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:32:52,835 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=322320.0, ans=0.0 2023-09-29 09:32:56,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 09:32:56,293 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 09:32:59,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:33:00,710 INFO [train.py:1039] (2/4) Epoch 10, batch 550, loss[loss=0.236, simple_loss=0.2955, pruned_loss=0.08827, over 23499.00 frames. ], tot_loss[loss=0.2107, simple_loss=0.2793, pruned_loss=0.07103, over 4435080.79 frames. ], batch size: 256, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:33:04,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 09:33:06,185 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=322386.6666666667, ans=0.1 2023-09-29 09:33:07,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 09:33:07,537 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:33:07,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 09:33:08,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:33:09,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:33:09,166 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=322386.6666666667, ans=0.0 2023-09-29 09:33:10,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:33:12,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:33:12,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:33:14,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:33:15,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:33:17,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 09:33:17,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:33:20,586 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:33:20,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:33:23,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:33:25,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:33:27,808 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=322453.3333333333, ans=0.125 2023-09-29 09:33:29,127 WARNING [train.py:1197] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 09:33:31,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 09:33:32,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:33:34,530 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=322520.0, ans=0.125 2023-09-29 09:33:38,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:33:38,122 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:33:39,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 09:33:43,993 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:33:44,002 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 09:33:46,014 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:33:47,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 09:33:49,295 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:33:49,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 09:33:49,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 09:33:51,329 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=322586.6666666667, ans=0.0 2023-09-29 09:33:52,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:33:52,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 09:33:54,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 09:33:55,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:33:55,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:33:57,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:33:57,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:34:00,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:34:00,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:34:02,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:34:04,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:34:07,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 09:34:08,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 09:34:10,438 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 2.009e+02 2.272e+02 2.657e+02 5.113e+02, threshold=4.543e+02, percent-clipped=1.0 2023-09-29 09:34:10,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:34:12,125 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 09:34:12,197 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:34:14,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 09:34:14,521 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 09:34:21,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 09:34:22,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 09:34:24,162 INFO [train.py:1039] (2/4) Epoch 10, batch 600, loss[loss=0.2065, simple_loss=0.2658, pruned_loss=0.07363, over 23803.00 frames. ], tot_loss[loss=0.211, simple_loss=0.2795, pruned_loss=0.07124, over 4498592.77 frames. ], batch size: 232, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:34:24,272 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:34:24,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 09:34:24,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:34:31,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:34:35,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 09:34:37,628 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 09:34:39,231 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 09:34:40,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:34:41,159 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=322786.6666666667, ans=0.125 2023-09-29 09:34:42,500 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:34:46,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 09:34:46,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:34:46,474 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=322786.6666666667, ans=0.125 2023-09-29 09:34:52,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 09:34:55,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:34:55,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:34:55,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:35:04,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:35:04,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:35:04,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:35:12,983 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:35:16,903 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:35:16,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:35:16,922 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:35:17,178 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=322920.0, ans=0.125 2023-09-29 09:35:24,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 09:35:28,196 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=12.23 vs. limit=15.0 2023-09-29 09:35:28,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 09:35:29,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:35:33,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 09:35:33,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:35:36,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 09:35:38,054 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:35:38,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 09:35:44,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 09:35:46,800 INFO [train.py:1039] (2/4) Epoch 10, batch 650, loss[loss=0.1979, simple_loss=0.2781, pruned_loss=0.05883, over 24486.00 frames. ], tot_loss[loss=0.2106, simple_loss=0.2789, pruned_loss=0.07111, over 4541613.25 frames. ], batch size: 66, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:35:46,975 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 09:35:50,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:35:51,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:35:55,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:35:56,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 09:35:56,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:35:57,572 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.82 vs. limit=15.0 2023-09-29 09:35:59,982 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=323053.3333333333, ans=0.0 2023-09-29 09:36:03,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:36:03,424 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:36:06,659 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:36:11,107 WARNING [train.py:1197] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 09:36:14,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:36:14,145 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:36:19,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:36:19,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 09:36:23,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:36:23,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:36:23,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 09:36:24,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:36:25,001 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:36:28,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 09:36:28,090 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 09:36:28,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:36:28,125 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:36:31,962 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=323186.6666666667, ans=0.125 2023-09-29 09:36:33,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:36:33,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:36:35,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:36:35,918 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=323253.3333333333, ans=0.0 2023-09-29 09:36:36,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 09:36:37,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 09:36:38,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:36:38,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:36:38,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 09:36:38,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:36:40,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 09:36:41,976 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 09:36:44,168 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=6.14 vs. limit=6.0 2023-09-29 09:36:44,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 09:36:44,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:36:44,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:36:44,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:36:45,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:36:46,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:36:53,991 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:36:55,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:36:56,669 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.920e+02 2.159e+02 2.398e+02 3.616e+02, threshold=4.317e+02, percent-clipped=0.0 2023-09-29 09:36:56,854 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:36:59,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:36:59,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 09:37:01,230 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:37:09,371 INFO [train.py:1039] (2/4) Epoch 10, batch 700, loss[loss=0.2002, simple_loss=0.2864, pruned_loss=0.05698, over 24648.00 frames. ], tot_loss[loss=0.2086, simple_loss=0.2769, pruned_loss=0.07012, over 4571834.96 frames. ], batch size: 68, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:37:09,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 09:37:09,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:37:09,510 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:37:10,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:37:14,734 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 09:37:16,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 09:37:19,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 09:37:19,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:37:19,728 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=323386.6666666667, ans=0.0 2023-09-29 09:37:20,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:37:22,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 09:37:22,752 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=323386.6666666667, ans=0.07 2023-09-29 09:37:25,222 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.66 vs. limit=15.0 2023-09-29 09:37:29,521 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:37:31,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:37:32,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:37:32,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 09:37:34,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:37:36,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:37:39,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 09:37:39,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:37:42,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 09:37:45,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 09:37:49,424 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 09:37:50,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:37:52,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 09:37:57,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:37:57,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 09:38:04,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:38:04,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:38:05,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 09:38:10,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:38:10,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:38:15,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:38:15,941 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:38:18,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:38:18,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 09:38:22,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 09:38:22,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 09:38:25,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:38:28,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:38:30,501 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:38:32,729 INFO [train.py:1039] (2/4) Epoch 10, batch 750, loss[loss=0.2143, simple_loss=0.2847, pruned_loss=0.07194, over 23640.00 frames. ], tot_loss[loss=0.2077, simple_loss=0.2763, pruned_loss=0.06952, over 4614987.47 frames. ], batch size: 85, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:38:32,823 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:38:32,832 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 09:38:33,164 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=323720.0, ans=0.125 2023-09-29 09:38:36,091 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=323720.0, ans=0.125 2023-09-29 09:38:37,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 09:38:37,487 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 09:38:38,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 09:38:38,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 09:38:39,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 09:38:40,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:38:41,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 09:38:42,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:38:43,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 09:38:43,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:38:45,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:38:46,175 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.69 vs. limit=10.0 2023-09-29 09:38:46,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 09:38:46,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:38:50,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:38:50,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 09:38:53,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:38:53,911 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=323786.6666666667, ans=0.125 2023-09-29 09:38:55,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:38:57,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:38:57,123 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 09:38:58,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:38:58,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:39:00,341 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:39:03,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 09:39:05,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 09:39:05,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:39:07,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 09:39:07,169 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 09:39:07,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 09:39:07,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:39:07,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 09:39:10,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:39:16,720 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.54 vs. limit=12.0 2023-09-29 09:39:17,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 09:39:17,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:39:17,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 09:39:21,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:39:21,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:39:22,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 09:39:22,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 09:39:24,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 09:39:25,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:39:29,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:39:29,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 09:39:30,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:39:35,770 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.39 vs. limit=15.0 2023-09-29 09:39:36,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:39:38,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:39:38,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:39:39,115 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=323986.6666666667, ans=0.0 2023-09-29 09:39:41,887 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.983e+02 2.343e+02 2.893e+02 4.717e+02, threshold=4.686e+02, percent-clipped=1.0 2023-09-29 09:39:41,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:39:43,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 09:39:43,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:39:43,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:39:49,789 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:39:49,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:39:51,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:39:53,007 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 09:39:54,407 INFO [train.py:1039] (2/4) Epoch 10, batch 800, loss[loss=0.1926, simple_loss=0.2783, pruned_loss=0.05347, over 24418.00 frames. ], tot_loss[loss=0.2082, simple_loss=0.2772, pruned_loss=0.06966, over 4632214.48 frames. ], batch size: 69, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:40:00,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:40:00,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:40:02,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:40:02,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:40:04,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:40:04,485 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:40:06,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:40:12,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:40:12,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 09:40:16,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 09:40:16,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:40:19,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:40:19,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:40:19,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:40:19,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 09:40:21,143 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:40:21,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 09:40:23,038 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=324120.0, ans=0.125 2023-09-29 09:40:24,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:40:25,745 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:40:27,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:40:27,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:40:30,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:40:30,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:40:35,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:40:35,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 09:40:35,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 09:40:35,723 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=324186.6666666667, ans=0.0 2023-09-29 09:40:37,051 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 09:40:38,459 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 09:40:38,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 09:40:38,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:40:38,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:40:38,864 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=324186.6666666667, ans=0.1 2023-09-29 09:40:40,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:40:41,373 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.49 vs. limit=22.5 2023-09-29 09:40:46,249 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 09:40:46,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 09:40:49,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:40:49,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 09:40:51,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:40:54,534 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:40:56,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 09:40:57,122 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=324253.3333333333, ans=0.04949747468305833 2023-09-29 09:40:57,962 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:41:01,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 09:41:09,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 09:41:11,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:41:12,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 09:41:13,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:41:14,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:41:16,491 INFO [train.py:1039] (2/4) Epoch 10, batch 850, loss[loss=0.2249, simple_loss=0.2779, pruned_loss=0.08593, over 23867.00 frames. ], tot_loss[loss=0.2101, simple_loss=0.2786, pruned_loss=0.07084, over 4643538.39 frames. ], batch size: 179, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:41:16,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 09:41:16,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:41:16,753 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=324386.6666666667, ans=0.0 2023-09-29 09:41:19,015 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:41:20,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:41:21,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:41:23,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 09:41:25,004 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:41:26,624 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 09:41:28,004 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 09:41:28,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 09:41:28,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 09:41:28,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:41:30,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:41:31,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:41:31,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:41:37,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:41:37,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:41:37,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 09:41:42,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 09:41:45,590 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:41:47,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 09:41:48,843 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=324520.0, ans=0.0 2023-09-29 09:41:53,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 09:41:55,710 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 09:41:57,559 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 09:41:57,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:41:57,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:41:57,601 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 09:42:00,548 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:42:03,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:42:03,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 09:42:05,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:42:05,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:42:06,554 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 09:42:06,588 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 09:42:06,976 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=324586.6666666667, ans=0.125 2023-09-29 09:42:08,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:42:09,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 09:42:10,073 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=324586.6666666667, ans=0.125 2023-09-29 09:42:11,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 09:42:13,720 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=324586.6666666667, ans=0.125 2023-09-29 09:42:14,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:42:14,782 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:42:16,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:42:16,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:42:17,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:42:19,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:42:22,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:42:22,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 09:42:24,597 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.670e+02 1.918e+02 2.187e+02 2.530e+02 4.309e+02, threshold=4.375e+02, percent-clipped=0.0 2023-09-29 09:42:24,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:42:26,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 09:42:33,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 09:42:35,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:42:35,530 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=324653.3333333333, ans=0.125 2023-09-29 09:42:36,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 09:42:36,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:42:36,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:42:38,085 INFO [train.py:1039] (2/4) Epoch 10, batch 900, loss[loss=0.207, simple_loss=0.2861, pruned_loss=0.06401, over 24452.00 frames. ], tot_loss[loss=0.2094, simple_loss=0.2787, pruned_loss=0.07008, over 4674956.91 frames. ], batch size: 69, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:42:41,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 09:42:44,612 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=324720.0, ans=0.2 2023-09-29 09:42:45,222 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.65 vs. limit=15.0 2023-09-29 09:42:45,868 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:42:47,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:42:47,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 09:42:49,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 09:42:51,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 09:42:52,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 09:42:54,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:42:54,527 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:42:54,604 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 09:42:54,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:43:08,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:43:09,102 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.14 vs. limit=15.0 2023-09-29 09:43:09,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:43:09,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 09:43:11,536 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=324853.3333333333, ans=0.125 2023-09-29 09:43:14,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:43:18,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 09:43:20,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:43:25,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 09:43:25,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 09:43:25,943 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=324920.0, ans=0.125 2023-09-29 09:43:26,956 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 09:43:27,073 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 09:43:27,236 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=324920.0, ans=0.025 2023-09-29 09:43:35,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 09:43:35,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:43:36,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 09:43:44,072 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:43:44,088 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:43:45,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 09:43:47,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:43:48,715 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 09:43:48,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:43:50,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:43:50,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:43:51,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:43:55,235 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 09:43:56,503 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 09:43:58,070 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 09:43:59,822 INFO [train.py:1039] (2/4) Epoch 10, batch 950, loss[loss=0.2064, simple_loss=0.2559, pruned_loss=0.07852, over 22612.00 frames. ], tot_loss[loss=0.2084, simple_loss=0.278, pruned_loss=0.06938, over 4695457.85 frames. ], batch size: 322, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:43:59,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 09:44:02,983 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:44:08,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 09:44:12,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:44:14,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:44:16,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:44:16,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 09:44:18,310 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 09:44:21,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:44:21,509 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:44:22,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:44:24,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:44:24,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 09:44:24,628 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 09:44:27,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:44:28,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 09:44:29,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:44:32,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:44:32,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:44:32,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:44:32,525 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=325186.6666666667, ans=0.125 2023-09-29 09:44:34,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 09:44:37,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 09:44:37,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:44:40,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 09:44:45,705 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:44:45,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:44:49,400 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 09:44:50,989 WARNING [train.py:1197] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 09:44:50,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 09:44:52,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:44:53,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:44:53,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 09:44:57,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 09:44:58,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:45:00,447 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:45:01,741 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:45:01,779 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 09:45:02,462 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.92 vs. limit=10.0 2023-09-29 09:45:03,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:45:03,245 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:45:03,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 09:45:07,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:45:09,618 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.907e+02 2.117e+02 2.443e+02 3.103e+02, threshold=4.235e+02, percent-clipped=0.0 2023-09-29 09:45:09,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:45:15,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:45:17,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 09:45:17,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 09:45:22,954 INFO [train.py:1039] (2/4) Epoch 10, batch 1000, loss[loss=0.2384, simple_loss=0.3048, pruned_loss=0.08597, over 24013.00 frames. ], tot_loss[loss=0.208, simple_loss=0.2771, pruned_loss=0.06948, over 4688880.66 frames. ], batch size: 86, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:45:23,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:45:27,458 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 09:45:27,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:45:35,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:45:35,188 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 09:45:35,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 09:45:35,336 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=325386.6666666667, ans=0.1 2023-09-29 09:45:35,478 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=325386.6666666667, ans=0.1 2023-09-29 09:45:37,050 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=325453.3333333333, ans=0.07 2023-09-29 09:45:39,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:45:39,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:45:43,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:45:46,973 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 09:45:47,369 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=325453.3333333333, ans=0.125 2023-09-29 09:45:50,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 09:45:50,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 09:45:51,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:45:53,837 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 09:45:54,023 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 09:45:54,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 09:45:56,843 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.61 vs. limit=10.0 2023-09-29 09:45:57,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:45:57,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:45:58,320 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.48 vs. limit=10.0 2023-09-29 09:45:59,575 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten.whitening_limit, batch_count=325520.0, ans=15.0 2023-09-29 09:46:01,332 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.01 vs. limit=15.0 2023-09-29 09:46:02,287 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=325520.0, ans=0.0 2023-09-29 09:46:06,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:46:06,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:46:08,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:46:08,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:46:08,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 09:46:09,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:46:11,217 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:46:11,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:46:11,586 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer_ff3.min_abs, batch_count=325586.6666666667, ans=0.2 2023-09-29 09:46:12,811 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 09:46:14,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 09:46:16,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 09:46:18,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 09:46:18,535 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=325586.6666666667, ans=0.0 2023-09-29 09:46:18,895 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.71 vs. limit=15.0 2023-09-29 09:46:21,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:46:28,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:46:28,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:46:30,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:46:30,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:46:32,186 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=325653.3333333333, ans=0.125 2023-09-29 09:46:33,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 09:46:34,968 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:46:35,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 09:46:35,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 09:46:36,710 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:46:36,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:46:38,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:46:41,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 09:46:42,950 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:46:44,480 INFO [train.py:1039] (2/4) Epoch 10, batch 1050, loss[loss=0.2015, simple_loss=0.2647, pruned_loss=0.06912, over 23829.00 frames. ], tot_loss[loss=0.207, simple_loss=0.276, pruned_loss=0.069, over 4703828.62 frames. ], batch size: 212, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:46:44,975 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=325720.0, ans=0.125 2023-09-29 09:46:46,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:46:47,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 09:46:49,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 09:46:51,203 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:46:52,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:46:54,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 09:46:55,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 09:46:59,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:46:59,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:46:59,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 09:47:00,382 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.54 vs. limit=15.0 2023-09-29 09:47:01,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:47:03,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 09:47:03,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:47:04,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 09:47:06,315 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:47:06,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 09:47:06,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 09:47:15,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:47:15,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 09:47:15,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:47:19,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 09:47:20,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 09:47:20,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:47:22,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 09:47:25,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 09:47:26,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:47:31,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 09:47:32,177 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=325920.0, ans=0.025 2023-09-29 09:47:34,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 09:47:34,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:47:35,071 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:47:40,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:47:43,360 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 09:47:44,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 09:47:45,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 09:47:45,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:47:45,722 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.86 vs. limit=15.0 2023-09-29 09:47:46,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 09:47:47,997 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 09:47:51,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:47:52,401 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 2.027e+02 2.343e+02 2.792e+02 3.800e+02, threshold=4.687e+02, percent-clipped=0.0 2023-09-29 09:47:54,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:47:54,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:47:56,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:47:56,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:48:01,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:48:01,180 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 09:48:04,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:48:04,138 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 09:48:04,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 09:48:05,651 INFO [train.py:1039] (2/4) Epoch 10, batch 1100, loss[loss=0.1914, simple_loss=0.2713, pruned_loss=0.05579, over 24481.00 frames. ], tot_loss[loss=0.2054, simple_loss=0.2751, pruned_loss=0.06791, over 4711599.90 frames. ], batch size: 63, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 09:48:05,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:48:11,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:48:14,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:48:17,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:48:18,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 09:48:19,421 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:48:19,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 09:48:20,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:48:22,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 09:48:26,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:48:27,859 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=326120.0, ans=10.0 2023-09-29 09:48:29,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:48:29,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 09:48:30,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 09:48:32,657 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:48:32,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:48:35,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:48:37,517 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=326186.6666666667, ans=0.125 2023-09-29 09:48:38,756 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:48:44,477 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:48:47,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 09:48:47,646 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 09:48:47,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:48:52,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:48:53,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 09:48:53,683 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:48:55,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 09:48:56,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:48:56,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:48:56,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:48:56,777 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:48:58,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 09:49:04,937 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:49:04,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 09:49:05,375 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=326253.3333333333, ans=0.125 2023-09-29 09:49:08,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:49:11,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 09:49:13,422 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 09:49:13,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 09:49:14,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:49:18,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:49:18,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:49:20,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 09:49:21,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:49:21,877 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:49:23,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 09:49:23,526 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:49:23,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 09:49:25,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:49:25,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:49:26,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 09:49:28,301 INFO [train.py:1039] (2/4) Epoch 10, batch 1150, loss[loss=0.1753, simple_loss=0.251, pruned_loss=0.04981, over 24340.00 frames. ], tot_loss[loss=0.206, simple_loss=0.2753, pruned_loss=0.06834, over 4707289.76 frames. ], batch size: 56, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 09:49:31,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:49:34,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:49:36,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:49:36,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:49:38,124 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 09:49:38,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:49:41,204 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.29 vs. limit=15.0 2023-09-29 09:49:41,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 09:49:43,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:49:43,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 09:49:44,052 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.07 vs. limit=22.5 2023-09-29 09:49:49,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 09:49:51,512 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:49:55,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:49:55,404 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=326453.3333333333, ans=0.125 2023-09-29 09:49:56,623 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:49:58,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 09:49:58,090 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 09:49:58,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:50:04,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 09:50:05,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:50:07,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:50:17,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:50:22,536 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=326586.6666666667, ans=0.0 2023-09-29 09:50:23,836 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:50:23,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 09:50:25,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:50:25,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:50:30,527 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 09:50:30,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:50:37,352 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 2.028e+02 2.366e+02 3.044e+02 5.235e+02, threshold=4.733e+02, percent-clipped=2.0 2023-09-29 09:50:39,074 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 09:50:44,352 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:50:45,871 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 09:50:45,923 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 09:50:45,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:50:49,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:50:50,929 INFO [train.py:1039] (2/4) Epoch 10, batch 1200, loss[loss=0.2058, simple_loss=0.2918, pruned_loss=0.05989, over 24649.00 frames. ], tot_loss[loss=0.2077, simple_loss=0.2767, pruned_loss=0.06933, over 4703475.85 frames. ], batch size: 68, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 09:50:54,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:50:54,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 09:50:57,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:50:57,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:50:57,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:51:00,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:51:02,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 09:51:04,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:51:04,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:51:07,444 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 09:51:10,404 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 09:51:15,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:51:17,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:51:20,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:51:20,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:51:20,423 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 09:51:23,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:51:28,917 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=326853.3333333333, ans=0.125 2023-09-29 09:51:31,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 09:51:31,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:51:31,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 09:51:33,061 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:51:34,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 09:51:40,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 09:51:40,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:51:42,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:51:43,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:51:44,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 09:51:45,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:51:45,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:51:46,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:51:47,057 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 09:51:47,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 09:51:49,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:51:49,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 09:51:50,946 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:51:50,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:51:56,065 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 09:51:58,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 09:52:02,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 09:52:05,211 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 09:52:08,140 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:52:11,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:52:13,021 INFO [train.py:1039] (2/4) Epoch 10, batch 1250, loss[loss=0.297, simple_loss=0.3396, pruned_loss=0.1271, over 19735.00 frames. ], tot_loss[loss=0.2095, simple_loss=0.278, pruned_loss=0.07049, over 4702409.11 frames. ], batch size: 389, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 09:52:13,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:52:15,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:52:16,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 09:52:21,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:52:22,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:52:22,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 09:52:26,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:52:26,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 09:52:30,532 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=327120.0, ans=0.0 2023-09-29 09:52:31,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 09:52:33,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:52:33,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 09:52:33,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:52:36,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 09:52:40,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 09:52:40,993 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 09:52:41,010 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:52:42,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:52:43,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:52:47,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:52:49,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 09:52:51,222 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=327186.6666666667, ans=0.0 2023-09-29 09:52:52,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 09:52:54,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 09:52:54,725 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=327186.6666666667, ans=0.2 2023-09-29 09:52:54,805 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=327186.6666666667, ans=0.125 2023-09-29 09:52:57,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:52:57,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 09:52:58,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:52:59,560 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 09:52:59,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:52:59,618 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:53:04,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:53:06,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:53:07,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:53:09,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 09:53:09,186 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 09:53:09,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 09:53:10,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:53:12,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 09:53:12,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:53:16,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 09:53:16,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:53:17,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 09:53:18,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 09:53:19,055 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:53:19,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 09:53:20,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:53:21,756 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 2.035e+02 2.245e+02 2.590e+02 3.760e+02, threshold=4.489e+02, percent-clipped=0.0 2023-09-29 09:53:21,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 09:53:26,463 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:53:26,781 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=327320.0, ans=0.0 2023-09-29 09:53:28,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:53:30,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:53:35,028 INFO [train.py:1039] (2/4) Epoch 10, batch 1300, loss[loss=0.2312, simple_loss=0.303, pruned_loss=0.07966, over 23682.00 frames. ], tot_loss[loss=0.2112, simple_loss=0.2795, pruned_loss=0.07147, over 4701560.87 frames. ], batch size: 85, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 09:53:35,115 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 09:53:35,900 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.52 vs. limit=15.0 2023-09-29 09:53:37,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:53:38,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 09:53:43,490 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:53:45,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 09:53:48,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:53:49,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:53:49,687 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:53:51,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 09:53:55,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:53:56,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 09:53:56,920 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=327453.3333333333, ans=0.125 2023-09-29 09:53:57,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 09:54:01,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 09:54:04,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:54:05,158 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:54:06,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:54:08,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:54:08,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 09:54:10,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 09:54:11,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 09:54:17,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:54:18,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 09:54:19,601 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 09:54:21,050 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 09:54:24,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:54:26,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:54:26,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 09:54:27,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:54:27,542 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 09:54:29,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:54:33,817 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:54:33,822 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:54:36,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 09:54:37,135 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=327586.6666666667, ans=0.125 2023-09-29 09:54:38,551 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 09:54:40,520 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 09:54:47,736 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:54:48,186 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=327653.3333333333, ans=0.0 2023-09-29 09:54:49,480 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 09:54:51,062 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:54:55,805 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=327720.0, ans=0.1 2023-09-29 09:54:56,845 INFO [train.py:1039] (2/4) Epoch 10, batch 1350, loss[loss=0.2035, simple_loss=0.255, pruned_loss=0.07601, over 23565.00 frames. ], tot_loss[loss=0.2095, simple_loss=0.2774, pruned_loss=0.07075, over 4695721.09 frames. ], batch size: 256, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 09:54:58,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 09:55:03,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:55:03,938 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=327720.0, ans=0.125 2023-09-29 09:55:05,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:55:07,113 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:55:07,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:55:10,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:55:11,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:55:15,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:55:19,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 09:55:19,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 09:55:20,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:55:22,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 09:55:24,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:55:25,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:55:25,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 09:55:27,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 09:55:27,566 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=327786.6666666667, ans=0.125 2023-09-29 09:55:28,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 09:55:31,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:55:32,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 09:55:43,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:55:48,455 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.78 vs. limit=15.0 2023-09-29 09:55:53,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:55:55,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:55:55,953 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 09:55:58,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:55:59,309 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=327920.0, ans=0.1 2023-09-29 09:56:00,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 09:56:00,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 09:56:00,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:56:03,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:56:06,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 09:56:06,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:56:09,400 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 2.130e+02 2.395e+02 2.953e+02 6.223e+02, threshold=4.790e+02, percent-clipped=2.0 2023-09-29 09:56:12,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 09:56:14,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 09:56:18,170 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=327986.6666666667, ans=0.0 2023-09-29 09:56:20,603 INFO [train.py:1039] (2/4) Epoch 10, batch 1400, loss[loss=0.2413, simple_loss=0.2995, pruned_loss=0.09152, over 23459.00 frames. ], tot_loss[loss=0.2081, simple_loss=0.275, pruned_loss=0.07059, over 4684948.19 frames. ], batch size: 120, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 09:56:20,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 09:56:23,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:56:25,947 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:56:27,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:56:34,102 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 09:56:35,652 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 09:56:45,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:56:47,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:56:49,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:56:49,798 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=328120.0, ans=0.0 2023-09-29 09:56:50,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 09:56:54,609 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.66 vs. limit=15.0 2023-09-29 09:56:55,379 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:56:56,125 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.56 vs. limit=15.0 2023-09-29 09:56:56,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 09:56:59,044 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten.whitening_limit, batch_count=328186.6666666667, ans=15.0 2023-09-29 09:57:07,203 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:57:08,656 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:57:11,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 09:57:11,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:57:12,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:57:14,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:57:14,115 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:57:15,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:57:15,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:57:15,619 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:57:17,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 09:57:17,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:57:21,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:57:25,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 09:57:31,678 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 09:57:33,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 09:57:35,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:57:37,105 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=328320.0, ans=0.125 2023-09-29 09:57:38,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 09:57:38,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:57:41,989 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:57:42,336 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=328386.6666666667, ans=0.1 2023-09-29 09:57:43,362 INFO [train.py:1039] (2/4) Epoch 10, batch 1450, loss[loss=0.2068, simple_loss=0.2699, pruned_loss=0.07182, over 23623.00 frames. ], tot_loss[loss=0.2083, simple_loss=0.2751, pruned_loss=0.07077, over 4679460.63 frames. ], batch size: 149, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 09:57:45,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 09:57:47,266 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:57:47,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:57:47,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 09:57:52,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:57:54,112 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 09:57:54,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:57:55,533 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 09:57:56,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 09:57:57,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 09:57:58,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:58:00,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:58:00,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 09:58:01,740 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:58:01,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 09:58:01,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 09:58:01,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:58:03,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:58:03,650 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=328453.3333333333, ans=0.125 2023-09-29 09:58:06,319 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:58:10,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:58:11,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:58:11,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:58:13,852 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.00 vs. limit=15.0 2023-09-29 09:58:14,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:58:14,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:58:19,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:58:19,203 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:58:19,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:58:19,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:58:22,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 09:58:24,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:58:29,233 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 09:58:30,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:58:32,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 09:58:33,987 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:58:34,170 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=328586.6666666667, ans=0.0 2023-09-29 09:58:35,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 09:58:41,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:58:41,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 09:58:43,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 09:58:45,144 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:58:49,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:58:49,872 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:58:53,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 09:58:54,753 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.639e+02 1.987e+02 2.423e+02 2.995e+02 4.591e+02, threshold=4.846e+02, percent-clipped=0.0 2023-09-29 09:58:54,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 09:58:54,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 09:58:57,061 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:58:57,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 09:59:06,773 INFO [train.py:1039] (2/4) Epoch 10, batch 1500, loss[loss=0.2357, simple_loss=0.3127, pruned_loss=0.07939, over 24396.00 frames. ], tot_loss[loss=0.2093, simple_loss=0.2764, pruned_loss=0.07109, over 4682594.15 frames. ], batch size: 69, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 09:59:10,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 09:59:11,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:59:11,405 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 09:59:11,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:59:12,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:59:13,156 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=328720.0, ans=0.125 2023-09-29 09:59:14,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:59:14,591 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 09:59:16,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 09:59:16,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 09:59:16,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:59:17,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:59:19,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:59:21,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:59:26,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:59:26,531 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 09:59:27,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 09:59:29,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:59:29,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:59:31,842 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=328786.6666666667, ans=0.125 2023-09-29 09:59:33,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 09:59:37,666 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.86 vs. limit=22.5 2023-09-29 09:59:39,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 09:59:39,842 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:59:41,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 09:59:42,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 09:59:44,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:59:47,367 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:59:47,391 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:59:48,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 09:59:50,310 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:59:50,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:59:50,437 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 09:59:50,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:59:54,958 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=11.74 vs. limit=15.0 2023-09-29 09:59:57,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:59:57,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 09:59:59,169 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=328920.0, ans=0.125 2023-09-29 10:00:03,950 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 10:00:05,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 10:00:12,143 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 10:00:12,412 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=328986.6666666667, ans=0.1 2023-09-29 10:00:14,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:00:14,068 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 10:00:15,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:00:17,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:00:17,252 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 10:00:18,848 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 10:00:21,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 10:00:22,821 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.09 vs. limit=15.0 2023-09-29 10:00:23,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:00:26,684 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=329053.3333333333, ans=0.125 2023-09-29 10:00:27,776 INFO [train.py:1039] (2/4) Epoch 10, batch 1550, loss[loss=0.2076, simple_loss=0.2885, pruned_loss=0.06333, over 24572.00 frames. ], tot_loss[loss=0.2088, simple_loss=0.2767, pruned_loss=0.07044, over 4699435.27 frames. ], batch size: 71, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 10:00:27,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:00:27,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:00:27,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:00:28,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:00:28,297 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=329053.3333333333, ans=0.125 2023-09-29 10:00:29,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 10:00:31,871 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 10:00:31,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 10:00:31,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:00:32,083 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 10:00:33,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 10:00:35,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:00:35,549 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=329053.3333333333, ans=0.07 2023-09-29 10:00:36,759 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:00:36,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:00:36,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:00:38,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:00:40,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:00:44,174 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 10:00:44,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:00:44,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:00:44,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 10:00:47,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:00:47,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 10:00:49,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:00:49,531 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 10:00:51,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 10:00:51,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 10:00:51,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:00:52,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:00:54,548 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=329120.0, ans=0.0 2023-09-29 10:00:55,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:00:56,654 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.27 vs. limit=15.0 2023-09-29 10:00:58,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 10:00:58,734 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 10:01:05,547 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=329186.6666666667, ans=0.0 2023-09-29 10:01:08,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:01:12,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:01:13,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 10:01:13,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:01:13,334 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=329186.6666666667, ans=0.1 2023-09-29 10:01:15,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 10:01:20,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 10:01:22,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:01:25,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:01:28,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:01:28,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:01:28,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 10:01:29,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:01:31,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:01:32,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:01:34,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 10:01:34,330 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 10:01:37,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:01:39,017 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 1.945e+02 2.181e+02 2.494e+02 3.678e+02, threshold=4.362e+02, percent-clipped=0.0 2023-09-29 10:01:44,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 10:01:49,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:01:50,488 INFO [train.py:1039] (2/4) Epoch 10, batch 1600, loss[loss=0.2913, simple_loss=0.3264, pruned_loss=0.1281, over 19120.00 frames. ], tot_loss[loss=0.2101, simple_loss=0.2777, pruned_loss=0.07123, over 4698686.27 frames. ], batch size: 388, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 10:01:50,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:01:52,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 10:01:52,424 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=329386.6666666667, ans=0.1 2023-09-29 10:01:54,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:01:56,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:01:56,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:01:56,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:01:57,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:02:01,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:02:02,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 10:02:03,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 10:02:06,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 10:02:09,135 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:02:09,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 10:02:10,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:02:12,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:02:15,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=329453.3333333333, ans=0.0 2023-09-29 10:02:19,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:02:22,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 10:02:25,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:02:25,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 10:02:25,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:02:26,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 10:02:27,202 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=329520.0, ans=0.125 2023-09-29 10:02:32,941 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=329520.0, ans=0.125 2023-09-29 10:02:34,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 10:02:42,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:02:42,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 10:02:42,735 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=329586.6666666667, ans=0.1 2023-09-29 10:02:43,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:02:44,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:02:44,020 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:02:45,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 10:02:50,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 10:02:53,730 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:02:53,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:02:53,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:02:55,338 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:02:57,029 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:02:57,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:02:58,717 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:03:04,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:03:05,718 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.62 vs. limit=15.0 2023-09-29 10:03:06,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:03:07,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 10:03:07,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:03:09,298 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 10:03:13,619 INFO [train.py:1039] (2/4) Epoch 10, batch 1650, loss[loss=0.2186, simple_loss=0.2709, pruned_loss=0.08321, over 23653.00 frames. ], tot_loss[loss=0.2098, simple_loss=0.2776, pruned_loss=0.071, over 4700179.38 frames. ], batch size: 232, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 10:03:13,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:03:15,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:03:15,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:03:15,603 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 10:03:15,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 10:03:15,780 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=329720.0, ans=0.2 2023-09-29 10:03:15,787 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=329720.0, ans=0.0 2023-09-29 10:03:17,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 10:03:17,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 10:03:17,645 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=329720.0, ans=0.125 2023-09-29 10:03:20,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:03:21,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:03:21,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:03:21,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 10:03:24,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:03:25,717 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 10:03:30,062 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:03:30,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:03:30,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:03:30,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 10:03:31,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 10:03:31,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 10:03:32,174 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.42 vs. limit=22.5 2023-09-29 10:03:40,434 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 10:03:43,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 10:03:45,276 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=329853.3333333333, ans=0.0 2023-09-29 10:03:49,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 10:03:51,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:03:54,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 10:03:55,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:04:00,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:04:00,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:04:01,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:04:02,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:04:03,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:04:04,695 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.77 vs. limit=10.0 2023-09-29 10:04:06,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:04:08,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:04:08,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:04:09,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:04:11,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:04:11,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:04:16,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:04:17,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 10:04:20,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:04:20,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 10:04:20,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 10:04:21,033 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 10:04:22,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:04:22,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:04:22,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:04:23,879 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.669e+02 2.033e+02 2.438e+02 2.787e+02 4.126e+02, threshold=4.877e+02, percent-clipped=0.0 2023-09-29 10:04:24,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:04:24,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 10:04:28,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:04:30,005 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:04:30,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:04:33,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 10:04:35,196 INFO [train.py:1039] (2/4) Epoch 10, batch 1700, loss[loss=0.2236, simple_loss=0.2633, pruned_loss=0.09194, over 19402.00 frames. ], tot_loss[loss=0.2085, simple_loss=0.2763, pruned_loss=0.07038, over 4698724.02 frames. ], batch size: 388, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 10:04:37,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:04:37,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:04:37,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 10:04:38,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:04:38,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:04:38,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:04:41,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:04:41,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:04:43,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 10:04:46,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 10:04:49,203 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=330053.3333333333, ans=0.0 2023-09-29 10:04:52,066 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=330120.0, ans=0.125 2023-09-29 10:04:55,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:04:56,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:05:01,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 10:05:01,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:05:03,649 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:05:03,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:05:08,100 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 10:05:09,685 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:05:09,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:05:11,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 10:05:12,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 10:05:16,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 10:05:16,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 10:05:18,445 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:05:19,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 10:05:22,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:05:31,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:05:32,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:05:33,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:05:35,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 10:05:35,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 10:05:35,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:05:37,662 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:05:37,663 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 10:05:37,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:05:37,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:05:39,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:05:39,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:05:41,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:05:41,531 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:05:41,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:05:43,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:05:44,468 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:05:48,040 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:05:50,205 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 10:05:51,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:05:52,110 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=330320.0, ans=0.125 2023-09-29 10:05:53,269 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:05:55,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 10:05:55,697 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=330320.0, ans=0.125 2023-09-29 10:05:58,420 INFO [train.py:1039] (2/4) Epoch 10, batch 1750, loss[loss=0.2218, simple_loss=0.3067, pruned_loss=0.0685, over 24563.00 frames. ], tot_loss[loss=0.2074, simple_loss=0.2755, pruned_loss=0.06958, over 4710205.94 frames. ], batch size: 71, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 10:06:01,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:06:04,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:06:04,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 10:06:06,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 10:06:06,169 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:06:09,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:06:09,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:06:14,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 10:06:16,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:06:17,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 10:06:17,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:06:19,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:06:23,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 10:06:26,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 10:06:28,200 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:06:28,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 10:06:30,906 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=330520.0, ans=0.125 2023-09-29 10:06:35,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 10:06:38,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:06:38,118 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:06:41,295 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:06:41,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:06:44,855 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:06:47,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:06:49,441 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:06:50,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:06:53,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 10:06:54,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:06:57,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 10:06:59,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:07:00,438 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=330586.6666666667, ans=0.125 2023-09-29 10:07:01,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:07:01,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:07:05,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 10:07:05,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 10:07:07,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:07:10,059 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.729e+02 2.086e+02 2.387e+02 2.992e+02 5.082e+02, threshold=4.774e+02, percent-clipped=1.0 2023-09-29 10:07:10,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:07:13,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:07:16,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:07:17,930 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:07:20,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 10:07:20,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:07:21,612 INFO [train.py:1039] (2/4) Epoch 10, batch 1800, loss[loss=0.2251, simple_loss=0.2809, pruned_loss=0.08464, over 22769.00 frames. ], tot_loss[loss=0.2068, simple_loss=0.2751, pruned_loss=0.06931, over 4697222.18 frames. ], batch size: 322, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 10:07:21,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 10:07:21,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:07:21,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:07:21,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:07:21,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 10:07:24,856 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:07:26,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:07:28,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 10:07:31,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:07:34,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=330720.0, ans=0.2 2023-09-29 10:07:35,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 10:07:36,755 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:07:40,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:07:42,022 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=330786.6666666667, ans=0.1 2023-09-29 10:07:43,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:07:43,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:07:44,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:07:48,004 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:07:48,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 10:07:49,548 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:07:52,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:07:56,583 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 10:07:59,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 10:08:00,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 10:08:01,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:08:01,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:08:01,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:08:02,696 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:08:10,143 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 10:08:11,576 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 10:08:13,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:08:14,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 10:08:16,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 10:08:16,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:08:18,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:08:19,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 10:08:23,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 10:08:31,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:08:31,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 10:08:32,587 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:08:32,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:08:32,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:08:33,209 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.93 vs. limit=15.0 2023-09-29 10:08:34,163 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 10:08:37,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:08:37,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:08:40,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 10:08:40,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:08:42,878 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:08:42,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 10:08:42,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:08:44,179 INFO [train.py:1039] (2/4) Epoch 10, batch 1850, loss[loss=0.1975, simple_loss=0.2756, pruned_loss=0.05969, over 23845.00 frames. ], tot_loss[loss=0.2068, simple_loss=0.2754, pruned_loss=0.06909, over 4702776.80 frames. ], batch size: 86, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 10:08:44,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:08:45,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 10:08:47,435 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:08:48,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:08:52,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:08:52,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:08:52,635 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=331053.3333333333, ans=0.1 2023-09-29 10:08:58,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:09:00,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 10:09:05,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 10:09:08,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 10:09:08,587 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=331120.0, ans=0.0 2023-09-29 10:09:11,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:09:11,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 10:09:11,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 10:09:21,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:09:22,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 10:09:26,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:09:26,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:09:31,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 10:09:31,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:09:31,225 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 10:09:34,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:09:34,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:09:37,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:09:42,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:09:43,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:09:43,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 10:09:43,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:09:45,515 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:09:47,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:09:50,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 10:09:52,913 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:09:55,855 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.969e+02 2.199e+02 2.550e+02 3.875e+02, threshold=4.397e+02, percent-clipped=0.0 2023-09-29 10:09:57,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 10:09:59,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:09:59,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 10:09:59,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 10:10:01,250 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 10:10:02,705 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 10:10:04,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 10:10:04,329 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:10:04,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:10:05,653 INFO [train.py:1039] (2/4) Epoch 10, batch 1900, loss[loss=0.2013, simple_loss=0.2778, pruned_loss=0.06235, over 24478.00 frames. ], tot_loss[loss=0.2067, simple_loss=0.2756, pruned_loss=0.06892, over 4698416.20 frames. ], batch size: 63, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 10:10:05,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:10:05,836 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 10:10:05,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:10:07,343 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:10:07,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 10:10:07,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 10:10:09,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:10:10,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 10:10:12,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:10:12,138 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 10:10:12,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:10:12,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:10:17,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:10:17,638 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=331386.6666666667, ans=0.125 2023-09-29 10:10:21,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:10:21,818 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 10:10:23,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 10:10:23,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:10:24,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:10:24,885 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 10:10:24,955 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 10:10:28,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 10:10:32,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:10:35,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 10:10:35,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 10:10:37,278 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=331520.0, ans=0.125 2023-09-29 10:10:38,638 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=331520.0, ans=0.125 2023-09-29 10:10:46,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 10:10:49,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 10:10:49,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:10:51,248 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 10:10:51,256 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 10:10:52,723 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 10:10:52,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 10:10:52,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:10:57,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 10:11:00,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:11:04,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:11:04,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 10:11:06,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:11:10,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 10:11:11,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 10:11:17,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 10:11:17,255 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:11:17,277 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:11:17,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:11:17,630 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=331653.3333333333, ans=0.1 2023-09-29 10:11:19,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 10:11:19,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 10:11:21,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:11:22,872 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:11:22,874 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:11:27,163 INFO [train.py:1039] (2/4) Epoch 10, batch 1950, loss[loss=0.2074, simple_loss=0.2908, pruned_loss=0.06198, over 24595.00 frames. ], tot_loss[loss=0.208, simple_loss=0.2766, pruned_loss=0.06969, over 4694724.87 frames. ], batch size: 71, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 10:11:27,255 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:11:27,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:11:27,332 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 10:11:28,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:11:31,882 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:11:35,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:11:36,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:11:36,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 10:11:39,437 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=331720.0, ans=0.125 2023-09-29 10:11:40,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 10:11:41,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 10:11:41,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:11:43,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:11:43,864 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=331786.6666666667, ans=0.0 2023-09-29 10:11:45,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:11:46,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:11:46,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:11:49,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:11:53,306 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:11:53,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 10:11:53,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:11:53,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:11:58,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:12:01,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:12:01,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:12:01,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 10:12:01,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 10:12:01,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 10:12:01,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:12:02,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:12:05,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:12:06,159 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=331853.3333333333, ans=0.125 2023-09-29 10:12:09,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:12:14,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 10:12:18,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:12:18,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:12:18,421 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 10:12:18,714 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=331920.0, ans=0.0 2023-09-29 10:12:19,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:12:24,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:12:24,779 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=331920.0, ans=0.1 2023-09-29 10:12:25,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:12:27,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 10:12:34,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:12:35,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:12:37,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:12:38,993 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.745e+02 2.098e+02 2.334e+02 2.724e+02 3.808e+02, threshold=4.669e+02, percent-clipped=0.0 2023-09-29 10:12:39,817 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.12 vs. limit=22.5 2023-09-29 10:12:40,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:12:42,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:12:44,271 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:12:45,700 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 10:12:45,708 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 10:12:45,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:12:47,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 10:12:48,766 INFO [train.py:1039] (2/4) Epoch 10, batch 2000, loss[loss=0.2075, simple_loss=0.2897, pruned_loss=0.06272, over 24444.00 frames. ], tot_loss[loss=0.2074, simple_loss=0.2765, pruned_loss=0.06916, over 4717847.39 frames. ], batch size: 69, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 10:12:48,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:12:53,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 10:12:55,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:12:55,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:12:58,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:13:00,040 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:13:03,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 10:13:03,796 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.min_positive, batch_count=332120.0, ans=0.025 2023-09-29 10:13:05,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:13:06,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:13:07,125 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=332120.0, ans=0.0 2023-09-29 10:13:08,423 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 10:13:10,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 10:13:10,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:13:13,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:13:14,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 10:13:14,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:13:16,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:13:16,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:13:18,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 10:13:18,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 10:13:18,787 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=332120.0, ans=0.0 2023-09-29 10:13:20,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 10:13:21,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:13:25,261 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=332186.6666666667, ans=0.125 2023-09-29 10:13:26,383 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:13:27,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 10:13:27,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:13:27,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:13:28,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:13:29,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 10:13:32,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 10:13:32,814 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=332186.6666666667, ans=0.2 2023-09-29 10:13:34,575 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:13:34,588 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:13:39,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:13:39,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:13:39,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 10:13:40,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:13:42,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:13:42,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:13:44,070 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 10:13:44,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:13:45,603 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:13:48,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:13:50,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 10:13:54,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 10:13:55,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:13:58,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:13:58,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:14:04,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:14:07,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:14:07,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:14:09,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 10:14:09,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:14:10,521 INFO [train.py:1039] (2/4) Epoch 10, batch 2050, loss[loss=0.2156, simple_loss=0.2688, pruned_loss=0.08116, over 23822.00 frames. ], tot_loss[loss=0.2068, simple_loss=0.2755, pruned_loss=0.06905, over 4721323.81 frames. ], batch size: 212, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:14:12,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:14:12,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:14:14,228 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 10:14:15,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:14:15,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:14:21,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:14:24,897 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:14:24,996 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:14:25,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:14:28,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 10:14:28,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:14:31,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:14:31,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 10:14:40,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 10:14:40,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:14:42,559 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 10:14:43,397 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=11.39 vs. limit=15.0 2023-09-29 10:14:45,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:14:47,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 10:14:47,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 10:14:50,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:14:51,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:14:53,424 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 10:14:53,497 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:14:55,633 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:14:57,162 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:14:57,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:15:00,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:15:02,569 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 10:15:04,813 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:15:06,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:15:10,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 10:15:16,239 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:15:16,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 10:15:18,094 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=332653.3333333333, ans=0.0 2023-09-29 10:15:22,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:15:22,576 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:15:24,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:15:26,038 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 1.999e+02 2.333e+02 2.741e+02 4.462e+02, threshold=4.667e+02, percent-clipped=0.0 2023-09-29 10:15:27,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 10:15:30,833 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 10:15:30,834 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:15:30,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:15:32,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 10:15:34,224 INFO [train.py:1039] (2/4) Epoch 10, batch 2100, loss[loss=0.2239, simple_loss=0.2993, pruned_loss=0.07421, over 24408.00 frames. ], tot_loss[loss=0.2056, simple_loss=0.2736, pruned_loss=0.06873, over 4712680.65 frames. ], batch size: 69, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:15:34,339 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:15:34,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 10:15:34,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 10:15:37,940 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 10:15:39,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:15:41,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:15:41,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:15:42,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:15:42,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 10:15:44,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:15:45,693 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 10:15:45,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 10:15:49,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:15:49,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:15:49,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 10:15:49,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 10:15:55,211 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 10:15:55,212 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 10:15:58,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:15:59,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:16:03,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:16:03,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 10:16:05,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:16:05,139 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 10:16:06,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 10:16:06,881 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:16:08,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 10:16:09,039 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 10:16:09,124 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 10:16:09,450 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=332853.3333333333, ans=0.125 2023-09-29 10:16:12,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 10:16:14,269 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:16:17,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 10:16:20,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 10:16:20,538 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 10:16:22,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:16:24,032 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:16:24,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 10:16:24,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:16:24,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:16:24,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:16:24,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 10:16:27,222 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 10:16:28,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 10:16:33,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:16:34,868 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:16:36,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 10:16:38,173 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=332986.6666666667, ans=0.125 2023-09-29 10:16:41,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:16:45,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:16:45,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:16:45,528 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:16:45,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 10:16:46,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 10:16:47,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:16:47,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 10:16:48,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:16:48,622 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:16:50,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 10:16:51,735 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 10:16:51,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:16:54,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:16:54,813 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:16:54,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:16:56,149 INFO [train.py:1039] (2/4) Epoch 10, batch 2150, loss[loss=0.2276, simple_loss=0.2822, pruned_loss=0.0865, over 22723.00 frames. ], tot_loss[loss=0.2051, simple_loss=0.2733, pruned_loss=0.06846, over 4701492.52 frames. ], batch size: 322, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:16:56,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:16:56,483 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=333053.3333333333, ans=0.2 2023-09-29 10:16:56,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=333053.3333333333, ans=0.125 2023-09-29 10:17:03,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 10:17:04,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:17:06,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:17:07,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:17:07,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:17:07,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:17:09,483 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:17:09,951 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.93 vs. limit=15.0 2023-09-29 10:17:11,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:17:11,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:17:14,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:17:14,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 10:17:15,659 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=333120.0, ans=0.04949747468305833 2023-09-29 10:17:16,176 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.17 vs. limit=15.0 2023-09-29 10:17:21,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:17:21,477 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=333120.0, ans=0.125 2023-09-29 10:17:22,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:17:24,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:17:24,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:17:24,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:17:25,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 10:17:25,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:17:25,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:17:26,271 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.86 vs. limit=15.0 2023-09-29 10:17:27,291 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:17:28,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 10:17:30,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 10:17:30,463 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=333186.6666666667, ans=0.1 2023-09-29 10:17:30,606 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=333186.6666666667, ans=0.0 2023-09-29 10:17:31,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:17:32,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:17:34,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 10:17:35,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:17:37,252 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:17:38,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:17:40,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:17:40,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 10:17:40,377 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 10:17:43,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:17:44,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:17:47,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:17:49,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 10:17:50,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:17:50,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:17:50,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 10:17:52,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 10:17:53,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:17:53,802 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 10:17:53,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:17:53,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:17:54,166 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=333253.3333333333, ans=0.2 2023-09-29 10:17:55,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 10:17:55,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:17:55,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 10:17:56,947 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 10:17:56,948 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 10:17:57,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 10:17:58,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:17:59,925 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:17:59,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:18:00,253 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=333320.0, ans=0.0 2023-09-29 10:18:01,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:18:04,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 10:18:04,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:18:04,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:18:10,123 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.515e+02 1.840e+02 1.996e+02 2.223e+02 3.215e+02, threshold=3.992e+02, percent-clipped=0.0 2023-09-29 10:18:13,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:18:13,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 10:18:18,088 INFO [train.py:1039] (2/4) Epoch 10, batch 2200, loss[loss=0.2095, simple_loss=0.2791, pruned_loss=0.06995, over 23781.00 frames. ], tot_loss[loss=0.2055, simple_loss=0.274, pruned_loss=0.06851, over 4713168.63 frames. ], batch size: 135, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:18:18,202 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:18:24,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:18:25,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:18:25,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:18:27,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 10:18:30,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:18:31,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:18:31,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 10:18:37,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 10:18:40,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 10:18:45,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 10:18:48,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:18:49,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:18:50,500 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:18:52,253 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:18:52,298 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 10:18:57,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 10:18:59,572 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:18:59,677 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 10:19:04,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:19:05,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:19:07,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:19:07,487 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=333586.6666666667, ans=0.0 2023-09-29 10:19:08,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:19:11,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 10:19:11,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:19:13,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 10:19:14,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:19:14,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 10:19:16,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:19:17,294 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 10:19:18,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:19:20,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:19:20,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:19:20,114 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:19:21,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 10:19:21,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:19:22,096 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=333653.3333333333, ans=0.125 2023-09-29 10:19:23,360 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 10:19:26,509 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 10:19:26,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:19:30,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 10:19:30,242 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 10:19:34,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:19:34,534 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 10:19:34,924 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=333653.3333333333, ans=0.125 2023-09-29 10:19:36,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 10:19:36,094 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 10:19:37,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:19:39,080 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 10:19:40,601 INFO [train.py:1039] (2/4) Epoch 10, batch 2250, loss[loss=0.184, simple_loss=0.2612, pruned_loss=0.05335, over 24573.00 frames. ], tot_loss[loss=0.2068, simple_loss=0.2751, pruned_loss=0.06929, over 4712564.06 frames. ], batch size: 60, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:19:40,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:19:42,261 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 10:19:43,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:19:45,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:19:52,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:19:53,945 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:19:57,200 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=333786.6666666667, ans=0.0 2023-09-29 10:19:58,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:19:58,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:19:59,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:20:01,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 10:20:01,561 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:20:02,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:20:06,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 10:20:07,122 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:20:07,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:20:09,251 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:20:12,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:20:14,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 10:20:14,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 10:20:14,426 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=333853.3333333333, ans=0.125 2023-09-29 10:20:17,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 10:20:18,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:20:20,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:20:25,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:20:26,342 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.99 vs. limit=15.0 2023-09-29 10:20:27,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:20:28,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:20:28,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:20:31,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:20:31,992 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=333920.0, ans=0.125 2023-09-29 10:20:34,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:20:38,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:20:39,528 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=7.74 vs. limit=15.0 2023-09-29 10:20:40,368 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 10:20:40,567 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=333920.0, ans=0.0 2023-09-29 10:20:45,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 10:20:45,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 10:20:46,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:20:51,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 10:20:51,465 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=333986.6666666667, ans=0.2 2023-09-29 10:20:51,537 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=333986.6666666667, ans=0.07 2023-09-29 10:20:54,189 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.640e+02 2.009e+02 2.238e+02 2.511e+02 3.719e+02, threshold=4.476e+02, percent-clipped=0.0 2023-09-29 10:20:54,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 10:20:54,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 10:20:54,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:20:55,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:20:59,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 10:21:02,411 INFO [train.py:1039] (2/4) Epoch 10, batch 2300, loss[loss=0.2126, simple_loss=0.2727, pruned_loss=0.07631, over 23711.00 frames. ], tot_loss[loss=0.2076, simple_loss=0.2758, pruned_loss=0.06971, over 4725867.26 frames. ], batch size: 232, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:21:02,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:21:02,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:21:07,517 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=334053.3333333333, ans=0.5 2023-09-29 10:21:09,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:21:09,544 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:21:09,795 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=334053.3333333333, ans=0.0 2023-09-29 10:21:11,402 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys.whitening_limit, batch_count=334053.3333333333, ans=6.0 2023-09-29 10:21:12,962 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 10:21:16,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:21:22,594 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.46 vs. limit=6.0 2023-09-29 10:21:25,443 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:21:25,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 10:21:25,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:21:26,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:21:26,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 10:21:28,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:21:31,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:21:31,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:21:34,761 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:21:36,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:21:39,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:21:46,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 10:21:46,422 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:21:50,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:21:53,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:21:54,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:21:56,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 10:21:56,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:21:56,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 10:21:57,261 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.64 vs. limit=15.0 2023-09-29 10:22:00,039 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 10:22:00,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:22:01,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:22:02,799 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:22:02,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:22:04,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 10:22:04,368 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 10:22:05,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 10:22:05,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:22:05,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:22:05,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 10:22:12,140 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:22:15,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:22:15,968 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 10:22:21,516 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=334320.0, ans=0.125 2023-09-29 10:22:22,674 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:22:22,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:22:22,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 10:22:24,211 INFO [train.py:1039] (2/4) Epoch 10, batch 2350, loss[loss=0.2286, simple_loss=0.3034, pruned_loss=0.07691, over 24087.00 frames. ], tot_loss[loss=0.2078, simple_loss=0.2763, pruned_loss=0.0696, over 4728919.15 frames. ], batch size: 80, lr: 1.04e-02, grad_scale: 8.0 2023-09-29 10:22:24,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 10:22:24,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:22:24,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 10:22:25,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 10:22:26,854 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.76 vs. limit=12.0 2023-09-29 10:22:32,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:22:32,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 10:22:32,923 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.44 vs. limit=15.0 2023-09-29 10:22:37,583 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=334386.6666666667, ans=0.07 2023-09-29 10:22:38,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 10:22:41,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:22:46,926 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:22:46,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:22:46,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:22:47,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:22:48,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 10:22:52,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:22:56,060 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=334520.0, ans=0.0 2023-09-29 10:22:57,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 10:23:00,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:23:01,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:23:01,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:23:04,811 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:23:06,317 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 10:23:06,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:23:08,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:23:08,081 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:23:08,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:23:13,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:23:15,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 10:23:15,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:23:17,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:23:17,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:23:20,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 10:23:22,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:23:24,775 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=334586.6666666667, ans=0.0 2023-09-29 10:23:25,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 10:23:25,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:23:30,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 10:23:35,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 10:23:36,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:23:36,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 10:23:36,774 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 10:23:36,802 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 10:23:37,071 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=334653.3333333333, ans=0.125 2023-09-29 10:23:38,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 10:23:39,746 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.645e+02 2.129e+02 2.355e+02 2.700e+02 4.237e+02, threshold=4.711e+02, percent-clipped=0.0 2023-09-29 10:23:41,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:23:41,916 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=334653.3333333333, ans=0.125 2023-09-29 10:23:44,707 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:23:46,757 INFO [train.py:1039] (2/4) Epoch 10, batch 2400, loss[loss=0.201, simple_loss=0.2604, pruned_loss=0.07077, over 23708.00 frames. ], tot_loss[loss=0.2079, simple_loss=0.2763, pruned_loss=0.06968, over 4730483.12 frames. ], batch size: 149, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:23:49,007 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:23:49,480 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=334720.0, ans=0.1 2023-09-29 10:23:50,757 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:23:50,837 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 10:23:52,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 10:23:59,569 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 10:23:59,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:24:01,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 10:24:02,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:24:04,030 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:24:04,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 10:24:04,558 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=334786.6666666667, ans=0.07 2023-09-29 10:24:07,535 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=334786.6666666667, ans=0.0 2023-09-29 10:24:07,552 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=334786.6666666667, ans=0.0 2023-09-29 10:24:08,922 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:24:11,848 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 10:24:17,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 10:24:20,387 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=334853.3333333333, ans=0.0 2023-09-29 10:24:23,645 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 10:24:25,965 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=22.00 vs. limit=22.5 2023-09-29 10:24:26,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:24:28,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:24:33,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:24:33,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 10:24:33,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:24:35,749 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=334920.0, ans=0.1 2023-09-29 10:24:38,884 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=334920.0, ans=0.125 2023-09-29 10:24:40,452 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=334920.0, ans=0.125 2023-09-29 10:24:41,630 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:24:43,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:24:46,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:24:47,765 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:24:47,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 10:24:47,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:24:49,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:24:49,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:24:49,256 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 10:24:52,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:24:52,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 10:24:54,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 10:24:56,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 10:24:57,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:24:57,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:24:57,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 10:24:57,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 10:24:57,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 10:24:57,891 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 10:25:00,610 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 10:25:00,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:25:04,912 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:25:04,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:25:06,428 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 10:25:06,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:25:07,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 10:25:09,563 INFO [train.py:1039] (2/4) Epoch 10, batch 2450, loss[loss=0.2057, simple_loss=0.2919, pruned_loss=0.05973, over 24537.00 frames. ], tot_loss[loss=0.2065, simple_loss=0.2752, pruned_loss=0.06891, over 4731418.60 frames. ], batch size: 71, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:25:11,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:25:11,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:25:15,699 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:25:15,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:25:17,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 10:25:23,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:25:23,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:25:27,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 10:25:27,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:25:27,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:25:27,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 10:25:27,518 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=335120.0, ans=0.1 2023-09-29 10:25:27,566 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=335120.0, ans=0.0 2023-09-29 10:25:32,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:25:33,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 10:25:34,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:25:39,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 10:25:39,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:25:39,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:25:39,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:25:42,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 10:25:43,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:25:44,293 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=335186.6666666667, ans=0.1 2023-09-29 10:25:51,069 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.66 vs. limit=10.0 2023-09-29 10:25:51,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:25:53,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:25:53,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:25:53,263 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:25:53,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:25:54,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:25:56,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 10:26:01,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:26:01,264 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:26:05,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:26:05,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:26:10,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:26:10,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 10:26:11,129 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:26:12,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:26:12,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 10:26:12,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:26:15,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:26:20,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:26:21,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:26:21,850 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:26:24,085 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.27 vs. limit=15.0 2023-09-29 10:26:24,596 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 2.022e+02 2.367e+02 2.913e+02 5.353e+02, threshold=4.733e+02, percent-clipped=2.0 2023-09-29 10:26:26,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 10:26:26,637 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=335320.0, ans=0.125 2023-09-29 10:26:27,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:26:28,083 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=335320.0, ans=0.0 2023-09-29 10:26:30,831 INFO [train.py:1039] (2/4) Epoch 10, batch 2500, loss[loss=0.1867, simple_loss=0.2697, pruned_loss=0.05189, over 24646.00 frames. ], tot_loss[loss=0.2051, simple_loss=0.2735, pruned_loss=0.06833, over 4714220.30 frames. ], batch size: 68, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:26:33,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:26:43,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 10:26:43,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:26:44,502 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.68 vs. limit=15.0 2023-09-29 10:26:45,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:26:45,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 10:26:52,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 10:26:53,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:26:54,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 10:26:54,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 10:26:54,588 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 10:26:56,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:26:57,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:26:59,005 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 10:26:59,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:26:59,144 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 10:27:00,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:27:02,427 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=335520.0, ans=0.0 2023-09-29 10:27:06,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:27:07,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:27:11,515 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 10:27:11,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 10:27:13,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:27:15,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:27:19,468 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:27:22,684 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=335586.6666666667, ans=0.125 2023-09-29 10:27:25,540 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:27:29,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:27:33,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 10:27:34,066 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=335586.6666666667, ans=0.125 2023-09-29 10:27:36,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 10:27:38,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:27:38,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 10:27:41,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:27:41,283 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 10:27:41,453 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 10:27:41,453 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 10:27:42,823 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 10:27:44,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:27:46,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 10:27:46,234 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 10:27:48,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:27:48,557 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 10:27:52,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 10:27:53,796 INFO [train.py:1039] (2/4) Epoch 10, batch 2550, loss[loss=0.2123, simple_loss=0.2809, pruned_loss=0.07189, over 24340.00 frames. ], tot_loss[loss=0.2057, simple_loss=0.2743, pruned_loss=0.06857, over 4697382.48 frames. ], batch size: 61, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:27:56,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:27:57,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:27:58,967 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:28:00,537 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:28:00,820 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=335720.0, ans=0.125 2023-09-29 10:28:02,060 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 10:28:03,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 10:28:06,609 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 10:28:08,134 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:28:09,697 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:28:12,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:28:12,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 10:28:12,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 10:28:14,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:28:14,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:28:15,222 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.53 vs. limit=22.5 2023-09-29 10:28:16,023 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:28:16,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 10:28:17,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 10:28:17,386 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:28:17,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 10:28:33,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:28:38,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:28:38,716 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:28:40,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:28:40,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 10:28:46,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:28:48,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 10:28:48,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 10:28:48,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:28:49,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 10:28:49,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 10:28:53,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:28:53,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:28:58,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:28:58,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 10:28:58,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:29:00,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:29:00,409 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 10:29:01,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 10:29:03,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:29:09,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:29:10,862 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.915e+02 2.103e+02 2.425e+02 3.393e+02, threshold=4.205e+02, percent-clipped=0.0 2023-09-29 10:29:11,139 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:29:14,174 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 10:29:17,117 INFO [train.py:1039] (2/4) Epoch 10, batch 2600, loss[loss=0.1935, simple_loss=0.2703, pruned_loss=0.05838, over 24481.00 frames. ], tot_loss[loss=0.2073, simple_loss=0.2759, pruned_loss=0.06928, over 4704049.11 frames. ], batch size: 63, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:29:17,198 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 10:29:17,238 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:29:17,287 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 10:29:18,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 10:29:18,865 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 10:29:20,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:29:21,892 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 10:29:23,949 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 10:29:25,447 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 10:29:26,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:29:28,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 10:29:30,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 10:29:32,183 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 10:29:32,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 10:29:35,413 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 10:29:35,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 10:29:42,493 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 10:29:43,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:29:43,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:29:43,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:29:43,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 10:29:45,484 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=336120.0, ans=0.2 2023-09-29 10:29:46,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:29:50,344 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=336186.6666666667, ans=0.125 2023-09-29 10:29:51,437 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 10:29:56,128 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.31 vs. limit=22.5 2023-09-29 10:29:56,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:29:56,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:29:58,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 10:29:58,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:29:58,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:30:00,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 10:30:00,380 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=336186.6666666667, ans=0.125 2023-09-29 10:30:03,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 10:30:03,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:30:05,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:30:09,782 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 10:30:09,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:30:09,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 10:30:16,547 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=336253.3333333333, ans=0.0 2023-09-29 10:30:16,903 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.71 vs. limit=15.0 2023-09-29 10:30:17,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:30:19,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:30:19,289 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 10:30:19,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:30:22,410 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:30:23,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:30:30,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 10:30:32,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:30:33,653 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 10:30:38,154 INFO [train.py:1039] (2/4) Epoch 10, batch 2650, loss[loss=0.1838, simple_loss=0.253, pruned_loss=0.05734, over 24327.00 frames. ], tot_loss[loss=0.2079, simple_loss=0.2766, pruned_loss=0.06957, over 4710790.08 frames. ], batch size: 56, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:30:40,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 10:30:40,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:30:41,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 10:30:41,241 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 10:30:42,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:30:45,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:30:48,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 10:30:50,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:30:52,572 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:30:55,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 10:30:55,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 10:30:55,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:30:58,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 10:30:58,802 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 10:31:01,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:31:05,540 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 10:31:05,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:31:06,965 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 10:31:08,872 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=336453.3333333333, ans=0.125 2023-09-29 10:31:11,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:31:11,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 10:31:11,592 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:31:11,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:31:16,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 10:31:16,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 10:31:22,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 10:31:24,500 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 10:31:24,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:31:25,987 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:31:26,050 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 10:31:26,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:31:26,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:31:28,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:31:30,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:31:30,236 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:31:31,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 10:31:33,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:31:34,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:31:35,025 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=336586.6666666667, ans=0.125 2023-09-29 10:31:36,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 10:31:39,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:31:41,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:31:41,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 10:31:45,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:31:45,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:31:45,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:31:46,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 10:31:50,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:31:53,973 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:31:54,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:31:56,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:31:57,509 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.958e+02 2.205e+02 2.606e+02 4.713e+02, threshold=4.410e+02, percent-clipped=1.0 2023-09-29 10:31:57,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 10:31:59,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:31:59,432 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=336720.0, ans=0.125 2023-09-29 10:31:59,549 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=336720.0, ans=0.1 2023-09-29 10:32:00,542 INFO [train.py:1039] (2/4) Epoch 10, batch 2700, loss[loss=0.1978, simple_loss=0.2758, pruned_loss=0.05985, over 24648.00 frames. ], tot_loss[loss=0.2076, simple_loss=0.2768, pruned_loss=0.06924, over 4723533.11 frames. ], batch size: 65, lr: 1.04e-02, grad_scale: 8.0 2023-09-29 10:32:02,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:32:02,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 10:32:04,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:32:06,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 10:32:07,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:32:09,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:32:09,278 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:32:10,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:32:10,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:32:12,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:32:12,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 10:32:12,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 10:32:12,322 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:32:14,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 10:32:16,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 10:32:16,128 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:32:20,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:32:20,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 10:32:22,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:32:26,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:32:27,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:32:34,108 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:32:34,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:32:34,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:32:34,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:32:34,499 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=336853.3333333333, ans=0.125 2023-09-29 10:32:37,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:32:42,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:32:42,357 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:32:42,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:32:49,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:32:49,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:32:55,770 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=336920.0, ans=0.2 2023-09-29 10:32:58,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:32:58,557 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:33:01,672 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:33:01,676 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:33:05,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:33:06,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:33:06,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:33:07,272 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=336986.6666666667, ans=0.125 2023-09-29 10:33:07,299 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=336986.6666666667, ans=0.125 2023-09-29 10:33:09,225 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:33:12,009 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:33:12,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:33:13,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:33:15,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:33:15,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:33:19,097 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=336986.6666666667, ans=0.125 2023-09-29 10:33:20,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 10:33:21,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:33:22,212 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=337053.3333333333, ans=0.125 2023-09-29 10:33:23,837 INFO [train.py:1039] (2/4) Epoch 10, batch 2750, loss[loss=0.2093, simple_loss=0.255, pruned_loss=0.08178, over 19426.00 frames. ], tot_loss[loss=0.2072, simple_loss=0.2762, pruned_loss=0.06907, over 4726466.39 frames. ], batch size: 388, lr: 1.04e-02, grad_scale: 8.0 2023-09-29 10:33:23,967 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:33:23,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 10:33:25,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 10:33:25,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:33:26,105 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=337053.3333333333, ans=0.1 2023-09-29 10:33:28,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:33:28,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:33:30,484 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=337053.3333333333, ans=0.09899494936611666 2023-09-29 10:33:32,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:33:32,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 10:33:33,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:33:36,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:33:36,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 10:33:38,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:33:38,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:33:38,161 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 10:33:38,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:33:38,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:33:43,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 10:33:46,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:33:47,222 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.94 vs. limit=15.0 2023-09-29 10:33:48,015 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:33:48,130 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:33:48,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 10:33:49,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:33:51,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:33:51,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:33:53,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:33:57,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 10:33:57,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 10:33:57,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 10:33:58,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:33:58,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 10:33:59,568 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=6.34 vs. limit=6.0 2023-09-29 10:34:06,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:34:08,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 10:34:08,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:34:13,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:34:13,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 10:34:14,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 10:34:21,204 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:34:21,493 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=337253.3333333333, ans=0.1 2023-09-29 10:34:22,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:34:22,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 10:34:26,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:34:29,350 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys.whitening_limit, batch_count=337320.0, ans=6.0 2023-09-29 10:34:30,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 10:34:34,940 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 10:34:37,908 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:34:37,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 10:34:38,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:34:41,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:34:41,287 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 10:34:41,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:34:42,614 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.677e+02 2.063e+02 2.307e+02 2.543e+02 4.120e+02, threshold=4.614e+02, percent-clipped=0.0 2023-09-29 10:34:46,146 INFO [train.py:1039] (2/4) Epoch 10, batch 2800, loss[loss=0.201, simple_loss=0.2571, pruned_loss=0.0724, over 23816.00 frames. ], tot_loss[loss=0.2067, simple_loss=0.2754, pruned_loss=0.06902, over 4726617.24 frames. ], batch size: 195, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:34:46,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 10:34:46,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:34:48,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:34:50,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 10:34:50,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:34:50,251 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:34:51,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:34:51,913 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 10:34:51,914 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 10:34:56,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:34:59,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 10:34:59,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:35:03,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:35:06,877 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 10:35:07,340 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=337453.3333333333, ans=0.125 2023-09-29 10:35:08,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 10:35:08,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 10:35:10,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:35:11,721 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:35:11,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:35:16,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:35:16,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:35:16,354 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 10:35:17,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:35:18,408 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=337520.0, ans=0.1 2023-09-29 10:35:26,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:35:28,368 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:35:29,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:35:31,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:35:32,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:35:39,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:35:39,085 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 10:35:40,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:35:40,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:35:40,822 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:35:41,116 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=337586.6666666667, ans=0.0 2023-09-29 10:35:46,941 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:35:47,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:35:51,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:35:53,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:35:53,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:35:53,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 10:35:53,526 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=337653.3333333333, ans=0.125 2023-09-29 10:35:54,866 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 10:35:54,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 10:35:55,123 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=337653.3333333333, ans=0.0 2023-09-29 10:35:55,609 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.59 vs. limit=6.0 2023-09-29 10:35:57,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:35:57,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 10:35:57,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:35:58,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:35:58,642 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:36:00,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 10:36:01,330 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.32 vs. limit=10.0 2023-09-29 10:36:02,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:36:02,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:36:03,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:36:05,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 10:36:10,394 INFO [train.py:1039] (2/4) Epoch 10, batch 2850, loss[loss=0.2066, simple_loss=0.2888, pruned_loss=0.06218, over 24550.00 frames. ], tot_loss[loss=0.205, simple_loss=0.2739, pruned_loss=0.06807, over 4726523.93 frames. ], batch size: 71, lr: 1.04e-02, grad_scale: 8.0 2023-09-29 10:36:12,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:36:12,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 10:36:13,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:36:15,136 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:36:18,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:36:20,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:36:20,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:36:23,445 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:36:23,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:36:25,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:36:26,619 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 10:36:31,904 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 10:36:31,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:36:34,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 10:36:35,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:36:38,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 10:36:38,915 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 10:36:40,318 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:36:51,624 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=337853.3333333333, ans=0.125 2023-09-29 10:36:55,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:36:55,377 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=337853.3333333333, ans=0.1 2023-09-29 10:36:56,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:36:56,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:36:56,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 10:36:56,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 10:36:58,069 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:37:00,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 10:37:00,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 10:37:01,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 10:37:01,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:37:03,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:37:03,612 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=337920.0, ans=0.125 2023-09-29 10:37:04,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:37:06,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:37:06,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:37:09,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:37:11,431 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:37:14,319 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:37:16,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:37:16,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:37:16,815 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=337986.6666666667, ans=0.125 2023-09-29 10:37:18,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:37:24,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:37:25,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 10:37:25,606 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 10:37:25,914 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=337986.6666666667, ans=0.0 2023-09-29 10:37:27,281 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 10:37:28,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:37:28,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 10:37:28,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:37:29,077 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 10:37:30,085 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 2.039e+02 2.270e+02 2.590e+02 3.840e+02, threshold=4.540e+02, percent-clipped=0.0 2023-09-29 10:37:30,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:37:30,283 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:37:32,250 INFO [train.py:1039] (2/4) Epoch 10, batch 2900, loss[loss=0.2116, simple_loss=0.278, pruned_loss=0.07259, over 23714.00 frames. ], tot_loss[loss=0.2056, simple_loss=0.2736, pruned_loss=0.06879, over 4701373.70 frames. ], batch size: 149, lr: 1.04e-02, grad_scale: 8.0 2023-09-29 10:37:32,319 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:37:32,320 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 10:37:32,377 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 10:37:32,382 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:37:32,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:37:36,676 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.38 vs. limit=15.0 2023-09-29 10:37:37,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 10:37:37,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:37:37,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:37:38,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 10:37:43,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:37:43,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 10:37:45,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 10:37:47,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:37:47,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:37:49,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:37:51,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:37:54,367 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=338120.0, ans=0.2 2023-09-29 10:37:55,590 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:37:55,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:37:57,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 10:37:57,628 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=338120.0, ans=0.0 2023-09-29 10:37:58,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 10:37:58,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 10:38:00,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:38:03,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 10:38:05,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 10:38:07,119 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:38:07,123 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 10:38:07,149 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:38:08,787 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:38:08,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 10:38:11,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:38:13,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:38:15,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:38:18,585 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:38:20,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 10:38:20,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 10:38:20,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:38:25,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:38:27,157 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=338253.3333333333, ans=0.0 2023-09-29 10:38:28,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 10:38:28,552 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:38:33,292 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=338253.3333333333, ans=0.125 2023-09-29 10:38:34,606 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:38:44,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:38:44,442 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:38:45,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 10:38:49,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:38:49,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 10:38:49,211 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:38:50,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 10:38:55,100 INFO [train.py:1039] (2/4) Epoch 10, batch 2950, loss[loss=0.2192, simple_loss=0.2937, pruned_loss=0.07235, over 23788.00 frames. ], tot_loss[loss=0.2062, simple_loss=0.2749, pruned_loss=0.06869, over 4715395.77 frames. ], batch size: 85, lr: 1.04e-02, grad_scale: 8.0 2023-09-29 10:38:55,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:38:58,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 10:38:58,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:38:58,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:39:01,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:39:02,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:39:04,282 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 10:39:04,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 10:39:04,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 10:39:04,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:39:09,946 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=338453.3333333333, ans=0.125 2023-09-29 10:39:13,124 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:39:14,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:39:16,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:39:16,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:39:19,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:39:19,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:39:21,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:39:21,760 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.41 vs. limit=22.5 2023-09-29 10:39:22,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:39:24,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:39:25,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 10:39:31,254 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 10:39:31,307 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 10:39:32,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:39:34,392 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 10:39:35,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 10:39:35,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:39:36,054 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=338520.0, ans=0.1 2023-09-29 10:39:37,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:39:37,448 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 10:39:37,455 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 10:39:40,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 10:39:41,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:39:42,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:39:46,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:39:47,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:39:47,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:39:49,054 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 10:39:50,433 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:39:50,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 10:39:54,389 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.15 vs. limit=15.0 2023-09-29 10:39:56,811 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:39:58,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:39:58,546 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=338653.3333333333, ans=0.2 2023-09-29 10:39:59,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 10:39:59,804 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:40:03,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 10:40:04,586 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.33 vs. limit=22.5 2023-09-29 10:40:07,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:40:08,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:40:08,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:40:10,555 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:40:10,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 10:40:13,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:40:13,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:40:13,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 10:40:13,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 10:40:15,014 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.713e+02 2.095e+02 2.360e+02 2.809e+02 3.974e+02, threshold=4.720e+02, percent-clipped=0.0 2023-09-29 10:40:15,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:40:16,514 INFO [train.py:1039] (2/4) Epoch 10, batch 3000, loss[loss=0.1952, simple_loss=0.2746, pruned_loss=0.05791, over 24530.00 frames. ], tot_loss[loss=0.2061, simple_loss=0.2754, pruned_loss=0.0684, over 4720850.21 frames. ], batch size: 63, lr: 1.04e-02, grad_scale: 8.0 2023-09-29 10:40:16,514 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-29 10:40:31,293 INFO [train.py:1071] (2/4) Epoch 10, validation: loss=0.2858, simple_loss=0.2843, pruned_loss=0.1436, over 1125622.00 frames. 2023-09-29 10:40:31,294 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21046MB 2023-09-29 10:40:31,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:40:33,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:40:33,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 10:40:34,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:40:38,744 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:40:38,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 10:40:43,311 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 10:40:43,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 10:40:46,283 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:40:46,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 10:40:47,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 10:40:49,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:40:55,648 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 10:41:02,692 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=338853.3333333333, ans=0.125 2023-09-29 10:41:06,106 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:41:14,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 10:41:16,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:41:19,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 10:41:19,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:41:19,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:41:21,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:41:21,028 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 10:41:22,610 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 10:41:24,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:41:24,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 10:41:24,481 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=338920.0, ans=0.125 2023-09-29 10:41:28,482 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:41:28,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:41:28,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:41:28,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:41:35,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 10:41:35,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:41:35,319 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:41:36,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:41:39,137 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 10:41:40,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:41:40,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:41:40,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:41:42,509 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=338986.6666666667, ans=0.1 2023-09-29 10:41:45,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:41:45,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:41:47,333 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 10:41:47,394 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 10:41:47,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:41:47,604 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=338986.6666666667, ans=0.2 2023-09-29 10:41:49,467 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 10:41:49,552 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:41:52,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 10:41:54,024 INFO [train.py:1039] (2/4) Epoch 10, batch 3050, loss[loss=0.2197, simple_loss=0.2763, pruned_loss=0.08153, over 22714.00 frames. ], tot_loss[loss=0.2067, simple_loss=0.2761, pruned_loss=0.06865, over 4720296.83 frames. ], batch size: 322, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:41:54,189 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 10:41:54,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 10:41:54,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 10:41:55,816 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 10:41:55,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 10:41:57,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:41:57,902 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.40 vs. limit=22.5 2023-09-29 10:41:58,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:41:58,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 10:41:58,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:42:00,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:42:01,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 10:42:05,414 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:42:06,022 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.36 vs. limit=15.0 2023-09-29 10:42:07,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:42:07,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:42:10,237 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:42:11,002 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.11 vs. limit=22.5 2023-09-29 10:42:15,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 10:42:22,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 10:42:22,560 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 10:42:22,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:42:27,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:42:30,255 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:42:30,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:42:31,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:42:34,147 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.57 vs. limit=12.0 2023-09-29 10:42:34,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:42:34,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 10:42:34,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:42:36,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:42:36,970 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:42:37,119 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:42:38,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:42:40,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:42:41,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 10:42:43,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:42:43,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 10:42:46,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:42:46,557 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 10:42:48,023 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:42:48,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:42:53,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:42:55,985 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:43:01,009 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=339320.0, ans=0.125 2023-09-29 10:43:02,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:43:02,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:43:02,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:43:03,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:43:05,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 10:43:05,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:43:05,737 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=339320.0, ans=0.0 2023-09-29 10:43:07,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 10:43:09,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:43:09,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:43:10,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 10:43:12,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:43:15,089 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 1.990e+02 2.334e+02 2.687e+02 4.208e+02, threshold=4.668e+02, percent-clipped=0.0 2023-09-29 10:43:16,558 INFO [train.py:1039] (2/4) Epoch 10, batch 3100, loss[loss=0.239, simple_loss=0.2855, pruned_loss=0.09627, over 19846.00 frames. ], tot_loss[loss=0.2078, simple_loss=0.2762, pruned_loss=0.06968, over 4700065.96 frames. ], batch size: 388, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:43:18,293 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:43:19,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:43:21,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 10:43:21,819 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=339386.6666666667, ans=0.0 2023-09-29 10:43:25,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 10:43:28,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 10:43:28,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 10:43:29,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 10:43:31,716 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=339453.3333333333, ans=0.0 2023-09-29 10:43:34,510 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:43:34,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:43:36,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 10:43:38,610 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=339453.3333333333, ans=0.125 2023-09-29 10:43:40,054 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=339453.3333333333, ans=10.0 2023-09-29 10:43:41,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:43:47,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 10:43:50,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 10:43:50,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:43:52,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:43:52,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:43:53,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 10:43:55,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:43:55,265 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 10:43:55,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:43:56,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:43:57,578 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.72 vs. limit=15.0 2023-09-29 10:43:58,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 10:44:00,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:44:04,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:44:05,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 10:44:05,789 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=339586.6666666667, ans=0.0 2023-09-29 10:44:07,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 10:44:08,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:44:08,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:44:10,218 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:44:10,237 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:44:10,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:44:12,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 10:44:12,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:44:14,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:44:15,602 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:44:15,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:44:15,616 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 10:44:18,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:44:20,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 10:44:20,769 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=339653.3333333333, ans=0.0 2023-09-29 10:44:23,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:44:23,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 10:44:24,000 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=339653.3333333333, ans=0.07 2023-09-29 10:44:25,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:44:25,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:44:25,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 10:44:37,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 10:44:38,895 INFO [train.py:1039] (2/4) Epoch 10, batch 3150, loss[loss=0.2031, simple_loss=0.2647, pruned_loss=0.07079, over 23932.00 frames. ], tot_loss[loss=0.2069, simple_loss=0.2751, pruned_loss=0.06937, over 4690183.69 frames. ], batch size: 195, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:44:40,780 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=339720.0, ans=0.2 2023-09-29 10:44:41,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:44:41,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:44:43,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:44:43,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:44:45,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 10:44:45,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:44:47,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 10:44:48,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 10:44:48,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:44:51,722 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 10:44:54,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 10:44:56,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:44:56,341 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 10:44:57,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 10:44:58,475 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.23 vs. limit=6.0 2023-09-29 10:44:59,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 10:44:59,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 10:44:59,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 10:44:59,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:44:59,473 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:45:00,921 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:45:04,436 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 10:45:06,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:45:06,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:45:06,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:45:09,944 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 10:45:13,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 10:45:14,507 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:45:16,253 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 10:45:16,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:45:17,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 10:45:22,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 10:45:22,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:45:24,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 10:45:24,083 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 10:45:24,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:45:24,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:45:25,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 10:45:25,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 10:45:27,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 10:45:27,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 10:45:27,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:45:28,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:45:30,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:45:31,706 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 10:45:31,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:45:33,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 10:45:34,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:45:34,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 10:45:37,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 10:45:38,662 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:45:40,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:45:40,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 10:45:42,232 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 10:45:43,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:45:47,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:45:49,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:45:49,225 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:45:52,605 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=339986.6666666667, ans=0.95 2023-09-29 10:45:55,896 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:45:55,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:45:57,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 10:45:58,674 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.693e+02 1.980e+02 2.239e+02 2.536e+02 3.813e+02, threshold=4.479e+02, percent-clipped=0.0 2023-09-29 10:45:59,706 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.00 vs. limit=15.0 2023-09-29 10:46:00,293 INFO [train.py:1039] (2/4) Epoch 10, batch 3200, loss[loss=0.1816, simple_loss=0.2505, pruned_loss=0.0563, over 21388.00 frames. ], tot_loss[loss=0.2055, simple_loss=0.2739, pruned_loss=0.06854, over 4683104.75 frames. ], batch size: 47, lr: 1.03e-02, grad_scale: 16.0 2023-09-29 10:46:00,737 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=340053.3333333333, ans=0.125 2023-09-29 10:46:02,141 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.min_positive, batch_count=340053.3333333333, ans=0.025 2023-09-29 10:46:03,706 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=340053.3333333333, ans=0.125 2023-09-29 10:46:04,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:46:05,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 10:46:07,011 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=340053.3333333333, ans=0.125 2023-09-29 10:46:08,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:46:10,345 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:46:10,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 10:46:11,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:46:17,415 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 10:46:21,133 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:46:26,438 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=340120.0, ans=0.125 2023-09-29 10:46:31,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:46:42,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 10:46:42,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:46:44,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 10:46:45,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 10:46:49,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:46:49,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 10:46:51,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:46:55,319 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=340253.3333333333, ans=0.0 2023-09-29 10:46:56,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 10:46:58,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 10:46:59,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 10:47:01,826 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 10:47:02,678 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.97 vs. limit=6.0 2023-09-29 10:47:04,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 10:47:09,524 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:47:10,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:47:11,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:47:12,482 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 10:47:12,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 10:47:15,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:47:18,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 10:47:18,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 10:47:19,442 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.28 vs. limit=6.0 2023-09-29 10:47:20,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 10:47:21,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 10:47:23,682 INFO [train.py:1039] (2/4) Epoch 10, batch 3250, loss[loss=0.195, simple_loss=0.2728, pruned_loss=0.05858, over 24130.00 frames. ], tot_loss[loss=0.2056, simple_loss=0.2735, pruned_loss=0.06881, over 4676392.64 frames. ], batch size: 80, lr: 1.03e-02, grad_scale: 16.0 2023-09-29 10:47:23,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:47:29,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 10:47:29,620 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 10:47:29,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:47:29,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:47:29,967 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=340386.6666666667, ans=0.0 2023-09-29 10:47:32,542 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 10:47:32,881 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=340386.6666666667, ans=0.125 2023-09-29 10:47:37,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:47:41,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:47:41,395 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 10:47:47,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:47:47,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 10:47:48,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:47:48,690 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:47:48,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:47:50,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:47:50,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 10:47:53,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:47:54,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:47:56,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:47:56,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:47:56,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:47:56,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:47:59,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:48:00,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:48:03,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:48:03,885 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:48:04,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:48:05,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:48:05,408 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:48:11,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 10:48:11,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:48:11,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:48:12,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:48:14,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 10:48:19,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 10:48:28,104 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:48:29,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:48:29,539 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 10:48:29,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:48:29,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 10:48:29,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:48:31,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 10:48:31,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 10:48:33,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:48:34,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:48:36,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:48:36,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 10:48:37,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:48:42,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:48:42,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:48:44,368 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.513e+02 2.128e+02 2.401e+02 2.998e+02 4.766e+02, threshold=4.802e+02, percent-clipped=1.0 2023-09-29 10:48:44,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 10:48:44,532 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:48:46,475 INFO [train.py:1039] (2/4) Epoch 10, batch 3300, loss[loss=0.2155, simple_loss=0.2968, pruned_loss=0.06714, over 24442.00 frames. ], tot_loss[loss=0.2063, simple_loss=0.2747, pruned_loss=0.06893, over 4681919.17 frames. ], batch size: 69, lr: 1.03e-02, grad_scale: 16.0 2023-09-29 10:48:46,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:48:46,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 10:48:49,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:48:51,275 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 10:48:52,883 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 10:48:52,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 10:48:53,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:48:53,314 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=340720.0, ans=0.2 2023-09-29 10:48:56,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:48:57,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:48:57,879 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:48:59,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 10:49:00,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 10:49:03,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:49:05,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:49:10,770 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 10:49:12,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:49:12,172 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:49:14,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:49:15,755 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 10:49:17,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:49:17,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 10:49:19,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 10:49:19,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:49:20,918 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 10:49:22,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:49:22,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 10:49:24,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:49:24,360 WARNING [train.py:1197] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 10:49:25,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 10:49:25,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:49:27,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:49:30,398 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 10:49:33,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 10:49:33,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:49:35,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 10:49:38,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:49:39,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 10:49:41,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:49:45,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:49:45,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:49:45,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:49:45,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 10:49:48,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:49:48,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:49:49,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:49:49,365 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=340920.0, ans=0.125 2023-09-29 10:49:49,376 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=340920.0, ans=0.125 2023-09-29 10:49:50,611 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 10:49:52,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 10:49:55,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 10:49:55,452 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=340986.6666666667, ans=0.09899494936611666 2023-09-29 10:49:57,193 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:49:57,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:49:58,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:49:58,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:50:00,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 10:50:00,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:50:00,400 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 10:50:00,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:50:03,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 10:50:05,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 10:50:05,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:50:06,743 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:50:08,233 INFO [train.py:1039] (2/4) Epoch 10, batch 3350, loss[loss=0.2984, simple_loss=0.3352, pruned_loss=0.1308, over 19296.00 frames. ], tot_loss[loss=0.2064, simple_loss=0.2753, pruned_loss=0.0687, over 4693566.78 frames. ], batch size: 388, lr: 1.03e-02, grad_scale: 16.0 2023-09-29 10:50:08,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 10:50:08,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:50:09,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:50:11,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:50:11,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:50:15,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:50:17,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:50:19,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:50:23,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:50:23,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:50:26,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:50:27,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:50:29,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 10:50:29,667 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=341120.0, ans=0.0 2023-09-29 10:50:30,643 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 10:50:30,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:50:32,332 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=341120.0, ans=0.125 2023-09-29 10:50:35,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 10:50:35,044 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 10:50:35,194 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:50:35,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:50:38,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:50:38,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 10:50:38,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:50:38,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:50:41,203 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:50:42,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:50:44,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:50:45,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:50:49,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:50:52,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:50:52,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:50:56,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:50:56,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:51:00,019 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:51:00,035 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:51:01,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:51:05,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 10:51:05,125 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 10:51:05,164 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 10:51:05,226 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:51:06,712 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 10:51:08,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:51:09,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:51:10,540 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.49 vs. limit=6.0 2023-09-29 10:51:16,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:51:17,931 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 10:51:18,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 10:51:20,910 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:51:22,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:51:28,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:51:30,303 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.651e+02 2.041e+02 2.250e+02 2.635e+02 4.628e+02, threshold=4.499e+02, percent-clipped=0.0 2023-09-29 10:51:30,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 10:51:31,823 INFO [train.py:1039] (2/4) Epoch 10, batch 3400, loss[loss=0.2947, simple_loss=0.3344, pruned_loss=0.1275, over 19567.00 frames. ], tot_loss[loss=0.208, simple_loss=0.2766, pruned_loss=0.06969, over 4691015.86 frames. ], batch size: 388, lr: 1.03e-02, grad_scale: 16.0 2023-09-29 10:51:31,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 10:51:31,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:51:33,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:51:34,113 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=341386.6666666667, ans=0.125 2023-09-29 10:51:35,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 10:51:37,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:51:37,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 10:51:38,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:51:40,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:51:40,371 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 10:51:41,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:51:41,896 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 10:51:43,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 10:51:43,875 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 10:51:45,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:51:50,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:51:50,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 10:51:51,365 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:51:52,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 10:51:59,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:51:59,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 10:51:59,826 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=341453.3333333333, ans=0.2 2023-09-29 10:52:04,475 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:52:08,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:52:09,913 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:52:11,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 10:52:16,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:52:19,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 10:52:24,520 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:52:25,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:52:25,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 10:52:25,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:52:27,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:52:27,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:52:27,612 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=341586.6666666667, ans=0.125 2023-09-29 10:52:28,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:52:32,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:52:35,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 10:52:35,678 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:52:39,595 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=341653.3333333333, ans=0.05 2023-09-29 10:52:39,624 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=341653.3333333333, ans=0.1 2023-09-29 10:52:42,950 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:52:45,176 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 10:52:45,399 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=341653.3333333333, ans=0.0 2023-09-29 10:52:51,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 10:52:53,942 INFO [train.py:1039] (2/4) Epoch 10, batch 3450, loss[loss=0.1962, simple_loss=0.2787, pruned_loss=0.05684, over 24554.00 frames. ], tot_loss[loss=0.2072, simple_loss=0.2763, pruned_loss=0.06909, over 4707435.42 frames. ], batch size: 71, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:52:55,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 10:53:00,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 10:53:00,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:53:01,729 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:53:01,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 10:53:03,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:53:05,304 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=12.25 vs. limit=15.0 2023-09-29 10:53:06,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 10:53:10,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:53:12,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:53:13,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 10:53:14,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:53:16,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:53:24,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 10:53:28,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 10:53:28,888 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 10:53:28,959 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:53:30,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:53:36,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 10:53:37,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:53:41,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:53:41,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:53:43,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 10:53:45,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:53:46,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 10:53:46,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:53:50,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:53:52,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:53:55,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 10:53:59,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:54:01,827 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=341986.6666666667, ans=0.2 2023-09-29 10:54:04,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:54:04,941 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=341986.6666666667, ans=0.125 2023-09-29 10:54:06,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:54:09,148 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:54:12,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:54:12,821 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:54:12,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:54:12,979 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:54:16,391 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.481e+02 2.010e+02 2.253e+02 2.520e+02 3.608e+02, threshold=4.507e+02, percent-clipped=0.0 2023-09-29 10:54:16,434 INFO [train.py:1039] (2/4) Epoch 10, batch 3500, loss[loss=0.2287, simple_loss=0.2916, pruned_loss=0.08293, over 23282.00 frames. ], tot_loss[loss=0.2064, simple_loss=0.2751, pruned_loss=0.0689, over 4702320.35 frames. ], batch size: 105, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:54:18,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:54:21,367 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:54:23,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 10:54:25,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 10:54:28,693 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 10:54:31,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:54:31,663 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 10:54:33,920 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=15.84 vs. limit=15.0 2023-09-29 10:54:35,039 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:54:36,516 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:54:37,127 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.04 vs. limit=6.0 2023-09-29 10:54:38,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:54:38,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:54:38,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 10:54:39,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:54:39,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:54:39,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 10:54:42,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:54:44,077 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 10:54:44,299 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=342120.0, ans=0.125 2023-09-29 10:54:46,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:54:51,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:54:51,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 10:54:51,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:54:54,405 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:54:55,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:54:58,008 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:54:59,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 10:54:59,658 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:55:01,235 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 10:55:01,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 10:55:02,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 10:55:04,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:55:05,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:55:05,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:55:07,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:55:10,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 10:55:10,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:55:15,286 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:55:16,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 10:55:16,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 10:55:16,789 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:55:20,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:55:20,685 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:55:22,224 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:55:25,263 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 10:55:26,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:55:26,986 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:55:29,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 10:55:30,635 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 10:55:33,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:55:35,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:55:35,065 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:55:35,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:55:38,067 INFO [train.py:1039] (2/4) Epoch 10, batch 3550, loss[loss=0.1732, simple_loss=0.2501, pruned_loss=0.04819, over 24317.00 frames. ], tot_loss[loss=0.2056, simple_loss=0.274, pruned_loss=0.06857, over 4706635.32 frames. ], batch size: 61, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:55:39,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:55:41,783 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=342386.6666666667, ans=0.125 2023-09-29 10:55:44,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:55:47,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 10:55:50,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:55:50,402 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=342386.6666666667, ans=0.0 2023-09-29 10:55:52,199 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:55:55,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:55:55,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:55:56,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 10:55:57,002 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_positive, batch_count=342453.3333333333, ans=0.05 2023-09-29 10:55:57,381 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=10.63 vs. limit=12.0 2023-09-29 10:56:00,234 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:56:00,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:56:00,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:56:00,379 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 10:56:02,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 10:56:05,065 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.95 vs. limit=15.0 2023-09-29 10:56:08,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:56:08,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:56:10,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:56:10,048 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:56:11,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:56:11,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 10:56:11,748 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:56:13,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:56:14,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 10:56:19,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:56:19,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:56:21,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:56:24,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 10:56:24,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:56:26,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 10:56:27,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:56:30,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:56:30,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:56:30,461 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 10:56:30,889 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=12.55 vs. limit=15.0 2023-09-29 10:56:33,207 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 10:56:35,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:56:40,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:56:42,322 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 10:56:42,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:56:47,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:56:48,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 10:56:55,583 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 10:56:55,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:56:55,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:56:56,321 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.61 vs. limit=15.0 2023-09-29 10:56:57,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:56:58,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:56:58,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:57:00,671 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=342720.0, ans=0.125 2023-09-29 10:57:02,292 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.640e+02 2.051e+02 2.303e+02 2.669e+02 4.347e+02, threshold=4.606e+02, percent-clipped=0.0 2023-09-29 10:57:02,333 INFO [train.py:1039] (2/4) Epoch 10, batch 3600, loss[loss=0.2118, simple_loss=0.2928, pruned_loss=0.06544, over 24098.00 frames. ], tot_loss[loss=0.2052, simple_loss=0.2738, pruned_loss=0.06825, over 4697999.39 frames. ], batch size: 80, lr: 1.03e-02, grad_scale: 16.0 2023-09-29 10:57:03,264 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=5.66 vs. limit=12.0 2023-09-29 10:57:04,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:57:05,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:57:07,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:57:07,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:57:09,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:57:09,338 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 10:57:14,335 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 10:57:15,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:57:18,973 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 10:57:20,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:57:20,552 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_na.min_abs, batch_count=342786.6666666667, ans=0.02 2023-09-29 10:57:23,313 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:57:24,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 10:57:24,890 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:57:26,258 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 10:57:26,364 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:57:28,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:57:29,483 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:57:30,555 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2.whitening_limit, batch_count=342786.6666666667, ans=15.0 2023-09-29 10:57:31,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:57:32,893 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.44 vs. limit=15.0 2023-09-29 10:57:33,344 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:57:33,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:57:36,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 10:57:40,707 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.84 vs. limit=10.0 2023-09-29 10:57:43,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:57:45,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 10:57:45,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 10:57:49,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:57:53,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:57:56,749 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:58:01,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:58:01,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:58:01,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 10:58:02,250 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.68 vs. limit=15.0 2023-09-29 10:58:05,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 10:58:05,372 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 10:58:08,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:58:08,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:58:10,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 10:58:11,837 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:58:11,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:58:11,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:58:13,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 10:58:14,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 10:58:18,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:58:18,893 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 10:58:25,481 INFO [train.py:1039] (2/4) Epoch 10, batch 3650, loss[loss=0.2202, simple_loss=0.2953, pruned_loss=0.07254, over 24561.00 frames. ], tot_loss[loss=0.2053, simple_loss=0.2744, pruned_loss=0.06813, over 4707670.75 frames. ], batch size: 71, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:58:25,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 10:58:27,131 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:58:27,580 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=343053.3333333333, ans=0.125 2023-09-29 10:58:30,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 10:58:33,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 10:58:36,242 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:58:36,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:58:36,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 10:58:41,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 10:58:41,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:58:42,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 10:58:44,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 10:58:45,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:58:45,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 10:58:46,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 10:58:46,848 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=343120.0, ans=0.0 2023-09-29 10:58:48,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:58:48,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:58:49,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:58:52,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 10:58:55,033 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 10:58:55,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:58:56,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 10:58:59,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:58:59,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 10:59:02,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:59:04,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:59:04,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 10:59:06,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:59:06,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:59:07,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:59:12,839 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:59:14,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:59:14,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:59:17,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 10:59:19,288 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:59:19,383 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:59:22,688 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=343253.3333333333, ans=0.0 2023-09-29 10:59:26,304 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 10:59:28,045 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=343253.3333333333, ans=0.125 2023-09-29 10:59:29,457 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:59:29,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:59:31,021 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 10:59:31,115 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:59:33,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:59:35,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:59:36,158 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=343320.0, ans=0.125 2023-09-29 10:59:37,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 10:59:37,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:59:39,259 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=343320.0, ans=0.125 2023-09-29 10:59:40,660 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 10:59:41,440 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.38 vs. limit=22.5 2023-09-29 10:59:43,781 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:59:43,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:59:47,313 INFO [train.py:1039] (2/4) Epoch 10, batch 3700, loss[loss=0.204, simple_loss=0.2826, pruned_loss=0.06272, over 24454.00 frames. ], tot_loss[loss=0.2057, simple_loss=0.2749, pruned_loss=0.06824, over 4715284.18 frames. ], batch size: 69, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:59:47,407 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:59:47,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 10:59:47,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:59:48,843 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.980e+02 2.218e+02 2.465e+02 4.377e+02, threshold=4.435e+02, percent-clipped=0.0 2023-09-29 10:59:48,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 10:59:49,031 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 10:59:53,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 10:59:55,960 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=343386.6666666667, ans=0.125 2023-09-29 10:59:57,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:59:57,376 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=343386.6666666667, ans=0.125 2023-09-29 10:59:57,474 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=343386.6666666667, ans=0.1 2023-09-29 10:59:58,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:59:58,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:59:58,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:00:00,366 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 11:00:02,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:00:04,325 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 11:00:13,067 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.68 vs. limit=15.0 2023-09-29 11:00:13,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:00:15,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 11:00:17,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:00:17,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 11:00:17,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:00:20,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:00:20,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 11:00:22,100 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:00:22,459 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=343520.0, ans=0.1 2023-09-29 11:00:23,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:00:25,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:00:25,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 11:00:28,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 11:00:32,613 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:00:32,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 11:00:34,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:00:34,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 11:00:39,613 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=343586.6666666667, ans=0.2 2023-09-29 11:00:40,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:00:40,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:00:44,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:00:44,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 11:00:44,679 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=343586.6666666667, ans=0.125 2023-09-29 11:00:45,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:00:45,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 11:00:47,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:00:47,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:00:50,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:00:52,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 11:00:53,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 11:00:55,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:00:55,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:00:57,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:00:58,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 11:01:01,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:01:03,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:01:04,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:01:07,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 11:01:07,363 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=343653.3333333333, ans=0.125 2023-09-29 11:01:08,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 11:01:10,105 INFO [train.py:1039] (2/4) Epoch 10, batch 3750, loss[loss=0.2163, simple_loss=0.2759, pruned_loss=0.07839, over 23634.00 frames. ], tot_loss[loss=0.2067, simple_loss=0.276, pruned_loss=0.06875, over 4721899.98 frames. ], batch size: 256, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 11:01:11,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 11:01:11,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 11:01:13,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:01:15,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:01:17,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:01:17,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:01:22,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:01:26,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:01:28,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 11:01:30,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:01:31,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:01:33,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 11:01:34,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:01:35,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:01:35,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:01:38,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 11:01:43,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 11:01:45,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:01:45,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:01:47,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:01:52,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:01:53,139 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=343853.3333333333, ans=0.125 2023-09-29 11:01:55,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 11:01:58,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 11:02:00,767 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=343920.0, ans=0.2 2023-09-29 11:02:02,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:02:08,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:02:08,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:02:11,687 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 11:02:15,048 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=343986.6666666667, ans=0.125 2023-09-29 11:02:15,422 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=16.45 vs. limit=15.0 2023-09-29 11:02:16,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 11:02:17,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 11:02:19,927 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=343986.6666666667, ans=0.125 2023-09-29 11:02:21,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:02:23,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:02:25,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 11:02:30,860 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=343986.6666666667, ans=0.2 2023-09-29 11:02:33,596 INFO [train.py:1039] (2/4) Epoch 10, batch 3800, loss[loss=0.2149, simple_loss=0.2937, pruned_loss=0.06804, over 23967.00 frames. ], tot_loss[loss=0.2069, simple_loss=0.2758, pruned_loss=0.06904, over 4715844.94 frames. ], batch size: 80, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 11:02:35,050 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 2.030e+02 2.447e+02 3.016e+02 6.033e+02, threshold=4.894e+02, percent-clipped=2.0 2023-09-29 11:02:35,336 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:02:38,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:02:38,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 11:02:40,260 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 11:02:40,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:02:43,912 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:02:44,035 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 11:02:45,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 11:02:45,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:02:46,980 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:02:48,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:02:48,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:02:50,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:02:50,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 11:02:53,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 11:02:53,968 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:02:58,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:03:02,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:03:03,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:03:05,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 11:03:05,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:03:08,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:03:11,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:03:14,958 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=344186.6666666667, ans=0.0 2023-09-29 11:03:16,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 11:03:16,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 11:03:17,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:03:23,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:03:30,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:03:33,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 11:03:34,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 11:03:36,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:03:37,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:03:39,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:03:39,598 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=344320.0, ans=0.1 2023-09-29 11:03:41,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 11:03:44,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 11:03:44,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 11:03:44,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:03:45,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:03:50,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:03:52,670 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 11:03:54,431 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=344386.6666666667, ans=0.125 2023-09-29 11:03:55,615 INFO [train.py:1039] (2/4) Epoch 10, batch 3850, loss[loss=0.199, simple_loss=0.2734, pruned_loss=0.06236, over 24311.00 frames. ], tot_loss[loss=0.2068, simple_loss=0.2752, pruned_loss=0.06922, over 4704702.89 frames. ], batch size: 61, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 11:03:58,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:04:00,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 11:04:02,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:04:03,876 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:04:05,068 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 11:04:07,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 11:04:10,776 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:04:12,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 11:04:12,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 11:04:17,993 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:04:19,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:04:21,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:04:21,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:04:21,531 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=344453.3333333333, ans=0.0 2023-09-29 11:04:25,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:04:25,121 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:04:25,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:04:25,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 11:04:27,386 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.10 vs. limit=15.0 2023-09-29 11:04:28,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:04:31,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:04:31,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:04:31,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:04:33,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 11:04:33,530 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 11:04:33,739 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=344520.0, ans=0.125 2023-09-29 11:04:35,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:04:35,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:04:40,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:04:40,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:04:40,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 11:04:40,520 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=344520.0, ans=0.125 2023-09-29 11:04:43,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 11:04:44,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:04:47,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 11:04:49,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 11:04:50,207 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=344586.6666666667, ans=0.125 2023-09-29 11:04:54,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:04:55,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:04:57,984 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=344586.6666666667, ans=0.125 2023-09-29 11:04:59,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:05:01,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 11:05:04,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 11:05:07,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:05:07,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:05:11,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 11:05:11,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 11:05:12,529 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:05:14,059 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:05:14,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:05:14,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 11:05:15,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:05:17,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 11:05:17,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:05:19,074 INFO [train.py:1039] (2/4) Epoch 10, batch 3900, loss[loss=0.1867, simple_loss=0.2613, pruned_loss=0.05604, over 24329.00 frames. ], tot_loss[loss=0.2056, simple_loss=0.2744, pruned_loss=0.06838, over 4708230.89 frames. ], batch size: 61, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 11:05:19,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:05:19,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:05:20,644 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.918e+02 2.148e+02 2.537e+02 4.144e+02, threshold=4.296e+02, percent-clipped=0.0 2023-09-29 11:05:20,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:05:22,330 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:05:22,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:05:22,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:05:23,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:05:23,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 11:05:23,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:05:29,460 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:05:29,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 11:05:30,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:05:32,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:05:34,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 11:05:34,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:05:35,041 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 11:05:35,340 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=344786.6666666667, ans=0.2 2023-09-29 11:05:36,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 11:05:36,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:05:39,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 11:05:39,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:05:39,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 11:05:42,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 11:05:42,647 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.66 vs. limit=10.0 2023-09-29 11:05:46,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:05:48,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:05:48,335 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:05:48,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:05:52,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:05:55,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:05:56,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:05:56,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:05:58,331 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:06:03,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:06:03,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:06:11,337 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.16 vs. limit=15.0 2023-09-29 11:06:12,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 11:06:14,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:06:24,236 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:06:28,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:06:28,984 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=344986.6666666667, ans=0.1 2023-09-29 11:06:30,168 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 11:06:30,239 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 11:06:30,259 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:06:33,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 11:06:34,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:06:34,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 11:06:42,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:06:42,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 11:06:43,927 INFO [train.py:1039] (2/4) Epoch 10, batch 3950, loss[loss=0.2111, simple_loss=0.2762, pruned_loss=0.07298, over 23533.00 frames. ], tot_loss[loss=0.205, simple_loss=0.274, pruned_loss=0.06797, over 4701440.94 frames. ], batch size: 134, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 11:06:44,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:06:46,174 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.33 vs. limit=6.0 2023-09-29 11:06:47,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:06:48,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:06:58,410 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 11:06:59,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:06:59,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 11:07:00,590 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 11:07:00,629 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:07:03,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:07:03,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:07:03,939 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:07:07,013 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 11:07:08,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:07:10,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:07:10,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:07:10,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:07:10,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:07:22,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:07:22,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:07:29,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 11:07:29,514 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=345186.6666666667, ans=0.1 2023-09-29 11:07:31,713 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.30 vs. limit=12.0 2023-09-29 11:07:34,689 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 11:07:34,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 11:07:36,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:07:37,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:07:39,402 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=345253.3333333333, ans=0.125 2023-09-29 11:07:42,568 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=345253.3333333333, ans=0.125 2023-09-29 11:07:45,873 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=345253.3333333333, ans=0.0 2023-09-29 11:07:45,883 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=345253.3333333333, ans=0.1 2023-09-29 11:07:47,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 11:07:47,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 11:07:47,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:07:47,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:07:47,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 11:07:50,859 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.76 vs. limit=12.0 2023-09-29 11:07:56,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:07:56,403 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=345320.0, ans=0.125 2023-09-29 11:07:57,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:07:59,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 11:08:02,537 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.76 vs. limit=10.0 2023-09-29 11:08:07,683 INFO [train.py:1039] (2/4) Epoch 10, batch 4000, loss[loss=0.2213, simple_loss=0.2823, pruned_loss=0.08015, over 23825.00 frames. ], tot_loss[loss=0.2053, simple_loss=0.2744, pruned_loss=0.06806, over 4711433.17 frames. ], batch size: 164, lr: 1.03e-02, grad_scale: 16.0 2023-09-29 11:08:09,120 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 2.046e+02 2.407e+02 2.777e+02 6.014e+02, threshold=4.814e+02, percent-clipped=1.0 2023-09-29 11:08:10,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:08:17,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:08:22,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:08:23,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:08:23,957 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:08:23,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 11:08:25,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 11:08:26,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 11:08:26,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:08:26,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 11:08:31,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:08:34,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:08:34,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:08:34,168 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:08:34,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:08:34,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 11:08:37,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:08:39,359 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 11:08:40,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:08:41,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:08:43,971 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 11:08:44,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 11:08:44,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:08:52,192 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 11:08:52,417 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=345520.0, ans=0.0 2023-09-29 11:08:53,676 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:08:55,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:08:55,754 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=345586.6666666667, ans=0.125 2023-09-29 11:08:56,864 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 11:08:58,430 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:08:58,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 11:08:58,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:09:00,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:09:01,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:09:03,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:09:03,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 11:09:04,235 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=345586.6666666667, ans=0.2 2023-09-29 11:09:05,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:09:08,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 11:09:08,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:09:10,480 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 11:09:14,406 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=345653.3333333333, ans=0.125 2023-09-29 11:09:16,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:09:18,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 11:09:20,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:09:21,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:09:23,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:09:24,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:09:28,455 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=345720.0, ans=0.125 2023-09-29 11:09:29,484 INFO [train.py:1039] (2/4) Epoch 10, batch 4050, loss[loss=0.2145, simple_loss=0.2854, pruned_loss=0.07179, over 23935.00 frames. ], tot_loss[loss=0.2057, simple_loss=0.2754, pruned_loss=0.06801, over 4724030.51 frames. ], batch size: 86, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:09:29,671 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:09:32,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 11:09:34,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 11:09:37,212 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:09:37,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:09:38,772 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:09:38,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 11:09:40,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:09:40,905 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=345720.0, ans=0.0 2023-09-29 11:09:44,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:09:49,212 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:09:49,283 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 11:09:50,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:09:50,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:09:55,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:09:57,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 11:09:57,156 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=345786.6666666667, ans=0.125 2023-09-29 11:10:00,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 11:10:02,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 11:10:03,760 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 11:10:05,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:10:10,333 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=345853.3333333333, ans=0.125 2023-09-29 11:10:11,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 11:10:11,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:10:16,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:10:20,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:10:22,247 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:10:22,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:10:25,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:10:25,725 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=345920.0, ans=0.125 2023-09-29 11:10:28,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 11:10:28,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 11:10:29,849 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:10:32,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 11:10:36,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:10:44,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 11:10:45,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:10:45,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:10:47,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 11:10:47,228 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 11:10:47,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:10:50,225 INFO [train.py:1039] (2/4) Epoch 10, batch 4100, loss[loss=0.2045, simple_loss=0.288, pruned_loss=0.06044, over 24478.00 frames. ], tot_loss[loss=0.2063, simple_loss=0.2757, pruned_loss=0.06847, over 4726800.09 frames. ], batch size: 69, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:10:50,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:10:52,478 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.639e+02 1.996e+02 2.168e+02 2.450e+02 3.987e+02, threshold=4.335e+02, percent-clipped=0.0 2023-09-29 11:10:52,591 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:10:52,637 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:10:57,539 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.98 vs. limit=10.0 2023-09-29 11:11:01,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 11:11:04,421 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 11:11:06,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 11:11:08,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 11:11:08,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:11:09,641 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:11:09,697 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:11:09,722 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:11:11,335 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 11:11:14,338 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:11:14,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:11:15,802 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:11:17,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:11:17,693 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=346120.0, ans=0.1 2023-09-29 11:11:20,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 11:11:21,808 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:11:23,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:11:23,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 11:11:23,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:11:23,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:11:23,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:11:23,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:11:24,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 11:11:26,676 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:11:28,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 11:11:29,285 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=346186.6666666667, ans=0.125 2023-09-29 11:11:30,560 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:11:32,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:11:32,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 11:11:34,480 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=346186.6666666667, ans=0.125 2023-09-29 11:11:35,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:11:35,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:11:37,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:11:38,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 11:11:40,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:11:41,132 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 11:11:41,548 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=346253.3333333333, ans=0.125 2023-09-29 11:11:44,087 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 11:11:44,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:11:45,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 11:11:48,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:11:53,422 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:11:56,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:11:58,617 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:12:06,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:12:06,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:12:09,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:12:09,579 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=346320.0, ans=0.1 2023-09-29 11:12:12,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:12:13,807 INFO [train.py:1039] (2/4) Epoch 10, batch 4150, loss[loss=0.1887, simple_loss=0.2718, pruned_loss=0.05287, over 24478.00 frames. ], tot_loss[loss=0.2068, simple_loss=0.2762, pruned_loss=0.06865, over 4722809.73 frames. ], batch size: 66, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:12:17,589 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 11:12:19,120 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 11:12:19,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:12:19,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:12:23,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 11:12:23,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:12:23,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 11:12:23,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 11:12:24,056 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=346386.6666666667, ans=0.125 2023-09-29 11:12:24,745 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=8.37 vs. limit=10.0 2023-09-29 11:12:25,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 11:12:26,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:12:31,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:12:31,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:12:36,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:12:38,510 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:12:38,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 11:12:40,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 11:12:40,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:12:40,978 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.max_positive, batch_count=346453.3333333333, ans=0.95 2023-09-29 11:12:42,346 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 11:12:42,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=346453.3333333333, ans=0.0 2023-09-29 11:12:45,762 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=346520.0, ans=0.125 2023-09-29 11:12:46,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:12:52,202 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:12:52,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 11:12:54,404 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.40 vs. limit=22.5 2023-09-29 11:12:56,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 11:12:56,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:12:56,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 11:12:56,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:12:56,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:13:01,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:13:01,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:13:04,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 11:13:06,993 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 11:13:08,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:13:10,579 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 11:13:11,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:13:13,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 11:13:15,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:13:18,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:13:18,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:13:19,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 11:13:19,900 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:13:19,904 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 11:13:21,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 11:13:28,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 11:13:29,016 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:13:29,022 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 11:13:29,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 11:13:30,507 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 11:13:31,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:13:31,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 11:13:31,996 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:13:33,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:13:33,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 11:13:33,926 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=346653.3333333333, ans=0.95 2023-09-29 11:13:34,230 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.89 vs. limit=15.0 2023-09-29 11:13:35,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 11:13:39,464 INFO [train.py:1039] (2/4) Epoch 10, batch 4200, loss[loss=0.201, simple_loss=0.2719, pruned_loss=0.06507, over 20985.00 frames. ], tot_loss[loss=0.2058, simple_loss=0.275, pruned_loss=0.0683, over 4723742.70 frames. ], batch size: 45, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:13:39,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:13:41,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 11:13:42,748 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 2.147e+02 2.478e+02 3.007e+02 3.865e+02, threshold=4.955e+02, percent-clipped=0.0 2023-09-29 11:13:42,988 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 11:13:46,600 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:13:48,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:13:48,795 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=7.14 vs. limit=15.0 2023-09-29 11:13:49,515 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:13:49,518 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:13:49,835 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=346720.0, ans=0.125 2023-09-29 11:13:51,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 11:13:51,951 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 11:13:53,484 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=346720.0, ans=0.2 2023-09-29 11:13:54,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 11:13:54,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:13:56,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:13:59,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:14:03,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 11:14:04,841 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=346786.6666666667, ans=0.125 2023-09-29 11:14:04,890 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=346786.6666666667, ans=0.1 2023-09-29 11:14:04,993 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=346786.6666666667, ans=0.04949747468305833 2023-09-29 11:14:06,089 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:14:06,134 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:14:07,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 11:14:07,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:14:07,766 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=346786.6666666667, ans=0.125 2023-09-29 11:14:09,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:14:10,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:14:10,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:14:12,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:14:15,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 11:14:15,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:14:20,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 11:14:20,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:14:22,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:14:23,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:14:25,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:14:25,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 11:14:27,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:14:27,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:14:32,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 11:14:35,523 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:14:40,225 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=346920.0, ans=0.1 2023-09-29 11:14:42,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 11:14:44,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 11:14:47,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:14:53,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 11:14:54,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:14:55,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 11:14:58,371 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=346986.6666666667, ans=0.0 2023-09-29 11:15:01,580 INFO [train.py:1039] (2/4) Epoch 10, batch 4250, loss[loss=0.1787, simple_loss=0.2497, pruned_loss=0.0539, over 22423.00 frames. ], tot_loss[loss=0.2036, simple_loss=0.2727, pruned_loss=0.06727, over 4727984.08 frames. ], batch size: 49, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:15:03,129 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 11:15:06,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:15:07,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 11:15:08,125 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=347053.3333333333, ans=0.0 2023-09-29 11:15:10,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:15:14,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 11:15:14,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 11:15:14,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:15:17,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:15:22,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:15:26,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:15:28,314 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:15:29,924 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:15:29,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:15:31,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:15:33,012 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:15:35,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:15:38,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:15:39,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:15:41,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 11:15:43,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 11:15:43,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:15:45,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:15:45,228 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:15:46,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:15:46,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:15:48,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:15:51,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 11:15:51,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:15:56,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:15:58,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:15:59,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 11:15:59,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:16:01,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 11:16:03,430 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 11:16:05,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 11:16:06,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:16:06,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:16:10,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 11:16:11,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 11:16:11,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 11:16:15,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:16:18,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:16:21,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:16:22,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:16:24,807 INFO [train.py:1039] (2/4) Epoch 10, batch 4300, loss[loss=0.1988, simple_loss=0.2846, pruned_loss=0.05647, over 24326.00 frames. ], tot_loss[loss=0.203, simple_loss=0.272, pruned_loss=0.06706, over 4725078.27 frames. ], batch size: 74, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:16:24,932 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:16:26,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:16:27,862 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.628e+02 1.974e+02 2.213e+02 2.611e+02 3.799e+02, threshold=4.426e+02, percent-clipped=0.0 2023-09-29 11:16:27,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:16:27,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 11:16:29,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:16:32,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:16:34,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:16:38,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:16:39,349 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.71 vs. limit=10.0 2023-09-29 11:16:42,082 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=347453.3333333333, ans=0.125 2023-09-29 11:16:44,328 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=347453.3333333333, ans=0.125 2023-09-29 11:16:45,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:16:45,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 11:16:48,203 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:16:49,814 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:16:50,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:16:50,541 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 11:16:52,130 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=347453.3333333333, ans=0.125 2023-09-29 11:16:54,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 11:16:56,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 11:16:59,582 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 11:16:59,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:16:59,672 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 11:17:02,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 11:17:04,420 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:17:07,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:17:07,401 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:17:09,632 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:17:11,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:17:11,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:17:13,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 11:17:13,388 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 11:17:15,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:17:18,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:17:18,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 11:17:18,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:17:18,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:17:18,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 11:17:18,750 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 11:17:20,185 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 11:17:20,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:17:20,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 11:17:20,668 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=347586.6666666667, ans=0.0 2023-09-29 11:17:21,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 11:17:25,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:17:25,739 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=347586.6666666667, ans=0.0 2023-09-29 11:17:27,096 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 11:17:28,549 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:17:28,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:17:28,796 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:17:31,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 11:17:33,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 11:17:33,369 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:17:33,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:17:33,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:17:33,591 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:17:35,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:17:38,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:17:40,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:17:40,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:17:47,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 11:17:48,416 INFO [train.py:1039] (2/4) Epoch 10, batch 4350, loss[loss=0.2406, simple_loss=0.3137, pruned_loss=0.08373, over 24038.00 frames. ], tot_loss[loss=0.2039, simple_loss=0.2728, pruned_loss=0.06746, over 4721309.97 frames. ], batch size: 80, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:17:48,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 11:17:53,726 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:17:57,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:17:59,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:17:59,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:17:59,236 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=347720.0, ans=0.2 2023-09-29 11:18:02,430 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=347720.0, ans=0.1 2023-09-29 11:18:05,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:18:05,315 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=347786.6666666667, ans=0.1 2023-09-29 11:18:08,378 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:18:08,625 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=347786.6666666667, ans=0.125 2023-09-29 11:18:11,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:18:11,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:18:16,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:18:19,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:18:21,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 11:18:27,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 11:18:28,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:18:28,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:18:35,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:18:36,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 11:18:38,501 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=347920.0, ans=0.1 2023-09-29 11:18:39,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:18:41,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 11:18:45,970 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 11:18:46,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:18:47,530 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 11:18:47,667 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 11:18:49,076 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 11:18:49,084 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:18:50,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:18:50,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:18:50,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:18:52,851 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:18:52,929 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:18:55,973 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 11:18:55,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:18:55,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:18:56,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:18:56,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 11:18:57,972 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 11:18:59,918 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 11:18:59,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 11:19:03,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:19:04,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:19:04,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:19:06,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:19:06,669 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=347986.6666666667, ans=0.125 2023-09-29 11:19:08,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 11:19:09,535 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 11:19:09,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:19:10,992 INFO [train.py:1039] (2/4) Epoch 10, batch 4400, loss[loss=0.2048, simple_loss=0.2847, pruned_loss=0.06243, over 24453.00 frames. ], tot_loss[loss=0.2044, simple_loss=0.2736, pruned_loss=0.06758, over 4728760.58 frames. ], batch size: 77, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:19:14,005 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.629e+02 2.070e+02 2.289e+02 2.714e+02 5.548e+02, threshold=4.577e+02, percent-clipped=2.0 2023-09-29 11:19:14,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:19:14,181 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:19:14,450 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=348053.3333333333, ans=0.0 2023-09-29 11:19:17,052 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:19:18,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 11:19:18,775 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 11:19:20,186 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 11:19:20,229 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 11:19:21,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 11:19:21,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:19:24,701 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 11:19:26,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:19:28,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:19:28,841 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 11:19:29,276 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=348120.0, ans=0.2 2023-09-29 11:19:30,543 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:19:30,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 11:19:30,603 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 11:19:35,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 11:19:36,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 11:19:37,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 11:19:37,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:19:38,722 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:19:38,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:19:40,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:19:42,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 11:19:42,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 11:19:43,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:19:46,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:19:46,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:19:48,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:19:48,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:19:48,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 11:19:48,513 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 11:19:51,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:19:58,288 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:20:01,833 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 11:20:04,937 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:20:06,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:20:10,213 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:20:10,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 11:20:10,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:20:10,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:20:10,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:20:11,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 11:20:15,822 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=348320.0, ans=0.2 2023-09-29 11:20:16,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 11:20:18,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 11:20:18,680 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=348320.0, ans=0.0 2023-09-29 11:20:19,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 11:20:19,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:20:19,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 11:20:20,163 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=348320.0, ans=0.0 2023-09-29 11:20:21,470 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:20:25,901 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:20:28,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 11:20:32,540 INFO [train.py:1039] (2/4) Epoch 10, batch 4450, loss[loss=0.1745, simple_loss=0.2502, pruned_loss=0.04944, over 24611.00 frames. ], tot_loss[loss=0.204, simple_loss=0.2734, pruned_loss=0.06729, over 4733438.09 frames. ], batch size: 60, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:20:32,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:20:35,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:20:35,402 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=348386.6666666667, ans=0.125 2023-09-29 11:20:36,659 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 11:20:36,963 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=348386.6666666667, ans=0.09899494936611666 2023-09-29 11:20:44,760 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:20:44,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:20:49,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:20:52,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:20:55,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:20:55,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:20:56,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 11:20:56,272 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=348453.3333333333, ans=0.125 2023-09-29 11:20:57,536 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:20:59,019 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:20:59,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:20:59,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:21:00,837 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=348453.3333333333, ans=0.125 2023-09-29 11:21:02,043 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 11:21:06,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:21:06,814 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:21:09,015 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:21:09,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:21:10,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:21:13,052 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=348520.0, ans=0.1 2023-09-29 11:21:15,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 11:21:16,717 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=10.32 vs. limit=15.0 2023-09-29 11:21:17,379 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 11:21:17,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 11:21:17,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:21:17,617 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=348520.0, ans=0.1 2023-09-29 11:21:19,090 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=348520.0, ans=0.125 2023-09-29 11:21:22,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:21:23,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 11:21:26,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 11:21:32,039 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:21:33,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 11:21:33,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:21:33,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:21:33,595 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:21:33,607 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:21:35,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:21:38,298 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 11:21:39,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 11:21:41,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 11:21:43,590 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:21:43,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:21:47,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:21:47,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 11:21:48,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 11:21:51,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 11:21:53,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:21:54,838 INFO [train.py:1039] (2/4) Epoch 10, batch 4500, loss[loss=0.1996, simple_loss=0.2755, pruned_loss=0.06188, over 24514.00 frames. ], tot_loss[loss=0.2046, simple_loss=0.2746, pruned_loss=0.06735, over 4736453.03 frames. ], batch size: 66, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:21:58,628 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.956e+02 2.459e+02 2.945e+02 4.663e+02, threshold=4.917e+02, percent-clipped=1.0 2023-09-29 11:21:58,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:21:59,816 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.83 vs. limit=15.0 2023-09-29 11:22:00,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 11:22:00,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 11:22:01,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:22:02,283 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=348720.0, ans=0.125 2023-09-29 11:22:08,408 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:22:08,497 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:22:09,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 11:22:10,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:22:10,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:22:11,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:22:22,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:22:23,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:22:25,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:22:27,049 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:22:28,623 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:22:36,047 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 11:22:40,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:22:45,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 11:22:45,708 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=348920.0, ans=0.125 2023-09-29 11:22:47,813 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=348920.0, ans=0.125 2023-09-29 11:22:49,010 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:22:50,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 11:22:51,125 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.46 vs. limit=22.5 2023-09-29 11:22:52,031 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:22:52,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:22:53,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:22:55,032 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:22:56,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:22:56,809 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 11:22:56,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 11:22:56,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:22:59,109 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=348920.0, ans=0.0 2023-09-29 11:23:03,008 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=7.70 vs. limit=15.0 2023-09-29 11:23:03,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:23:03,475 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:23:07,260 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:23:10,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 11:23:10,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:23:12,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 11:23:12,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 11:23:12,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 11:23:17,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 11:23:18,492 INFO [train.py:1039] (2/4) Epoch 10, batch 4550, loss[loss=0.1756, simple_loss=0.2524, pruned_loss=0.04938, over 24345.00 frames. ], tot_loss[loss=0.2041, simple_loss=0.2731, pruned_loss=0.06752, over 4724356.48 frames. ], batch size: 56, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:23:20,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 11:23:20,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:23:23,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:23:25,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:23:25,736 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=349053.3333333333, ans=0.1 2023-09-29 11:23:28,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:23:32,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:23:35,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:23:38,589 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:23:38,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:23:38,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:23:41,576 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:23:43,615 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:23:45,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:23:46,500 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.29 vs. limit=15.0 2023-09-29 11:23:49,013 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 11:23:50,720 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 11:23:52,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:23:53,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 11:23:56,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 11:23:59,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:24:02,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 11:24:05,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 11:24:07,585 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.40 vs. limit=15.0 2023-09-29 11:24:08,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:24:08,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:24:08,298 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 11:24:10,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 11:24:13,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:24:15,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:24:15,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:24:16,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:24:17,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 11:24:17,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 11:24:19,073 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:24:19,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 11:24:20,746 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 11:24:22,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:24:23,725 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:24:23,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:24:25,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:24:25,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:24:26,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 11:24:28,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 11:24:30,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:24:30,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 11:24:32,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 11:24:32,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:24:32,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 11:24:32,703 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=349320.0, ans=0.125 2023-09-29 11:24:35,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:24:35,439 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:24:37,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:24:38,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:24:38,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 11:24:39,406 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.49 vs. limit=15.0 2023-09-29 11:24:40,179 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=349386.6666666667, ans=0.0 2023-09-29 11:24:41,256 INFO [train.py:1039] (2/4) Epoch 10, batch 4600, loss[loss=0.2164, simple_loss=0.2889, pruned_loss=0.07192, over 23770.00 frames. ], tot_loss[loss=0.2033, simple_loss=0.272, pruned_loss=0.06728, over 4721825.86 frames. ], batch size: 85, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:24:41,374 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:24:44,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 11:24:45,377 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=349386.6666666667, ans=0.0 2023-09-29 11:24:46,324 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.915e+02 2.143e+02 2.405e+02 4.065e+02, threshold=4.286e+02, percent-clipped=0.0 2023-09-29 11:24:46,759 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=349386.6666666667, ans=0.1 2023-09-29 11:24:47,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:24:48,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:24:51,233 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:24:51,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:24:51,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:24:53,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 11:24:55,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:24:57,219 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.80 vs. limit=15.0 2023-09-29 11:24:59,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:24:59,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:25:01,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:25:04,557 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=349453.3333333333, ans=0.125 2023-09-29 11:25:09,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 11:25:12,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:25:15,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:25:16,022 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=349520.0, ans=0.0 2023-09-29 11:25:17,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:25:17,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:25:24,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 11:25:24,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 11:25:24,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:25:29,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:25:29,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:25:31,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:25:33,386 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.18 vs. limit=15.0 2023-09-29 11:25:37,297 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 11:25:38,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 11:25:42,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:25:45,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:25:48,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:25:48,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 11:25:48,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:25:48,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 11:25:50,116 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:25:50,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:25:51,742 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:25:51,847 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:25:53,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:25:53,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 11:25:55,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 11:25:55,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 11:25:55,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:25:55,594 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=349653.3333333333, ans=0.07 2023-09-29 11:25:56,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:25:59,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:25:59,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:26:03,492 INFO [train.py:1039] (2/4) Epoch 10, batch 4650, loss[loss=0.227, simple_loss=0.2985, pruned_loss=0.07773, over 23478.00 frames. ], tot_loss[loss=0.2034, simple_loss=0.2716, pruned_loss=0.06756, over 4715248.77 frames. ], batch size: 93, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:26:09,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:26:12,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:26:12,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:26:12,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:26:14,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:26:14,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:26:14,411 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:26:18,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 11:26:21,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:26:22,813 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 11:26:24,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:26:24,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 11:26:24,357 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:26:25,744 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 11:26:25,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 11:26:25,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:26:26,027 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 11:26:27,841 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:26:31,380 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:26:32,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:26:32,996 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 11:26:35,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:26:35,397 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=349853.3333333333, ans=0.0 2023-09-29 11:26:36,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 11:26:39,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:26:40,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:26:41,073 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 11:26:42,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:26:45,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:26:51,108 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=349920.0, ans=0.1 2023-09-29 11:26:52,317 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:26:55,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:26:58,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:27:00,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:27:02,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:27:02,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 11:27:03,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 11:27:03,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 11:27:03,917 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 11:27:06,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:27:12,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:27:12,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:27:14,605 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 11:27:14,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:27:16,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:27:16,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:27:17,744 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:27:18,046 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=349986.6666666667, ans=0.125 2023-09-29 11:27:20,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:27:20,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:27:22,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:27:25,778 INFO [train.py:1039] (2/4) Epoch 10, batch 4700, loss[loss=0.2106, simple_loss=0.2816, pruned_loss=0.06981, over 23703.00 frames. ], tot_loss[loss=0.2034, simple_loss=0.2722, pruned_loss=0.06732, over 4713031.44 frames. ], batch size: 149, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:27:25,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:27:26,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:27:26,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 11:27:27,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 11:27:27,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 11:27:29,346 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 11:27:30,693 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.991e+02 2.178e+02 2.659e+02 4.780e+02, threshold=4.356e+02, percent-clipped=1.0 2023-09-29 11:27:36,311 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=350053.3333333333, ans=0.125 2023-09-29 11:27:39,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:27:39,774 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=350053.3333333333, ans=0.2 2023-09-29 11:27:40,768 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:27:40,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:27:43,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:27:44,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 11:27:48,113 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=350120.0, ans=0.0 2023-09-29 11:27:49,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 11:27:49,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 11:27:52,531 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:27:54,048 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:27:54,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:27:58,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:28:02,397 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=350186.6666666667, ans=0.2 2023-09-29 11:28:06,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 11:28:08,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 11:28:10,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:28:16,042 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=350253.3333333333, ans=0.125 2023-09-29 11:28:17,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 11:28:18,660 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:28:20,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:28:24,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 11:28:27,006 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:28:32,944 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:28:33,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 11:28:33,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:28:33,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:28:36,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:28:36,414 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:28:36,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 11:28:38,934 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 11:28:39,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:28:40,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:28:40,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:28:40,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 11:28:42,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:28:47,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 11:28:49,339 INFO [train.py:1039] (2/4) Epoch 10, batch 4750, loss[loss=0.2044, simple_loss=0.281, pruned_loss=0.0639, over 24011.00 frames. ], tot_loss[loss=0.2058, simple_loss=0.2744, pruned_loss=0.06856, over 4704098.04 frames. ], batch size: 80, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:28:51,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:28:52,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:28:56,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:28:56,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:28:59,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 11:28:59,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:29:03,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 11:29:04,385 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 11:29:05,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:29:05,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:29:05,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:29:12,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 11:29:18,055 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:29:19,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 11:29:19,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:29:22,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:29:22,872 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:29:24,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:29:25,122 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 11:29:25,127 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 11:29:30,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 11:29:32,534 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.55 vs. limit=12.0 2023-09-29 11:29:34,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:29:36,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:29:39,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 11:29:39,147 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 11:29:39,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:29:42,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:29:45,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:29:47,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 11:29:47,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 11:29:47,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:29:48,786 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:29:48,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:29:50,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 11:29:50,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 11:29:55,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 11:29:57,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:29:59,944 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=350653.3333333333, ans=0.2 2023-09-29 11:30:01,054 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:30:01,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 11:30:01,281 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=350653.3333333333, ans=0.1 2023-09-29 11:30:02,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:30:04,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:30:04,208 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:30:06,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:30:06,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 11:30:09,529 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:30:09,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 11:30:09,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 11:30:11,144 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 11:30:12,440 INFO [train.py:1039] (2/4) Epoch 10, batch 4800, loss[loss=0.2192, simple_loss=0.2821, pruned_loss=0.07813, over 23811.00 frames. ], tot_loss[loss=0.2055, simple_loss=0.2747, pruned_loss=0.06812, over 4706014.12 frames. ], batch size: 179, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:30:15,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 11:30:15,370 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:30:16,717 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.911e+02 2.173e+02 2.490e+02 3.366e+02, threshold=4.345e+02, percent-clipped=0.0 2023-09-29 11:30:16,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 11:30:22,129 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:30:23,567 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:30:28,813 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:30:28,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:30:30,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:30:30,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 11:30:31,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:30:31,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:30:35,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:30:39,232 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:30:40,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:30:42,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:30:43,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:30:44,000 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 11:30:44,033 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:30:45,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:30:48,531 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=350853.3333333333, ans=0.125 2023-09-29 11:30:49,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:30:50,264 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=350853.3333333333, ans=0.2 2023-09-29 11:30:51,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:30:53,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:30:53,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 11:30:55,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 11:30:56,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:30:58,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 11:30:58,385 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 11:30:59,889 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:30:59,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:30:59,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:31:00,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:31:00,726 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.35 vs. limit=10.0 2023-09-29 11:31:02,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:31:03,774 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=350920.0, ans=0.125 2023-09-29 11:31:05,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:31:05,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:31:08,702 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=350920.0, ans=0.125 2023-09-29 11:31:09,768 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:31:12,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:31:13,573 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:31:18,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 11:31:20,221 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:31:21,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:31:21,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 11:31:23,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:31:26,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:31:28,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:31:28,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:31:28,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:31:29,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:31:30,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:31:33,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:31:33,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:31:33,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:31:34,447 INFO [train.py:1039] (2/4) Epoch 10, batch 4850, loss[loss=0.2154, simple_loss=0.2671, pruned_loss=0.08184, over 23753.00 frames. ], tot_loss[loss=0.2056, simple_loss=0.2751, pruned_loss=0.06809, over 4711060.17 frames. ], batch size: 232, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:31:34,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 11:31:37,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 11:31:37,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:31:37,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:31:37,746 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:31:37,748 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:31:41,396 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:31:51,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 11:31:51,569 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:31:56,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:31:57,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 11:31:57,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:32:01,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:32:02,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:32:04,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:32:04,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 11:32:07,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:32:09,790 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=351186.6666666667, ans=0.0 2023-09-29 11:32:10,980 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:32:11,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 11:32:11,121 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 11:32:11,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 11:32:14,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:32:15,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:32:18,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:32:18,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 11:32:21,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 11:32:22,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 11:32:26,499 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 11:32:28,107 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=351253.3333333333, ans=0.1 2023-09-29 11:32:29,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:32:30,907 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 11:32:32,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:32:32,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:32:34,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 11:32:36,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 11:32:36,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:32:36,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 11:32:37,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:32:37,806 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:32:39,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 11:32:49,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:32:52,781 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=351320.0, ans=0.125 2023-09-29 11:32:54,061 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:32:54,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:32:57,374 INFO [train.py:1039] (2/4) Epoch 10, batch 4900, loss[loss=0.1704, simple_loss=0.2423, pruned_loss=0.04924, over 24323.00 frames. ], tot_loss[loss=0.2041, simple_loss=0.2737, pruned_loss=0.06721, over 4712271.99 frames. ], batch size: 56, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:33:00,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 11:33:00,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:33:02,130 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.635e+02 2.045e+02 2.293e+02 2.550e+02 3.770e+02, threshold=4.586e+02, percent-clipped=0.0 2023-09-29 11:33:06,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:33:08,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:33:08,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:33:12,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 11:33:17,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 11:33:19,400 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=351453.3333333333, ans=0.0 2023-09-29 11:33:20,954 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=351453.3333333333, ans=0.125 2023-09-29 11:33:22,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 11:33:23,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 11:33:23,743 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 11:33:23,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:33:23,939 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=351453.3333333333, ans=0.2 2023-09-29 11:33:25,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:33:25,190 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:33:25,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:33:25,326 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 11:33:30,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 11:33:30,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 11:33:32,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 11:33:34,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 11:33:35,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:33:35,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:33:37,318 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:33:37,332 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 11:33:38,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 11:33:42,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:33:42,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 11:33:42,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 11:33:45,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 11:33:47,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:33:50,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:33:50,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:33:52,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:33:52,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 11:33:52,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:33:52,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 11:33:54,452 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.46 vs. limit=15.0 2023-09-29 11:33:55,444 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:33:56,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 11:33:58,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:34:01,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 11:34:03,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:34:03,231 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 11:34:04,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 11:34:09,108 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=351653.3333333333, ans=0.1 2023-09-29 11:34:11,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:34:13,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:34:14,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 11:34:15,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 11:34:15,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:34:17,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:34:20,041 INFO [train.py:1039] (2/4) Epoch 10, batch 4950, loss[loss=0.2107, simple_loss=0.2842, pruned_loss=0.0686, over 23919.00 frames. ], tot_loss[loss=0.2033, simple_loss=0.2731, pruned_loss=0.06679, over 4719589.92 frames. ], batch size: 80, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:34:20,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:34:20,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:34:22,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:34:22,342 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 11:34:23,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 11:34:25,530 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:34:25,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 11:34:28,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 11:34:30,171 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 11:34:30,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 11:34:31,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 11:34:31,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:34:31,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:34:31,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:34:32,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:34:34,733 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:34:36,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:34:37,663 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:34:38,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:34:39,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:34:41,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:34:44,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 11:34:48,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:34:49,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:34:52,493 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:34:52,580 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:34:54,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:34:55,690 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=6.03 vs. limit=10.0 2023-09-29 11:34:56,320 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 11:34:57,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 11:34:59,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:35:00,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:35:00,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:35:03,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:35:03,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:35:05,365 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 11:35:07,134 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=351920.0, ans=0.125 2023-09-29 11:35:08,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:35:10,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 11:35:11,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:35:13,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:35:13,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:35:15,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 11:35:15,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:35:17,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:35:21,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:35:23,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:35:23,382 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:35:23,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:35:24,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:35:26,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:35:28,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:35:28,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:35:30,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:35:31,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 11:35:35,360 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2.whitening_limit, batch_count=351986.6666666667, ans=15.0 2023-09-29 11:35:36,317 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:35:41,007 INFO [train.py:1039] (2/4) Epoch 10, batch 5000, loss[loss=0.1857, simple_loss=0.2519, pruned_loss=0.05981, over 24466.00 frames. ], tot_loss[loss=0.2025, simple_loss=0.272, pruned_loss=0.06645, over 4703005.85 frames. ], batch size: 58, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:35:41,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 11:35:41,128 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 11:35:46,403 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:35:47,803 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 2.017e+02 2.302e+02 2.737e+02 4.823e+02, threshold=4.603e+02, percent-clipped=1.0 2023-09-29 11:35:47,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 11:35:48,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=352053.3333333333, ans=0.125 2023-09-29 11:35:49,965 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 11:35:51,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 11:35:53,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:35:54,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 11:35:54,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:35:54,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 11:35:56,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 11:35:57,808 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:35:59,277 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:36:01,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 11:36:01,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:36:01,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:36:02,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 11:36:02,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 11:36:03,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:36:03,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 11:36:03,182 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 11:36:03,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:36:04,728 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 11:36:04,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 11:36:04,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 11:36:06,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 11:36:06,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:36:07,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:36:09,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 11:36:09,540 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 11:36:11,135 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:36:12,635 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:36:12,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 11:36:14,363 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 11:36:15,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:36:16,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:36:19,707 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=352186.6666666667, ans=0.2 2023-09-29 11:36:21,610 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 11:36:24,598 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:36:26,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:36:26,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:36:32,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 11:36:32,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:36:32,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:36:32,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:36:34,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 11:36:35,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:36:37,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:36:38,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:36:43,499 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=352253.3333333333, ans=0.1 2023-09-29 11:36:44,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 11:36:49,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:36:58,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:37:00,354 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:37:00,366 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:37:00,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:37:01,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 11:37:01,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 11:37:03,168 INFO [train.py:1039] (2/4) Epoch 10, batch 5050, loss[loss=0.1923, simple_loss=0.2766, pruned_loss=0.05404, over 24646.00 frames. ], tot_loss[loss=0.2031, simple_loss=0.273, pruned_loss=0.06664, over 4705035.88 frames. ], batch size: 68, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:37:03,313 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:37:08,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:37:08,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 11:37:10,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:37:12,662 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.34 vs. limit=12.0 2023-09-29 11:37:13,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:37:16,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:37:16,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 11:37:17,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:37:17,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:37:20,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 11:37:22,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:37:22,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 11:37:27,213 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.max_abs, batch_count=352453.3333333333, ans=10.0 2023-09-29 11:37:34,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 11:37:36,014 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 11:37:36,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:37:36,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 11:37:36,263 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:37:38,943 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.58 vs. limit=22.5 2023-09-29 11:37:39,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:37:39,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:37:41,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:37:41,340 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 11:37:42,900 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 11:37:44,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:37:46,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:37:49,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:37:51,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 11:37:52,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:37:55,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 11:37:57,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 11:37:57,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:37:58,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:37:58,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:38:00,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:38:02,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:38:03,061 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.88 vs. limit=22.5 2023-09-29 11:38:03,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:38:03,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:38:04,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:38:04,685 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=352586.6666666667, ans=0.0 2023-09-29 11:38:05,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 11:38:07,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:38:09,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:38:12,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:38:12,592 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 11:38:12,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 11:38:14,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:38:14,202 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:38:14,250 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 11:38:17,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:38:17,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 11:38:17,926 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:38:21,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:38:23,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:38:23,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 11:38:24,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 11:38:26,307 INFO [train.py:1039] (2/4) Epoch 10, batch 5100, loss[loss=0.1971, simple_loss=0.284, pruned_loss=0.05509, over 24352.00 frames. ], tot_loss[loss=0.2042, simple_loss=0.2739, pruned_loss=0.06723, over 4692346.04 frames. ], batch size: 74, lr: 1.01e-02, grad_scale: 8.0 2023-09-29 11:38:26,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:38:27,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:38:28,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:38:31,046 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 11:38:32,351 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 1.939e+02 2.293e+02 2.682e+02 4.893e+02, threshold=4.586e+02, percent-clipped=1.0 2023-09-29 11:38:34,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:38:37,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 11:38:39,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 11:38:41,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:38:42,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:38:44,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:38:44,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 11:38:44,439 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 11:38:46,157 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=352786.6666666667, ans=0.125 2023-09-29 11:38:50,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:38:50,584 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:38:53,340 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.96 vs. limit=12.0 2023-09-29 11:38:57,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:38:59,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 11:38:59,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:39:01,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:39:01,554 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=352853.3333333333, ans=0.1 2023-09-29 11:39:02,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 11:39:05,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:39:06,017 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:39:06,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 11:39:07,659 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 11:39:09,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:39:09,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 11:39:09,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 11:39:09,582 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=352853.3333333333, ans=0.125 2023-09-29 11:39:15,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:39:15,533 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=352920.0, ans=0.125 2023-09-29 11:39:21,409 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:39:22,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 11:39:23,003 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 11:39:23,018 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 11:39:24,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 11:39:24,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:39:29,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 11:39:33,623 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 11:39:36,769 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=352986.6666666667, ans=0.125 2023-09-29 11:39:37,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 11:39:38,260 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=352986.6666666667, ans=0.1 2023-09-29 11:39:39,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:39:41,081 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 11:39:44,129 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 11:39:44,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 11:39:44,525 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=352986.6666666667, ans=0.0 2023-09-29 11:39:49,545 INFO [train.py:1039] (2/4) Epoch 10, batch 5150, loss[loss=0.1916, simple_loss=0.2747, pruned_loss=0.05429, over 24616.00 frames. ], tot_loss[loss=0.2063, simple_loss=0.276, pruned_loss=0.06833, over 4701359.38 frames. ], batch size: 68, lr: 1.01e-02, grad_scale: 8.0 2023-09-29 11:39:49,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:39:50,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:39:50,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:39:51,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:39:51,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 11:39:51,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:39:52,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 11:39:52,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 11:39:54,299 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 11:39:54,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:39:54,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 11:39:55,864 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:39:55,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 11:39:57,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:39:58,973 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:40:05,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 11:40:06,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 11:40:07,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:40:07,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 11:40:07,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 11:40:07,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:40:09,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:40:09,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:40:09,363 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:40:10,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 11:40:11,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:40:12,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:40:12,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 11:40:12,935 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=353120.0, ans=0.125 2023-09-29 11:40:15,576 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 11:40:15,763 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=353120.0, ans=0.07 2023-09-29 11:40:17,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:40:24,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:40:24,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 11:40:28,977 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:40:31,196 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.91 vs. limit=22.5 2023-09-29 11:40:32,202 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=353186.6666666667, ans=0.025 2023-09-29 11:40:34,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:40:34,538 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=353186.6666666667, ans=0.125 2023-09-29 11:40:34,589 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=353186.6666666667, ans=0.125 2023-09-29 11:40:35,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:40:39,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:40:41,231 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:40:43,588 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.59 vs. limit=15.0 2023-09-29 11:40:44,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 11:40:49,010 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:40:51,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:40:51,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:40:54,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:40:55,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:40:57,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 11:41:00,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:41:01,205 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=353320.0, ans=0.0 2023-09-29 11:41:02,516 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 11:41:04,193 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:41:04,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:41:05,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 11:41:05,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 11:41:05,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:41:05,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:41:09,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:41:11,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:41:13,142 INFO [train.py:1039] (2/4) Epoch 10, batch 5200, loss[loss=0.1967, simple_loss=0.2738, pruned_loss=0.05983, over 23349.00 frames. ], tot_loss[loss=0.2078, simple_loss=0.2769, pruned_loss=0.06933, over 4695278.92 frames. ], batch size: 93, lr: 1.01e-02, grad_scale: 16.0 2023-09-29 11:41:14,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:41:19,173 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 2.032e+02 2.395e+02 2.917e+02 4.034e+02, threshold=4.790e+02, percent-clipped=0.0 2023-09-29 11:41:19,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 11:41:19,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:41:20,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:41:22,628 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.26 vs. limit=15.0 2023-09-29 11:41:24,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:41:26,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:41:26,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:41:27,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 11:41:30,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 11:41:31,400 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.09 vs. limit=15.0 2023-09-29 11:41:32,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:41:35,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 11:41:36,922 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=353453.3333333333, ans=0.125 2023-09-29 11:41:38,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 11:41:39,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:41:40,237 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=353453.3333333333, ans=0.125 2023-09-29 11:41:41,297 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 11:41:41,376 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 11:41:42,340 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=7.25 vs. limit=15.0 2023-09-29 11:41:44,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 11:41:44,611 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=353520.0, ans=0.1 2023-09-29 11:41:46,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:41:46,369 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 11:41:46,379 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:41:47,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:41:48,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:41:48,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 11:41:48,189 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=353520.0, ans=0.125 2023-09-29 11:41:49,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:41:49,923 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=353520.0, ans=0.0 2023-09-29 11:41:53,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:41:56,203 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 11:41:56,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 11:41:56,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 11:42:03,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 11:42:04,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 11:42:07,886 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=353586.6666666667, ans=0.0 2023-09-29 11:42:09,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 11:42:09,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:42:10,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 11:42:10,971 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:42:12,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 11:42:12,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:42:12,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 11:42:15,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:42:16,412 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=8.17 vs. limit=15.0 2023-09-29 11:42:17,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:42:20,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:42:22,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:42:22,194 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:42:25,990 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.23 vs. limit=22.5 2023-09-29 11:42:28,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:42:30,347 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 11:42:32,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:42:32,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:42:34,452 INFO [train.py:1039] (2/4) Epoch 10, batch 5250, loss[loss=0.2185, simple_loss=0.2937, pruned_loss=0.07167, over 24441.00 frames. ], tot_loss[loss=0.2062, simple_loss=0.2754, pruned_loss=0.06851, over 4700413.97 frames. ], batch size: 69, lr: 1.01e-02, grad_scale: 16.0 2023-09-29 11:42:34,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:42:36,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 11:42:36,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:42:36,506 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=353720.0, ans=0.1 2023-09-29 11:42:40,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:42:43,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:42:45,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:42:45,354 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:42:45,656 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=353720.0, ans=0.1 2023-09-29 11:42:50,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:42:51,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:42:52,469 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.53 vs. limit=15.0 2023-09-29 11:42:52,527 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.63 vs. limit=22.5 2023-09-29 11:42:56,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:42:58,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 11:42:58,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 11:42:58,324 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:42:59,864 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:43:07,621 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=353853.3333333333, ans=0.125 2023-09-29 11:43:21,186 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=353920.0, ans=0.125 2023-09-29 11:43:34,902 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=353986.6666666667, ans=0.125 2023-09-29 11:43:43,429 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=353986.6666666667, ans=0.125 2023-09-29 11:43:48,679 INFO [train.py:1039] (2/4) Epoch 10, batch 5300, loss[loss=0.2235, simple_loss=0.2794, pruned_loss=0.08384, over 23807.00 frames. ], tot_loss[loss=0.2051, simple_loss=0.2739, pruned_loss=0.06818, over 4680513.43 frames. ], batch size: 212, lr: 1.01e-02, grad_scale: 16.0 2023-09-29 11:43:54,366 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.719e+02 1.989e+02 2.153e+02 2.436e+02 4.114e+02, threshold=4.306e+02, percent-clipped=0.0 2023-09-29 11:44:05,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:44:05,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 11:44:05,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 11:44:05,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:44:05,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:44:05,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:44:06,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:44:06,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:44:06,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:44:06,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:44:06,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 11:44:07,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:44:07,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 11:44:07,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 11:44:07,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 11:44:07,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 11:44:07,559 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 11:44:07,690 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 11:44:07,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:44:08,360 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:44:08,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:44:08,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:44:08,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:44:09,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:44:09,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:44:09,288 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:44:09,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:44:09,909 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:44:09,918 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:44:09,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:44:09,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:44:10,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 11:44:10,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:44:11,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:44:11,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 11:44:11,474 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 11:44:11,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 11:44:11,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:44:11,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 11:44:11,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 11:44:12,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 11:44:12,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:44:12,894 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:44:13,043 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 11:44:13,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 11:44:13,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 11:44:13,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:44:13,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 11:44:14,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 11:44:14,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 11:44:14,365 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 11:44:22,147 INFO [train.py:1039] (2/4) Epoch 11, batch 0, loss[loss=0.2109, simple_loss=0.2863, pruned_loss=0.06777, over 23468.00 frames. ], tot_loss[loss=0.2109, simple_loss=0.2863, pruned_loss=0.06777, over 23468.00 frames. ], batch size: 93, lr: 9.67e-03, grad_scale: 32.0 2023-09-29 11:44:22,148 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-29 11:44:36,233 INFO [train.py:1071] (2/4) Epoch 11, validation: loss=0.3103, simple_loss=0.2886, pruned_loss=0.166, over 1125622.00 frames. 2023-09-29 11:44:36,233 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21046MB 2023-09-29 11:44:38,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 11:44:38,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:44:42,060 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:44:47,730 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=5.07 vs. limit=12.0 2023-09-29 11:44:48,564 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:44:48,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:44:48,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:44:48,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 11:44:50,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 11:44:53,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:44:54,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:44:57,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:44:57,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:44:59,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:44:59,088 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:45:00,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 11:45:02,209 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:45:08,365 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=354273.3333333333, ans=0.125 2023-09-29 11:45:11,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:45:11,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:45:13,764 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 11:45:18,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:45:18,890 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:45:20,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:45:26,916 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:45:33,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:45:35,085 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=354340.0, ans=0.0 2023-09-29 11:45:36,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 11:45:39,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 11:45:41,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:45:41,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:45:42,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:45:44,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:45:45,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 11:45:49,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:45:50,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:45:53,612 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.14 vs. limit=15.0 2023-09-29 11:45:54,458 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:45:57,370 INFO [train.py:1039] (2/4) Epoch 11, batch 50, loss[loss=0.1775, simple_loss=0.2586, pruned_loss=0.04822, over 24555.00 frames. ], tot_loss[loss=0.2023, simple_loss=0.2734, pruned_loss=0.06562, over 1068199.07 frames. ], batch size: 60, lr: 9.67e-03, grad_scale: 16.0 2023-09-29 11:45:57,585 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 11:46:01,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:46:02,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:46:05,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:46:07,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 11:46:07,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 11:46:07,423 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=354473.3333333333, ans=0.125 2023-09-29 11:46:08,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:46:09,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:46:11,543 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:46:11,837 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=354540.0, ans=0.0 2023-09-29 11:46:14,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:46:16,920 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=354540.0, ans=15.0 2023-09-29 11:46:17,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 11:46:17,603 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:46:24,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 11:46:26,705 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=354606.6666666667, ans=0.2 2023-09-29 11:46:28,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 11:46:30,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 11:46:32,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:46:32,524 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=354606.6666666667, ans=0.125 2023-09-29 11:46:34,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:46:34,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:46:34,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:46:35,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 11:46:35,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 11:46:35,949 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:46:43,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:46:45,452 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:46:45,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 11:46:46,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 11:46:48,481 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:46:49,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:46:49,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 11:46:50,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:46:51,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 11:46:55,322 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.96 vs. limit=15.0 2023-09-29 11:47:01,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:47:01,071 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:47:02,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:47:04,546 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 1.921e+02 2.105e+02 2.466e+02 3.711e+02, threshold=4.210e+02, percent-clipped=0.0 2023-09-29 11:47:04,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:47:04,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 11:47:05,028 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer_ff3.min_abs, batch_count=354740.0, ans=0.2 2023-09-29 11:47:07,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 11:47:07,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 11:47:09,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:47:09,871 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 11:47:11,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:47:12,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:47:12,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 11:47:14,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 11:47:14,422 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 11:47:16,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:47:17,338 INFO [train.py:1039] (2/4) Epoch 11, batch 100, loss[loss=0.1952, simple_loss=0.2718, pruned_loss=0.05934, over 24661.00 frames. ], tot_loss[loss=0.2029, simple_loss=0.2738, pruned_loss=0.06597, over 1882989.82 frames. ], batch size: 65, lr: 9.66e-03, grad_scale: 16.0 2023-09-29 11:47:17,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:47:18,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 11:47:18,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 11:47:19,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:47:20,530 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:47:22,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 11:47:22,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:47:26,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:47:28,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:47:32,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:47:34,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 11:47:34,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:47:36,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:47:36,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:47:38,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:47:38,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:47:38,725 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:47:40,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 11:47:42,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 11:47:42,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:47:42,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:47:42,481 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:47:47,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 11:47:47,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:47:49,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:47:49,412 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 11:47:51,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 11:47:53,239 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=354940.0, ans=0.125 2023-09-29 11:47:54,476 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 11:47:54,501 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 11:47:56,034 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:47:56,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:48:00,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 11:48:03,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:48:05,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:48:09,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:48:11,180 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 11:48:12,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 11:48:17,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 11:48:18,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:48:19,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:48:22,569 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:48:25,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:48:27,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:48:30,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:48:31,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:48:33,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:48:33,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:48:33,160 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:48:34,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 11:48:34,647 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 11:48:34,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:48:36,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:48:37,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:48:37,659 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:48:37,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 11:48:37,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 11:48:37,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 11:48:37,812 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:48:38,481 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.95 vs. limit=15.0 2023-09-29 11:48:39,660 INFO [train.py:1039] (2/4) Epoch 11, batch 150, loss[loss=0.1817, simple_loss=0.2599, pruned_loss=0.05179, over 24662.00 frames. ], tot_loss[loss=0.2047, simple_loss=0.2763, pruned_loss=0.06652, over 2523592.82 frames. ], batch size: 65, lr: 9.66e-03, grad_scale: 16.0 2023-09-29 11:48:39,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:48:41,903 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:48:43,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:48:43,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:48:45,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:48:47,395 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=355140.0, ans=0.125 2023-09-29 11:48:48,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:48:48,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:48:48,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:48:52,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:48:53,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:48:58,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 11:48:59,569 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:49:02,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 11:49:02,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 11:49:02,750 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 11:49:07,174 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:49:07,182 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:49:07,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:49:08,755 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:49:08,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:49:10,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:49:10,261 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:49:11,817 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-29 11:49:13,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:49:20,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:49:25,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:49:26,787 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 11:49:29,023 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=355340.0, ans=15.0 2023-09-29 11:49:29,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:49:29,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:49:30,024 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:49:33,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:49:35,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:49:36,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:49:36,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:49:36,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 11:49:41,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:49:42,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:49:42,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:49:42,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:49:43,385 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=355406.6666666667, ans=0.125 2023-09-29 11:49:44,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:49:46,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 11:49:47,850 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.912e+02 2.159e+02 2.654e+02 4.388e+02, threshold=4.317e+02, percent-clipped=1.0 2023-09-29 11:49:48,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:49:48,316 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=355406.6666666667, ans=0.0 2023-09-29 11:49:50,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:49:54,063 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:49:55,603 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:49:55,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 11:49:56,328 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten.whitening_limit, batch_count=355406.6666666667, ans=22.5 2023-09-29 11:49:57,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:49:57,076 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 11:49:58,063 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.16 vs. limit=22.5 2023-09-29 11:50:01,514 INFO [train.py:1039] (2/4) Epoch 11, batch 200, loss[loss=0.2034, simple_loss=0.2768, pruned_loss=0.06501, over 23951.00 frames. ], tot_loss[loss=0.2067, simple_loss=0.2775, pruned_loss=0.06794, over 3001222.35 frames. ], batch size: 86, lr: 9.65e-03, grad_scale: 16.0 2023-09-29 11:50:01,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:50:04,259 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=355473.3333333333, ans=0.1 2023-09-29 11:50:05,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:50:05,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:50:08,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 11:50:09,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:50:09,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:50:13,134 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 11:50:14,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 11:50:16,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:50:17,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:50:20,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:50:22,321 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:50:22,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:50:36,134 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.28 vs. limit=12.0 2023-09-29 11:50:43,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:50:43,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:50:44,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:50:44,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:50:46,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 11:50:46,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 11:50:47,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:50:48,565 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.21 vs. limit=10.0 2023-09-29 11:50:49,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:50:50,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:50:50,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:50:51,065 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=355673.3333333333, ans=0.125 2023-09-29 11:50:52,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 11:50:53,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 11:50:53,944 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:50:58,554 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=355673.3333333333, ans=0.125 2023-09-29 11:50:58,748 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=355673.3333333333, ans=0.125 2023-09-29 11:51:00,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:51:02,795 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=355673.3333333333, ans=0.2 2023-09-29 11:51:05,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:51:12,653 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:51:14,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:51:14,392 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=355740.0, ans=0.125 2023-09-29 11:51:22,378 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:51:23,779 INFO [train.py:1039] (2/4) Epoch 11, batch 250, loss[loss=0.2171, simple_loss=0.2703, pruned_loss=0.082, over 23596.00 frames. ], tot_loss[loss=0.205, simple_loss=0.2759, pruned_loss=0.06708, over 3386408.28 frames. ], batch size: 256, lr: 9.65e-03, grad_scale: 16.0 2023-09-29 11:51:25,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 11:51:25,464 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:51:25,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:51:25,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:51:26,974 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:51:28,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 11:51:29,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:51:29,994 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 11:51:31,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:51:33,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:51:33,798 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:51:35,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:51:37,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:51:37,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:51:40,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:51:44,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:51:44,664 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=355873.3333333333, ans=0.125 2023-09-29 11:51:54,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:51:57,122 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:51:57,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:51:57,831 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.01 vs. limit=15.0 2023-09-29 11:52:03,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 11:52:05,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 11:52:05,492 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=355940.0, ans=0.95 2023-09-29 11:52:07,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:52:07,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:52:07,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 11:52:07,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:52:07,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:52:08,119 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.73 vs. limit=15.0 2023-09-29 11:52:09,259 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=355940.0, ans=0.0 2023-09-29 11:52:10,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:52:10,853 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=355940.0, ans=0.125 2023-09-29 11:52:12,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 11:52:12,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:52:14,339 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=356006.6666666667, ans=0.2 2023-09-29 11:52:15,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:52:15,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:52:15,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:52:15,823 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=356006.6666666667, ans=0.95 2023-09-29 11:52:17,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:52:19,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:52:19,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 11:52:20,677 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:52:23,637 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:52:23,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:52:27,502 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 11:52:27,748 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=356006.6666666667, ans=0.2 2023-09-29 11:52:30,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:52:32,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:52:33,628 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 1.944e+02 2.181e+02 2.498e+02 3.489e+02, threshold=4.363e+02, percent-clipped=0.0 2023-09-29 11:52:38,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:52:41,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:52:45,046 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 11:52:46,474 INFO [train.py:1039] (2/4) Epoch 11, batch 300, loss[loss=0.2013, simple_loss=0.2883, pruned_loss=0.05718, over 24667.00 frames. ], tot_loss[loss=0.2025, simple_loss=0.2731, pruned_loss=0.06591, over 3676290.91 frames. ], batch size: 73, lr: 9.64e-03, grad_scale: 16.0 2023-09-29 11:52:46,553 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:52:46,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:52:48,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 11:52:48,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 11:52:49,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:52:49,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 11:52:50,486 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.29 vs. limit=15.0 2023-09-29 11:52:53,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:52:55,657 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:53:00,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:53:00,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 11:53:01,805 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:53:03,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 11:53:03,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 11:53:03,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:53:08,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:53:11,751 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:53:11,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 11:53:15,624 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 11:53:15,712 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:53:18,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:53:19,250 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.78 vs. limit=12.0 2023-09-29 11:53:20,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:53:20,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 11:53:20,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:53:23,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:53:25,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:53:27,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:53:27,800 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=356273.3333333333, ans=0.2 2023-09-29 11:53:33,082 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 11:53:33,089 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 11:53:33,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:53:33,523 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=356273.3333333333, ans=0.0 2023-09-29 11:53:36,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:53:37,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 11:53:39,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:53:42,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:53:47,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:53:47,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 11:53:51,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:53:51,715 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 11:53:54,871 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:53:56,353 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:53:56,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 11:53:57,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 11:53:59,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:54:00,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 11:54:02,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:54:03,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:54:05,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:54:05,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:54:06,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:54:09,944 INFO [train.py:1039] (2/4) Epoch 11, batch 350, loss[loss=0.1955, simple_loss=0.2802, pruned_loss=0.05534, over 24292.00 frames. ], tot_loss[loss=0.2029, simple_loss=0.2725, pruned_loss=0.06668, over 3903416.83 frames. ], batch size: 74, lr: 9.64e-03, grad_scale: 16.0 2023-09-29 11:54:11,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:54:11,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 11:54:14,618 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:54:19,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:54:21,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:54:22,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:54:27,786 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 11:54:29,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:54:29,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 11:54:33,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:54:33,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 11:54:35,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:54:37,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 11:54:39,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:54:41,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:54:42,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:54:44,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:54:44,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:54:44,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:54:44,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:54:46,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:54:47,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:54:47,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:54:56,079 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=356606.6666666667, ans=0.2 2023-09-29 11:54:57,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:54:57,286 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 11:54:57,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:54:57,406 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:55:02,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 11:55:02,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:55:08,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:55:08,674 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:55:08,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:55:10,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 11:55:12,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:55:14,070 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 11:55:16,216 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 11:55:16,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:55:19,164 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.719e+02 1.993e+02 2.217e+02 2.521e+02 3.405e+02, threshold=4.434e+02, percent-clipped=0.0 2023-09-29 11:55:19,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:55:19,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 11:55:22,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:55:23,074 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.30 vs. limit=15.0 2023-09-29 11:55:25,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:55:25,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:55:26,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:55:26,940 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:55:30,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:55:32,090 INFO [train.py:1039] (2/4) Epoch 11, batch 400, loss[loss=0.1951, simple_loss=0.2799, pruned_loss=0.0551, over 24295.00 frames. ], tot_loss[loss=0.2018, simple_loss=0.2717, pruned_loss=0.06596, over 4090034.61 frames. ], batch size: 74, lr: 9.64e-03, grad_scale: 32.0 2023-09-29 11:55:33,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:55:37,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:55:38,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 11:55:38,838 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:55:38,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:55:41,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:55:42,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:55:42,382 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=356806.6666666667, ans=0.1 2023-09-29 11:55:45,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:55:47,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:55:49,032 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 11:55:51,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 11:55:51,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:55:52,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 11:55:52,790 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=356873.3333333333, ans=0.05 2023-09-29 11:55:53,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:55:57,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:55:57,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:55:57,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 11:55:57,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:55:58,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:55:58,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:56:00,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:56:01,773 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 11:56:01,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 11:56:03,696 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=356940.0, ans=0.0 2023-09-29 11:56:07,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:56:10,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:56:10,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 11:56:11,819 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 11:56:13,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:56:15,274 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:56:24,842 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 11:56:27,080 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 11:56:27,412 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=357006.6666666667, ans=0.2 2023-09-29 11:56:28,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 11:56:30,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:56:31,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:56:31,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 11:56:36,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:56:41,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 11:56:42,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:56:45,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:56:46,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 11:56:47,090 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=357073.3333333333, ans=0.125 2023-09-29 11:56:48,125 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 11:56:48,557 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=357073.3333333333, ans=0.1 2023-09-29 11:56:49,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 11:56:52,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:56:52,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:56:54,756 INFO [train.py:1039] (2/4) Epoch 11, batch 450, loss[loss=0.2108, simple_loss=0.278, pruned_loss=0.07179, over 23734.00 frames. ], tot_loss[loss=0.2022, simple_loss=0.2725, pruned_loss=0.06596, over 4229193.51 frames. ], batch size: 179, lr: 9.63e-03, grad_scale: 32.0 2023-09-29 11:56:54,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 11:56:56,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 11:56:57,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:56:59,359 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 11:56:59,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 11:57:01,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:57:01,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:57:01,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 11:57:01,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 11:57:03,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:57:03,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:57:06,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:57:16,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:57:16,156 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:57:19,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 11:57:20,006 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.68 vs. limit=15.0 2023-09-29 11:57:21,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 11:57:23,605 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.85 vs. limit=15.0 2023-09-29 11:57:24,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 11:57:26,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:57:28,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:57:32,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:57:34,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:57:36,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 11:57:37,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 11:57:40,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 11:57:40,707 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:57:40,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:57:42,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:57:44,220 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 11:57:44,234 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 11:57:44,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:57:44,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:57:44,659 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=357340.0, ans=0.2 2023-09-29 11:57:46,022 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 11:57:48,295 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 11:57:49,777 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 11:57:51,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 11:57:52,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 11:57:54,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:57:57,920 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 11:57:57,966 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 11:58:00,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 11:58:04,477 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.885e+02 2.166e+02 2.451e+02 4.204e+02, threshold=4.332e+02, percent-clipped=0.0 2023-09-29 11:58:04,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:58:06,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 11:58:06,352 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=357406.6666666667, ans=0.125 2023-09-29 11:58:07,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 11:58:09,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:58:16,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:58:17,677 INFO [train.py:1039] (2/4) Epoch 11, batch 500, loss[loss=0.2223, simple_loss=0.2945, pruned_loss=0.07505, over 23826.00 frames. ], tot_loss[loss=0.2028, simple_loss=0.2734, pruned_loss=0.06609, over 4336525.17 frames. ], batch size: 85, lr: 9.63e-03, grad_scale: 32.0 2023-09-29 11:58:17,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:58:19,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:58:19,443 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 11:58:24,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:58:24,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:58:26,134 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:58:26,151 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 11:58:27,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 11:58:27,682 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:58:30,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 11:58:32,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 11:58:35,855 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:58:37,874 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:58:37,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:58:38,259 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=357540.0, ans=0.07 2023-09-29 11:58:39,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:58:49,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:58:50,186 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.83 vs. limit=12.0 2023-09-29 11:58:50,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 11:58:50,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 11:58:50,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:58:52,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 11:58:52,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:58:55,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:58:55,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:58:57,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:58:57,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:58:58,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 11:59:02,563 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 11:59:06,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:59:07,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:59:08,388 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.59 vs. limit=22.5 2023-09-29 11:59:09,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:59:09,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:59:09,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 11:59:10,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 11:59:15,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:59:17,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:59:19,463 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=357673.3333333333, ans=0.125 2023-09-29 11:59:20,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:59:23,016 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 11:59:24,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:59:31,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:59:35,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 11:59:35,252 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:59:35,271 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:59:38,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 11:59:39,852 INFO [train.py:1039] (2/4) Epoch 11, batch 550, loss[loss=0.179, simple_loss=0.2543, pruned_loss=0.05183, over 24606.00 frames. ], tot_loss[loss=0.2032, simple_loss=0.2739, pruned_loss=0.0662, over 4411949.21 frames. ], batch size: 60, lr: 9.62e-03, grad_scale: 32.0 2023-09-29 11:59:39,992 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 11:59:41,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:59:46,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 11:59:48,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 11:59:48,958 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:59:50,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 11:59:50,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:59:50,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:59:52,002 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:59:52,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:59:52,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:59:53,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:59:56,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:59:56,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 11:59:58,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:00:03,044 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:00:04,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:00:05,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:00:06,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:00:10,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 12:00:10,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 12:00:13,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:00:16,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:00:16,864 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:00:19,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 12:00:23,365 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:00:23,376 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 12:00:23,512 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:00:23,782 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=357940.0, ans=0.1 2023-09-29 12:00:25,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 12:00:28,187 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:00:28,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 12:00:28,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 12:00:30,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:00:30,502 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=358006.6666666667, ans=0.125 2023-09-29 12:00:31,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 12:00:33,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 12:00:34,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:00:34,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:00:34,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:00:34,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:00:36,901 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=358006.6666666667, ans=0.5 2023-09-29 12:00:38,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:00:40,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:00:43,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:00:43,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:00:43,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 12:00:46,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 12:00:48,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:00:49,667 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 12:00:49,752 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:00:51,044 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.720e+02 2.069e+02 2.330e+02 2.802e+02 5.186e+02, threshold=4.661e+02, percent-clipped=1.0 2023-09-29 12:00:51,569 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=358073.3333333333, ans=0.125 2023-09-29 12:00:53,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 12:00:53,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 12:00:59,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 12:01:02,386 INFO [train.py:1039] (2/4) Epoch 11, batch 600, loss[loss=0.2062, simple_loss=0.2847, pruned_loss=0.06389, over 24402.00 frames. ], tot_loss[loss=0.2041, simple_loss=0.2748, pruned_loss=0.06672, over 4480544.92 frames. ], batch size: 69, lr: 9.62e-03, grad_scale: 16.0 2023-09-29 12:01:02,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 12:01:04,663 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:01:04,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 12:01:06,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:01:06,392 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=358140.0, ans=0.1 2023-09-29 12:01:14,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:01:14,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 12:01:17,496 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 12:01:20,458 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:01:20,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:01:22,225 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:01:22,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=358206.6666666667, ans=0.125 2023-09-29 12:01:25,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 12:01:25,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:01:31,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 12:01:34,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:01:34,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:01:34,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:01:39,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:01:39,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:01:41,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:01:49,353 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:01:49,912 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.61 vs. limit=15.0 2023-09-29 12:01:52,720 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=358340.0, ans=0.1 2023-09-29 12:01:53,996 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:01:54,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:01:54,015 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:02:02,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 12:02:06,621 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.58 vs. limit=15.0 2023-09-29 12:02:08,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 12:02:08,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:02:13,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 12:02:13,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:02:14,120 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=358406.6666666667, ans=0.125 2023-09-29 12:02:17,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 12:02:17,076 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:02:17,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:02:24,979 INFO [train.py:1039] (2/4) Epoch 11, batch 650, loss[loss=0.1871, simple_loss=0.2396, pruned_loss=0.06729, over 23559.00 frames. ], tot_loss[loss=0.2036, simple_loss=0.2732, pruned_loss=0.06698, over 4524906.01 frames. ], batch size: 256, lr: 9.61e-03, grad_scale: 16.0 2023-09-29 12:02:25,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 12:02:26,663 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 12:02:29,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:02:31,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:02:33,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:02:35,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 12:02:37,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:02:43,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 12:02:43,929 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:02:45,891 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=358540.0, ans=0.07 2023-09-29 12:02:47,226 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:02:49,012 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=358540.0, ans=0.0 2023-09-29 12:02:51,723 WARNING [train.py:1197] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 12:02:53,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:02:55,183 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:02:58,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:02:58,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 12:03:01,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:03:01,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:03:02,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 12:03:03,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:03:05,158 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:03:09,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 12:03:09,486 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 12:03:09,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:03:09,531 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:03:13,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:03:14,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:03:14,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:03:14,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:03:16,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 12:03:18,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:03:18,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:03:18,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 12:03:18,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:03:19,235 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=358673.3333333333, ans=0.2 2023-09-29 12:03:20,664 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=358673.3333333333, ans=0.1 2023-09-29 12:03:21,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 12:03:23,278 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 12:03:24,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 12:03:24,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:03:24,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:03:24,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:03:26,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:03:28,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:03:33,272 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:03:33,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:03:35,470 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:03:37,067 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.924e+02 2.251e+02 2.757e+02 4.294e+02, threshold=4.503e+02, percent-clipped=0.0 2023-09-29 12:03:37,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:03:37,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 12:03:37,329 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:03:46,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 12:03:46,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:03:46,331 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:03:46,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:03:48,381 INFO [train.py:1039] (2/4) Epoch 11, batch 700, loss[loss=0.1955, simple_loss=0.2786, pruned_loss=0.05616, over 24629.00 frames. ], tot_loss[loss=0.2026, simple_loss=0.2724, pruned_loss=0.06637, over 4569901.50 frames. ], batch size: 73, lr: 9.61e-03, grad_scale: 16.0 2023-09-29 12:03:52,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 12:03:52,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 12:03:52,557 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=358806.6666666667, ans=0.125 2023-09-29 12:03:55,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 12:03:56,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:03:58,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:03:58,754 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=358806.6666666667, ans=0.125 2023-09-29 12:04:00,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 12:04:03,970 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:04:09,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:04:10,043 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.01 vs. limit=15.0 2023-09-29 12:04:10,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:04:12,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 12:04:12,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:04:14,192 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=358873.3333333333, ans=0.125 2023-09-29 12:04:15,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:04:17,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 12:04:17,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:04:20,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 12:04:24,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 12:04:27,510 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 12:04:28,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:04:30,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:04:35,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:04:35,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 12:04:41,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:04:41,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 12:04:42,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 12:04:45,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:04:47,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:04:51,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:04:52,186 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=359006.6666666667, ans=0.125 2023-09-29 12:04:57,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:04:57,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 12:04:59,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 12:05:00,051 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 12:05:02,560 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=359073.3333333333, ans=0.0 2023-09-29 12:05:05,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:05:06,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:05:06,823 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=359073.3333333333, ans=0.125 2023-09-29 12:05:08,213 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:05:10,456 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:05:10,466 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 12:05:10,713 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=359140.0, ans=0.125 2023-09-29 12:05:12,363 INFO [train.py:1039] (2/4) Epoch 11, batch 750, loss[loss=0.2024, simple_loss=0.2808, pruned_loss=0.06202, over 24448.00 frames. ], tot_loss[loss=0.2021, simple_loss=0.2723, pruned_loss=0.06596, over 4613063.64 frames. ], batch size: 69, lr: 9.60e-03, grad_scale: 16.0 2023-09-29 12:05:15,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 12:05:15,640 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 12:05:15,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 12:05:17,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 12:05:17,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 12:05:17,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:05:18,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 12:05:20,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:05:20,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 12:05:22,333 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=359140.0, ans=0.125 2023-09-29 12:05:23,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:05:26,293 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:05:26,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 12:05:26,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:05:28,131 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:05:29,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 12:05:31,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:05:33,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:05:34,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:05:36,469 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 12:05:36,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:05:36,758 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=359206.6666666667, ans=0.0 2023-09-29 12:05:39,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:05:39,666 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:05:41,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 12:05:43,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 12:05:43,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:05:45,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 12:05:45,170 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 12:05:45,447 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=359273.3333333333, ans=0.125 2023-09-29 12:05:47,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 12:05:47,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 12:05:47,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 12:05:50,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:05:57,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 12:05:57,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:05:57,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 12:06:00,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:06:02,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:06:04,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 12:06:05,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:06:06,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 12:06:07,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:06:12,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:06:12,490 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=359340.0, ans=0.125 2023-09-29 12:06:14,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 12:06:14,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:06:18,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:06:20,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:06:22,364 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.648e+02 2.014e+02 2.278e+02 2.730e+02 4.361e+02, threshold=4.557e+02, percent-clipped=0.0 2023-09-29 12:06:22,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:06:25,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 12:06:29,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 12:06:29,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:06:30,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:06:32,419 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:06:33,782 INFO [train.py:1039] (2/4) Epoch 11, batch 800, loss[loss=0.2019, simple_loss=0.287, pruned_loss=0.0584, over 24451.00 frames. ], tot_loss[loss=0.2027, simple_loss=0.273, pruned_loss=0.06621, over 4641347.90 frames. ], batch size: 69, lr: 9.60e-03, grad_scale: 32.0 2023-09-29 12:06:33,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:06:35,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:06:35,704 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 12:06:41,988 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=359473.3333333333, ans=0.125 2023-09-29 12:06:45,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:06:45,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:06:47,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:06:47,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:06:50,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:06:50,079 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:06:52,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:06:55,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:06:57,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:07:01,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 12:07:02,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:07:02,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:07:02,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:07:04,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:07:04,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 12:07:04,254 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:07:04,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 12:07:05,357 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.45 vs. limit=10.0 2023-09-29 12:07:08,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:07:11,921 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:07:13,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:07:13,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:07:16,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:07:16,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:07:20,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:07:20,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:07:22,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 12:07:23,705 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 12:07:23,765 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 12:07:23,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 12:07:23,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:07:25,564 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 12:07:27,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:07:27,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:07:32,504 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 12:07:33,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 12:07:35,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 12:07:37,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 12:07:37,402 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=359673.3333333333, ans=0.2 2023-09-29 12:07:41,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:07:44,769 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:07:46,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 12:07:47,701 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:07:49,549 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=359740.0, ans=0.0 2023-09-29 12:07:50,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 12:07:55,864 INFO [train.py:1039] (2/4) Epoch 11, batch 850, loss[loss=0.206, simple_loss=0.2682, pruned_loss=0.07185, over 23500.00 frames. ], tot_loss[loss=0.2035, simple_loss=0.2736, pruned_loss=0.06671, over 4662386.68 frames. ], batch size: 134, lr: 9.60e-03, grad_scale: 16.0 2023-09-29 12:07:56,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:07:57,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:07:59,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 12:08:00,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:08:00,735 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:08:02,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 12:08:02,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:08:05,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:08:06,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:08:08,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 12:08:10,264 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:08:10,409 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 12:08:11,772 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 12:08:11,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 12:08:13,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:08:13,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:08:15,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:08:15,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:08:16,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 12:08:21,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:08:21,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:08:21,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 12:08:25,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 12:08:29,585 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:08:31,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 12:08:34,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 12:08:36,309 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 12:08:40,060 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 12:08:40,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:08:40,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:08:40,123 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 12:08:43,751 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:08:45,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:08:47,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 12:08:48,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:08:50,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:08:50,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:08:50,298 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 12:08:51,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:08:53,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 12:08:54,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 12:08:57,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:08:58,012 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:08:59,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:08:59,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:09:01,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:09:03,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:09:04,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 12:09:06,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 12:09:07,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:09:07,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 12:09:09,287 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.651e+02 2.084e+02 2.353e+02 2.728e+02 3.950e+02, threshold=4.707e+02, percent-clipped=0.0 2023-09-29 12:09:12,284 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.53 vs. limit=15.0 2023-09-29 12:09:16,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 12:09:18,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:09:19,683 INFO [train.py:1039] (2/4) Epoch 11, batch 900, loss[loss=0.2922, simple_loss=0.3327, pruned_loss=0.1259, over 19322.00 frames. ], tot_loss[loss=0.2045, simple_loss=0.2744, pruned_loss=0.06728, over 4674737.25 frames. ], batch size: 389, lr: 9.59e-03, grad_scale: 16.0 2023-09-29 12:09:19,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 12:09:19,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:09:19,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:09:21,758 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=360140.0, ans=0.2 2023-09-29 12:09:22,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 12:09:27,651 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:09:27,872 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=360140.0, ans=0.1 2023-09-29 12:09:29,375 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=360140.0, ans=0.0 2023-09-29 12:09:30,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:09:32,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 12:09:35,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:09:37,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 12:09:38,778 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 12:09:38,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:09:38,936 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:09:40,418 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 12:09:40,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:09:40,960 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.01 vs. limit=15.0 2023-09-29 12:09:50,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:09:52,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:09:52,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 12:09:52,568 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=360273.3333333333, ans=0.0 2023-09-29 12:09:56,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:10:02,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 12:10:04,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:10:07,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 12:10:07,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 12:10:09,092 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 12:10:11,084 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 12:10:15,829 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 12:10:15,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:10:17,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 12:10:20,846 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=360340.0, ans=0.0 2023-09-29 12:10:22,943 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=360406.6666666667, ans=0.125 2023-09-29 12:10:24,140 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:10:24,156 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:10:27,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 12:10:27,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:10:28,679 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 12:10:30,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:10:31,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:10:31,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:10:31,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:10:37,861 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 12:10:37,930 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 12:10:39,405 INFO [train.py:1039] (2/4) Epoch 11, batch 950, loss[loss=0.2624, simple_loss=0.3154, pruned_loss=0.1047, over 19620.00 frames. ], tot_loss[loss=0.2057, simple_loss=0.2755, pruned_loss=0.06792, over 4664556.96 frames. ], batch size: 389, lr: 9.59e-03, grad_scale: 16.0 2023-09-29 12:10:40,915 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 12:10:40,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 12:10:43,135 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:10:46,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 12:10:50,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:10:52,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:10:54,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:10:54,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 12:10:55,199 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=360540.0, ans=0.0 2023-09-29 12:10:58,468 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 12:11:01,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:11:01,728 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:11:03,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:11:03,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:11:03,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 12:11:04,842 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 12:11:06,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:11:06,910 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=360540.0, ans=0.2 2023-09-29 12:11:08,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 12:11:08,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:11:12,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:11:12,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:11:12,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:11:14,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 12:11:16,170 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 12:11:17,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:11:19,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:11:24,087 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:11:24,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:11:31,119 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 12:11:31,325 WARNING [train.py:1197] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 12:11:31,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 12:11:31,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:11:31,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:11:31,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:11:37,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 12:11:37,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:11:42,030 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:11:42,149 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:11:42,182 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 12:11:43,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:11:43,551 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:11:43,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 12:11:48,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 12:11:50,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:11:51,682 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.639e+02 1.923e+02 2.189e+02 2.546e+02 4.043e+02, threshold=4.378e+02, percent-clipped=0.0 2023-09-29 12:11:53,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:11:53,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 12:11:55,184 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 12:12:00,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:12:02,419 INFO [train.py:1039] (2/4) Epoch 11, batch 1000, loss[loss=0.1773, simple_loss=0.26, pruned_loss=0.04736, over 24635.00 frames. ], tot_loss[loss=0.2046, simple_loss=0.2745, pruned_loss=0.06737, over 4678259.60 frames. ], batch size: 60, lr: 9.58e-03, grad_scale: 16.0 2023-09-29 12:12:03,308 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.07 vs. limit=15.0 2023-09-29 12:12:05,530 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 12:12:05,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:12:10,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:12:11,795 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 12:12:11,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 12:12:16,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:12:16,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:12:19,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:12:21,673 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 12:12:24,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 12:12:26,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 12:12:26,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:12:29,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 12:12:31,820 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 12:12:31,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 12:12:33,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:12:35,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:12:44,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:12:44,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:12:45,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:12:47,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:12:47,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 12:12:47,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:12:49,022 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:12:49,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:12:49,200 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 12:12:55,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 12:12:55,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 12:12:56,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 12:13:00,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:13:09,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:13:09,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:13:10,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:13:10,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:13:12,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 12:13:13,807 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:13:13,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 12:13:15,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 12:13:16,891 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:13:16,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:13:18,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:13:20,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 12:13:21,803 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:13:23,132 INFO [train.py:1039] (2/4) Epoch 11, batch 1050, loss[loss=0.2171, simple_loss=0.2853, pruned_loss=0.07443, over 24440.00 frames. ], tot_loss[loss=0.2026, simple_loss=0.273, pruned_loss=0.06606, over 4699280.03 frames. ], batch size: 77, lr: 9.58e-03, grad_scale: 16.0 2023-09-29 12:13:24,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:13:26,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:13:27,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 12:13:29,344 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:13:30,985 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=361140.0, ans=0.125 2023-09-29 12:13:32,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:13:35,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:13:36,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 12:13:39,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:13:42,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:13:42,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:13:43,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:13:43,847 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=361206.6666666667, ans=0.125 2023-09-29 12:13:44,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 12:13:45,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:13:46,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 12:13:49,569 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:13:49,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 12:13:49,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 12:13:55,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:13:57,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 12:13:57,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:13:59,345 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.26 vs. limit=10.0 2023-09-29 12:14:00,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 12:14:00,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 12:14:00,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:14:03,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 12:14:07,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 12:14:07,644 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=361273.3333333333, ans=0.125 2023-09-29 12:14:08,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:14:09,083 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer_ff2.min_abs, batch_count=361273.3333333333, ans=0.1 2023-09-29 12:14:12,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 12:14:12,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 12:14:14,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:14:16,386 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:14:21,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:14:24,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 12:14:26,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 12:14:26,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 12:14:26,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:14:27,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:14:27,826 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 12:14:32,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:14:33,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:14:33,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:14:35,259 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 1.858e+02 2.245e+02 2.592e+02 4.386e+02, threshold=4.489e+02, percent-clipped=1.0 2023-09-29 12:14:35,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:14:35,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:14:35,819 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=361406.6666666667, ans=10.0 2023-09-29 12:14:38,633 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=361406.6666666667, ans=0.2 2023-09-29 12:14:40,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:14:40,590 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 12:14:42,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:14:42,026 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 12:14:42,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 12:14:43,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:14:44,800 INFO [train.py:1039] (2/4) Epoch 11, batch 1100, loss[loss=0.1825, simple_loss=0.2547, pruned_loss=0.05514, over 18396.00 frames. ], tot_loss[loss=0.2019, simple_loss=0.2721, pruned_loss=0.06587, over 4697187.20 frames. ], batch size: 40, lr: 9.57e-03, grad_scale: 16.0 2023-09-29 12:14:47,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:14:52,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:14:54,972 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=361473.3333333333, ans=0.0 2023-09-29 12:14:57,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 12:14:58,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=361473.3333333333, ans=0.1 2023-09-29 12:14:59,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:14:59,226 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:15:00,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 12:15:00,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:15:02,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 12:15:02,865 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=361540.0, ans=0.125 2023-09-29 12:15:04,352 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=361540.0, ans=0.0 2023-09-29 12:15:05,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:15:08,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:15:08,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 12:15:10,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 12:15:10,349 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=361540.0, ans=0.125 2023-09-29 12:15:12,329 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:15:12,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:15:13,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:15:15,477 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 12:15:19,966 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:15:22,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 12:15:24,377 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 12:15:24,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:15:28,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:15:29,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 12:15:29,480 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:15:31,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 12:15:31,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:15:31,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:15:31,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:15:32,617 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:15:32,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 12:15:39,065 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:15:39,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 12:15:40,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 12:15:44,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 12:15:50,061 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 12:15:50,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 12:15:51,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:15:54,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:15:56,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:15:57,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 12:15:59,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:15:59,188 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:16:00,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 12:16:02,656 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:16:02,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 12:16:04,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:16:04,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:16:05,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 12:16:08,851 INFO [train.py:1039] (2/4) Epoch 11, batch 1150, loss[loss=0.2138, simple_loss=0.2976, pruned_loss=0.06496, over 23975.00 frames. ], tot_loss[loss=0.2034, simple_loss=0.2735, pruned_loss=0.06662, over 4709940.42 frames. ], batch size: 80, lr: 9.57e-03, grad_scale: 16.0 2023-09-29 12:16:09,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:16:10,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:16:13,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:16:13,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:16:13,620 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 12:16:15,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:16:18,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 12:16:18,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:16:18,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 12:16:25,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 12:16:27,398 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:16:30,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:16:32,057 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:16:33,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 12:16:33,409 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 12:16:33,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:16:40,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 12:16:40,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:16:41,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:16:47,004 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=361940.0, ans=0.0 2023-09-29 12:16:50,010 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=361940.0, ans=0.1 2023-09-29 12:16:51,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:16:55,317 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=361940.0, ans=0.5 2023-09-29 12:16:56,820 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=362006.6666666667, ans=0.0 2023-09-29 12:16:58,844 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:17:00,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 12:17:00,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:17:00,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:17:08,480 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 12:17:10,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:17:19,603 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 12:17:21,037 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.993e+02 2.217e+02 2.557e+02 3.633e+02, threshold=4.434e+02, percent-clipped=0.0 2023-09-29 12:17:24,197 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:17:25,736 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:17:25,788 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 12:17:25,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:17:29,758 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=362140.0, ans=0.2 2023-09-29 12:17:30,748 INFO [train.py:1039] (2/4) Epoch 11, batch 1200, loss[loss=0.1791, simple_loss=0.253, pruned_loss=0.05254, over 24277.00 frames. ], tot_loss[loss=0.2039, simple_loss=0.2739, pruned_loss=0.06693, over 4715440.41 frames. ], batch size: 56, lr: 9.57e-03, grad_scale: 32.0 2023-09-29 12:17:30,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:17:31,745 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.42 vs. limit=15.0 2023-09-29 12:17:36,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:17:36,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 12:17:41,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:17:41,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:17:42,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:17:44,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:17:45,635 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 12:17:47,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:17:47,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:17:48,921 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 12:17:51,920 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 12:17:55,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:17:58,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:17:58,958 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=362206.6666666667, ans=0.125 2023-09-29 12:18:00,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:18:01,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:18:01,845 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 12:18:03,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:18:03,598 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=362273.3333333333, ans=0.1 2023-09-29 12:18:07,990 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.42 vs. limit=15.0 2023-09-29 12:18:12,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 12:18:12,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:18:12,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 12:18:12,442 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:18:17,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 12:18:20,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 12:18:20,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:18:20,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:18:22,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:18:22,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 12:18:25,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:18:25,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:18:25,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:18:27,565 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 12:18:28,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 12:18:29,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:18:29,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 12:18:30,645 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:18:30,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:18:35,206 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 12:18:36,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:18:40,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 12:18:46,498 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 12:18:48,019 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:18:48,886 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.67 vs. limit=10.0 2023-09-29 12:18:51,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:18:52,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:18:53,996 INFO [train.py:1039] (2/4) Epoch 11, batch 1250, loss[loss=0.1912, simple_loss=0.2753, pruned_loss=0.05352, over 24656.00 frames. ], tot_loss[loss=0.2035, simple_loss=0.2736, pruned_loss=0.06672, over 4724006.54 frames. ], batch size: 68, lr: 9.56e-03, grad_scale: 16.0 2023-09-29 12:18:54,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:18:57,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 12:19:01,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:19:03,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:19:05,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 12:19:05,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=362473.3333333333, ans=0.125 2023-09-29 12:19:07,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:19:08,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 12:19:08,855 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=362540.0, ans=0.125 2023-09-29 12:19:13,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 12:19:13,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:19:14,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 12:19:14,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:19:18,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 12:19:23,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 12:19:23,197 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 12:19:24,605 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:19:26,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:19:27,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:19:28,032 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=362606.6666666667, ans=0.1 2023-09-29 12:19:29,772 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=362606.6666666667, ans=0.125 2023-09-29 12:19:30,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:19:32,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 12:19:37,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 12:19:37,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:19:38,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:19:39,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 12:19:40,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:19:40,982 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 12:19:41,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:19:41,030 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:19:46,069 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=362673.3333333333, ans=0.125 2023-09-29 12:19:47,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:19:48,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:19:50,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:19:51,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 12:19:51,951 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 12:19:53,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 12:19:57,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:19:59,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 12:19:59,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:20:01,017 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=362740.0, ans=0.125 2023-09-29 12:20:02,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 12:20:02,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:20:05,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 12:20:05,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 12:20:05,301 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:20:06,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 12:20:06,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:20:08,092 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 1.898e+02 2.110e+02 2.286e+02 3.124e+02, threshold=4.219e+02, percent-clipped=0.0 2023-09-29 12:20:08,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 12:20:11,529 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:20:12,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:20:14,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 12:20:16,397 INFO [train.py:1039] (2/4) Epoch 11, batch 1300, loss[loss=0.2105, simple_loss=0.2641, pruned_loss=0.07848, over 22666.00 frames. ], tot_loss[loss=0.2032, simple_loss=0.2736, pruned_loss=0.06643, over 4722655.19 frames. ], batch size: 322, lr: 9.56e-03, grad_scale: 16.0 2023-09-29 12:20:16,592 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 12:20:18,890 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.47 vs. limit=10.0 2023-09-29 12:20:21,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:20:21,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 12:20:27,255 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:20:29,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 12:20:29,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:20:31,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:20:31,771 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:20:33,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 12:20:39,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:20:40,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 12:20:40,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 12:20:44,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 12:20:47,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:20:48,782 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:20:50,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:20:50,983 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 12:20:52,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:20:53,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 12:20:55,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 12:20:55,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 12:21:01,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:21:01,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 12:21:04,172 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 12:21:04,260 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 12:21:04,595 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=363006.6666666667, ans=0.0 2023-09-29 12:21:06,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:21:08,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:21:09,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 12:21:09,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:21:09,653 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 12:21:12,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:21:15,609 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:21:15,613 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:21:18,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 12:21:20,204 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 12:21:20,552 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=363073.3333333333, ans=0.2 2023-09-29 12:21:21,674 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 12:21:25,452 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:21:28,406 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 12:21:30,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:21:34,160 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.40 vs. limit=15.0 2023-09-29 12:21:39,069 INFO [train.py:1039] (2/4) Epoch 11, batch 1350, loss[loss=0.2041, simple_loss=0.2595, pruned_loss=0.07434, over 23761.00 frames. ], tot_loss[loss=0.2035, simple_loss=0.2735, pruned_loss=0.0667, over 4704626.82 frames. ], batch size: 164, lr: 9.55e-03, grad_scale: 16.0 2023-09-29 12:21:39,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 12:21:42,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:21:45,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:21:48,643 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:21:48,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:21:50,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:21:50,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:21:54,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:21:56,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 12:21:58,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 12:21:58,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:22:01,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 12:22:03,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:22:03,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:22:03,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 12:22:06,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 12:22:09,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 12:22:09,387 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=363206.6666666667, ans=0.125 2023-09-29 12:22:10,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:22:11,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 12:22:22,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:22:30,696 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=363340.0, ans=0.0 2023-09-29 12:22:32,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:22:32,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:22:32,650 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 12:22:35,265 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=363340.0, ans=0.125 2023-09-29 12:22:36,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:22:37,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 12:22:37,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 12:22:39,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:22:41,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:22:44,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 12:22:46,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:22:52,344 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.973e+02 2.286e+02 2.663e+02 4.619e+02, threshold=4.571e+02, percent-clipped=1.0 2023-09-29 12:22:52,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 12:22:54,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 12:22:59,930 INFO [train.py:1039] (2/4) Epoch 11, batch 1400, loss[loss=0.18, simple_loss=0.2522, pruned_loss=0.05386, over 24591.00 frames. ], tot_loss[loss=0.2026, simple_loss=0.2718, pruned_loss=0.06673, over 4701230.88 frames. ], batch size: 60, lr: 9.55e-03, grad_scale: 16.0 2023-09-29 12:23:01,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 12:23:01,958 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=363473.3333333333, ans=0.1 2023-09-29 12:23:01,993 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 12:23:03,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:23:04,753 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:23:04,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:23:05,096 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=363473.3333333333, ans=0.125 2023-09-29 12:23:13,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 12:23:14,864 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 12:23:21,005 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=363540.0, ans=0.125 2023-09-29 12:23:22,436 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=363540.0, ans=0.2 2023-09-29 12:23:28,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:23:29,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:23:31,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:23:31,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 12:23:33,047 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 12:23:35,866 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:23:35,989 WARNING [train.py:1197] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 12:23:44,692 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.82 vs. limit=15.0 2023-09-29 12:23:47,741 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:23:49,148 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:23:49,526 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=363673.3333333333, ans=0.0 2023-09-29 12:23:52,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 12:23:55,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:23:56,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:23:56,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:23:56,839 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:23:58,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:23:58,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:23:58,465 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:24:01,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 12:24:01,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:24:01,888 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=363673.3333333333, ans=0.2 2023-09-29 12:24:06,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:24:09,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:24:14,151 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 12:24:15,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 12:24:17,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:24:21,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 12:24:21,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:24:22,651 INFO [train.py:1039] (2/4) Epoch 11, batch 1450, loss[loss=0.1833, simple_loss=0.27, pruned_loss=0.04831, over 24339.00 frames. ], tot_loss[loss=0.202, simple_loss=0.272, pruned_loss=0.06601, over 4712527.77 frames. ], batch size: 74, lr: 9.54e-03, grad_scale: 16.0 2023-09-29 12:24:24,270 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:24:28,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 12:24:28,819 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=363806.6666666667, ans=0.125 2023-09-29 12:24:30,376 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=363806.6666666667, ans=0.125 2023-09-29 12:24:31,612 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:24:31,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:24:31,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 12:24:31,849 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=363806.6666666667, ans=0.125 2023-09-29 12:24:36,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:24:37,807 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 12:24:39,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:24:40,679 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 12:24:42,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 12:24:42,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 12:24:43,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:24:43,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:24:43,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 12:24:45,400 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:24:45,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:24:47,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 12:24:47,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:24:48,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:24:49,037 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=363873.3333333333, ans=0.125 2023-09-29 12:24:50,219 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:24:53,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:24:57,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:24:57,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:24:57,945 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 12:25:01,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:25:01,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:25:03,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:25:03,087 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:25:03,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:25:04,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:25:09,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 12:25:11,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:25:15,929 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 12:25:16,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:25:17,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 12:25:19,039 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:25:20,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 12:25:23,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:25:25,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 12:25:27,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 12:25:27,632 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:25:33,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:25:34,553 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:25:36,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 12:25:38,105 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.910e+02 2.158e+02 2.591e+02 3.926e+02, threshold=4.316e+02, percent-clipped=0.0 2023-09-29 12:25:38,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 12:25:39,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 12:25:41,357 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:25:42,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 12:25:45,890 INFO [train.py:1039] (2/4) Epoch 11, batch 1500, loss[loss=0.2258, simple_loss=0.295, pruned_loss=0.07832, over 24028.00 frames. ], tot_loss[loss=0.2028, simple_loss=0.2729, pruned_loss=0.06638, over 4724536.36 frames. ], batch size: 86, lr: 9.54e-03, grad_scale: 16.0 2023-09-29 12:25:53,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 12:25:54,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:25:54,983 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:25:55,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:25:56,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:25:56,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:25:58,164 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 12:26:01,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 12:26:01,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 12:26:01,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:26:02,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:26:04,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:26:06,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:26:12,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:26:12,516 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 12:26:12,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 12:26:12,922 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=364206.6666666667, ans=0.07 2023-09-29 12:26:14,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:26:14,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:26:17,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 12:26:20,815 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=364273.3333333333, ans=0.125 2023-09-29 12:26:21,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 12:26:23,556 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:26:24,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 12:26:26,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 12:26:29,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:26:30,904 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:26:30,933 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:26:31,088 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=364273.3333333333, ans=0.125 2023-09-29 12:26:33,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 12:26:34,018 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:26:34,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:26:36,000 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 12:26:36,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:26:42,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:26:42,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 12:26:50,076 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 12:26:51,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 12:26:54,827 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 12:26:54,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:26:54,936 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 12:26:56,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:26:58,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:26:59,336 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 12:27:00,821 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 12:27:02,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 12:27:05,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:27:06,856 INFO [train.py:1039] (2/4) Epoch 11, batch 1550, loss[loss=0.2083, simple_loss=0.2859, pruned_loss=0.06532, over 24446.00 frames. ], tot_loss[loss=0.2029, simple_loss=0.2732, pruned_loss=0.06625, over 4725678.38 frames. ], batch size: 63, lr: 9.54e-03, grad_scale: 16.0 2023-09-29 12:27:07,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:27:08,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:27:08,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:27:08,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:27:08,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:27:08,822 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=364473.3333333333, ans=0.1 2023-09-29 12:27:10,794 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 12:27:12,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 12:27:12,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:27:13,596 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 12:27:13,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 12:27:17,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:27:19,306 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:27:19,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:27:21,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:27:21,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:27:22,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:27:25,975 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 12:27:26,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:27:27,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:27:27,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 12:27:30,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 12:27:30,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 12:27:32,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:27:32,148 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 12:27:33,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 12:27:33,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 12:27:33,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:27:35,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:27:39,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:27:42,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 12:27:42,963 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 12:27:51,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:27:57,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:27:57,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 12:27:57,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:27:58,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 12:28:01,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 12:28:03,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:28:06,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:28:08,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:28:08,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:28:08,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 12:28:09,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 12:28:09,981 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=364673.3333333333, ans=0.2 2023-09-29 12:28:11,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:28:11,611 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=364740.0, ans=0.2 2023-09-29 12:28:12,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:28:14,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 12:28:14,080 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 12:28:15,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:28:21,304 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=364740.0, ans=0.125 2023-09-29 12:28:22,363 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.938e+02 2.255e+02 2.720e+02 4.386e+02, threshold=4.510e+02, percent-clipped=1.0 2023-09-29 12:28:22,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 12:28:28,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:28:29,656 INFO [train.py:1039] (2/4) Epoch 11, batch 1600, loss[loss=0.1924, simple_loss=0.2692, pruned_loss=0.05784, over 24332.00 frames. ], tot_loss[loss=0.2029, simple_loss=0.2735, pruned_loss=0.06615, over 4729963.26 frames. ], batch size: 74, lr: 9.53e-03, grad_scale: 16.0 2023-09-29 12:28:29,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:28:31,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 12:28:32,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 12:28:34,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:28:34,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 12:28:34,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:28:34,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:28:37,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:28:37,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 12:28:38,044 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=364806.6666666667, ans=0.125 2023-09-29 12:28:39,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 12:28:40,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 12:28:42,623 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=364806.6666666667, ans=0.0 2023-09-29 12:28:43,867 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:28:45,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 12:28:45,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:28:48,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:28:53,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:28:55,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 12:28:59,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:28:59,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 12:29:01,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:29:02,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 12:29:08,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 12:29:15,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:29:15,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 12:29:15,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:29:15,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:29:15,391 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:29:18,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 12:29:23,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 12:29:26,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:29:26,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:29:28,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:29:30,218 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:29:32,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:29:32,867 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=365006.6666666667, ans=0.125 2023-09-29 12:29:34,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:29:35,638 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 12:29:41,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:29:43,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:29:46,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 12:29:46,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:29:46,154 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 12:29:50,777 INFO [train.py:1039] (2/4) Epoch 11, batch 1650, loss[loss=0.1699, simple_loss=0.2456, pruned_loss=0.04705, over 24650.00 frames. ], tot_loss[loss=0.2036, simple_loss=0.2739, pruned_loss=0.06669, over 4724780.11 frames. ], batch size: 60, lr: 9.53e-03, grad_scale: 16.0 2023-09-29 12:29:50,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:29:52,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:29:53,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:29:53,924 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 12:29:53,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 12:29:53,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 12:29:55,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 12:29:58,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:29:58,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:30:00,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:30:00,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 12:30:02,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:30:04,200 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 12:30:07,301 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:30:07,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:30:07,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:30:07,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:30:08,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 12:30:08,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 12:30:09,046 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=365206.6666666667, ans=0.125 2023-09-29 12:30:16,414 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 12:30:18,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 12:30:24,602 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=365273.3333333333, ans=0.125 2023-09-29 12:30:27,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 12:30:27,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:30:28,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 12:30:34,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:30:36,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:30:37,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:30:37,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:30:39,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:30:39,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:30:43,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:30:45,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:30:45,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:30:45,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:30:45,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:30:45,480 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=365340.0, ans=10.0 2023-09-29 12:30:48,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 12:30:49,133 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.44 vs. limit=15.0 2023-09-29 12:30:51,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:30:51,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 12:30:52,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:30:54,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 12:30:55,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 12:30:57,251 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 12:30:57,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:30:57,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:30:58,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:30:58,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:30:58,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 12:30:59,439 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.18 vs. limit=15.0 2023-09-29 12:31:02,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:31:04,945 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.934e+02 2.140e+02 2.424e+02 3.897e+02, threshold=4.280e+02, percent-clipped=0.0 2023-09-29 12:31:05,046 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:31:05,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:31:09,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 12:31:11,712 INFO [train.py:1039] (2/4) Epoch 11, batch 1700, loss[loss=0.1888, simple_loss=0.2664, pruned_loss=0.05558, over 24476.00 frames. ], tot_loss[loss=0.2029, simple_loss=0.273, pruned_loss=0.06644, over 4726828.55 frames. ], batch size: 63, lr: 9.52e-03, grad_scale: 16.0 2023-09-29 12:31:11,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:31:11,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:31:14,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 12:31:14,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:31:15,076 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.69 vs. limit=15.0 2023-09-29 12:31:15,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:31:15,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:31:15,985 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=365473.3333333333, ans=0.125 2023-09-29 12:31:17,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:31:17,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:31:17,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 12:31:20,286 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 12:31:30,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:31:33,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:31:37,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 12:31:37,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:31:38,628 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:31:38,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:31:42,431 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 12:31:45,960 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:31:45,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:31:48,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 12:31:49,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 12:31:50,064 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=365606.6666666667, ans=0.0 2023-09-29 12:31:51,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 12:31:52,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 12:31:54,350 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:31:54,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 12:31:55,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:31:59,595 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=365673.3333333333, ans=0.125 2023-09-29 12:32:03,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:32:04,700 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.56 vs. limit=12.0 2023-09-29 12:32:05,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:32:06,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:32:08,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 12:32:08,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 12:32:08,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:32:10,584 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.35 vs. limit=15.0 2023-09-29 12:32:11,447 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:32:11,448 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 12:32:11,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:32:11,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:32:11,732 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=365673.3333333333, ans=0.04949747468305833 2023-09-29 12:32:13,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:32:13,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:32:17,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:32:17,563 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:32:17,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:32:19,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:32:19,187 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:32:19,449 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=365740.0, ans=0.0 2023-09-29 12:32:24,432 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:32:24,579 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 12:32:26,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:32:27,917 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:32:29,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 12:32:34,315 INFO [train.py:1039] (2/4) Epoch 11, batch 1750, loss[loss=0.1994, simple_loss=0.2599, pruned_loss=0.06939, over 23569.00 frames. ], tot_loss[loss=0.2018, simple_loss=0.2714, pruned_loss=0.06608, over 4723052.23 frames. ], batch size: 256, lr: 9.52e-03, grad_scale: 16.0 2023-09-29 12:32:34,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:32:34,720 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=365806.6666666667, ans=0.1 2023-09-29 12:32:37,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:32:37,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 12:32:38,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 12:32:38,999 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:32:42,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:32:42,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:32:47,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 12:32:50,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:32:54,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 12:32:54,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:32:56,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:32:59,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 12:33:00,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 12:33:02,521 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:33:02,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 12:33:11,739 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:33:14,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:33:14,884 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:33:15,647 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.68 vs. limit=15.0 2023-09-29 12:33:19,446 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:33:19,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:33:21,064 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:33:25,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:33:27,635 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:33:29,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:33:29,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 12:33:32,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:33:35,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 12:33:35,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:33:36,122 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=366006.6666666667, ans=0.1 2023-09-29 12:33:39,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:33:39,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:33:43,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 12:33:43,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 12:33:45,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:33:46,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:33:49,677 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 2.016e+02 2.265e+02 2.513e+02 4.125e+02, threshold=4.530e+02, percent-clipped=0.0 2023-09-29 12:33:51,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:33:54,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:33:55,897 INFO [train.py:1039] (2/4) Epoch 11, batch 1800, loss[loss=0.203, simple_loss=0.2846, pruned_loss=0.06071, over 24551.00 frames. ], tot_loss[loss=0.2013, simple_loss=0.2711, pruned_loss=0.06576, over 4720599.13 frames. ], batch size: 71, lr: 9.51e-03, grad_scale: 16.0 2023-09-29 12:33:56,000 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:33:56,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 12:33:56,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:33:57,694 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=15.18 vs. limit=15.0 2023-09-29 12:33:58,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 12:33:58,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:33:58,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 12:33:59,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:33:59,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:34:04,018 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:34:04,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:34:05,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 12:34:05,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=366140.0, ans=0.125 2023-09-29 12:34:09,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:34:10,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 12:34:13,967 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:34:17,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:34:18,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:34:20,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:34:20,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:34:23,485 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:34:23,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 12:34:23,612 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:34:26,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:34:31,900 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 12:34:33,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 12:34:34,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 12:34:34,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:34:36,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:34:36,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:34:37,049 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:34:44,444 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 12:34:44,711 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=366273.3333333333, ans=0.1 2023-09-29 12:34:44,817 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=366273.3333333333, ans=0.0 2023-09-29 12:34:45,939 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:34:47,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:34:50,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 12:34:50,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 12:34:52,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:34:53,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:34:55,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 12:35:00,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 12:35:04,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:35:06,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 12:35:07,058 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:35:07,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:35:07,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:35:09,193 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 12:35:12,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 12:35:12,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:35:16,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 12:35:16,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:35:19,622 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:35:19,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:35:19,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:35:21,029 INFO [train.py:1039] (2/4) Epoch 11, batch 1850, loss[loss=0.1845, simple_loss=0.2642, pruned_loss=0.05243, over 24548.00 frames. ], tot_loss[loss=0.2017, simple_loss=0.2716, pruned_loss=0.06595, over 4718595.72 frames. ], batch size: 60, lr: 9.51e-03, grad_scale: 16.0 2023-09-29 12:35:21,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:35:21,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:35:24,332 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:35:24,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:35:27,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:35:28,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:35:35,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:35:35,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 12:35:40,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 12:35:43,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 12:35:48,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:35:48,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 12:35:48,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 12:35:57,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:35:59,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 12:36:03,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:36:03,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:36:05,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 12:36:05,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:36:05,599 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 12:36:07,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:36:10,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:36:12,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:36:14,399 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=366673.3333333333, ans=0.125 2023-09-29 12:36:17,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 12:36:17,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:36:17,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 12:36:17,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:36:20,684 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:36:22,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:36:26,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 12:36:27,577 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:36:32,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:36:32,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:36:32,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 12:36:32,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 12:36:33,774 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 12:36:35,306 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 12:36:36,671 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.648e+02 2.114e+02 2.440e+02 2.839e+02 4.239e+02, threshold=4.880e+02, percent-clipped=0.0 2023-09-29 12:36:36,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 12:36:36,901 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:36:36,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:36:38,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:36:38,409 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 12:36:38,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:36:39,839 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:36:39,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 12:36:40,269 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=366740.0, ans=0.07 2023-09-29 12:36:41,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 12:36:42,842 INFO [train.py:1039] (2/4) Epoch 11, batch 1900, loss[loss=0.2188, simple_loss=0.2853, pruned_loss=0.0761, over 23843.00 frames. ], tot_loss[loss=0.2023, simple_loss=0.2727, pruned_loss=0.066, over 4718460.81 frames. ], batch size: 195, lr: 9.51e-03, grad_scale: 16.0 2023-09-29 12:36:43,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:36:43,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 12:36:45,048 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=366806.6666666667, ans=0.125 2023-09-29 12:36:46,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:36:46,213 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 12:36:46,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:36:47,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:36:55,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:36:56,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:36:57,051 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=366806.6666666667, ans=0.1 2023-09-29 12:36:58,088 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 12:37:00,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 12:37:00,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:37:01,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:37:01,855 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 12:37:01,895 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 12:37:07,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 12:37:09,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:37:10,991 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=366873.3333333333, ans=0.0 2023-09-29 12:37:13,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 12:37:15,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 12:37:23,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 12:37:25,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 12:37:25,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:37:27,479 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 12:37:27,486 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 12:37:27,558 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 12:37:28,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 12:37:28,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:37:33,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 12:37:35,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:37:40,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:37:40,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 12:37:43,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:37:47,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 12:37:47,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:37:56,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:37:56,226 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:37:56,257 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:37:57,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:37:59,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 12:37:59,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 12:38:01,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:38:04,371 INFO [train.py:1039] (2/4) Epoch 11, batch 1950, loss[loss=0.1663, simple_loss=0.2484, pruned_loss=0.04213, over 24482.00 frames. ], tot_loss[loss=0.2023, simple_loss=0.2728, pruned_loss=0.06589, over 4729972.82 frames. ], batch size: 63, lr: 9.50e-03, grad_scale: 16.0 2023-09-29 12:38:04,608 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:38:04,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 12:38:07,720 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:38:07,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:38:07,815 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:38:10,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:38:13,174 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 12:38:16,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:38:16,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:38:16,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 12:38:18,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 12:38:19,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 12:38:20,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:38:21,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:38:24,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:38:24,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:38:24,740 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=367206.6666666667, ans=0.025 2023-09-29 12:38:25,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:38:27,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:38:32,072 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 12:38:32,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 12:38:32,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:38:32,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:38:35,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:38:39,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 12:38:39,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:38:39,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 12:38:39,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 12:38:39,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 12:38:40,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:38:41,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:38:44,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:38:46,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:38:53,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 12:38:54,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:38:56,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 12:38:56,315 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 12:38:56,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:39:00,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:39:01,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:39:02,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:39:09,809 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:39:11,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:39:14,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:39:17,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:39:19,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:39:19,646 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:39:21,450 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.643e+02 1.994e+02 2.316e+02 2.639e+02 3.669e+02, threshold=4.632e+02, percent-clipped=0.0 2023-09-29 12:39:21,589 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 12:39:21,598 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 12:39:21,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:39:23,265 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 12:39:24,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:39:27,713 INFO [train.py:1039] (2/4) Epoch 11, batch 2000, loss[loss=0.184, simple_loss=0.2487, pruned_loss=0.05962, over 23675.00 frames. ], tot_loss[loss=0.2042, simple_loss=0.274, pruned_loss=0.0672, over 4717048.73 frames. ], batch size: 149, lr: 9.50e-03, grad_scale: 32.0 2023-09-29 12:39:29,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:39:30,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:39:32,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:39:33,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:39:35,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:39:37,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 12:39:39,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 12:39:42,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:39:44,460 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 12:39:46,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 12:39:46,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:39:46,247 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=367540.0, ans=0.0 2023-09-29 12:39:49,142 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=367540.0, ans=0.125 2023-09-29 12:39:50,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:39:51,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 12:39:54,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:39:58,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:39:59,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:39:59,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 12:39:59,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 12:40:02,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 12:40:02,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:40:03,861 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:40:03,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 12:40:03,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:40:05,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:40:06,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:40:06,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 12:40:08,634 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=367606.6666666667, ans=0.0 2023-09-29 12:40:11,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 12:40:11,416 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:40:11,439 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:40:12,179 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.59 vs. limit=15.0 2023-09-29 12:40:15,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:40:18,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:40:18,305 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 12:40:18,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:40:19,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:40:21,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:40:22,837 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 12:40:22,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:40:24,531 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:40:28,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:40:29,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 12:40:34,652 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.89 vs. limit=22.5 2023-09-29 12:40:35,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 12:40:35,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:40:39,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:40:39,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:40:41,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:40:44,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:40:44,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:40:44,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 12:40:44,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 12:40:48,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:40:50,390 INFO [train.py:1039] (2/4) Epoch 11, batch 2050, loss[loss=0.1943, simple_loss=0.2772, pruned_loss=0.05574, over 24288.00 frames. ], tot_loss[loss=0.2033, simple_loss=0.2733, pruned_loss=0.06663, over 4715242.37 frames. ], batch size: 74, lr: 9.49e-03, grad_scale: 32.0 2023-09-29 12:40:50,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:40:50,759 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=367806.6666666667, ans=0.1 2023-09-29 12:40:54,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:40:55,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:40:55,867 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.45 vs. limit=15.0 2023-09-29 12:41:01,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:41:03,366 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:41:03,461 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:41:05,626 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:41:07,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 12:41:07,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:41:08,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:41:10,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 12:41:18,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:41:18,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:41:21,313 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 12:41:24,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:41:26,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 12:41:26,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:41:29,576 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=367940.0, ans=0.04949747468305833 2023-09-29 12:41:30,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:41:32,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:41:34,045 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 12:41:34,124 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:41:35,622 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:41:35,777 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:41:35,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:41:39,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:41:41,560 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:41:41,751 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=368006.6666666667, ans=0.125 2023-09-29 12:41:43,174 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 12:41:44,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:41:47,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 12:41:54,936 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.24 vs. limit=22.5 2023-09-29 12:41:55,858 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:41:56,123 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=368073.3333333333, ans=0.0 2023-09-29 12:41:57,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 12:42:01,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:42:02,842 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:42:06,927 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.639e+02 2.014e+02 2.317e+02 2.683e+02 4.007e+02, threshold=4.634e+02, percent-clipped=0.0 2023-09-29 12:42:07,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:42:08,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 12:42:12,413 INFO [train.py:1039] (2/4) Epoch 11, batch 2100, loss[loss=0.2001, simple_loss=0.2893, pruned_loss=0.05543, over 24672.00 frames. ], tot_loss[loss=0.2021, simple_loss=0.2717, pruned_loss=0.06621, over 4714183.35 frames. ], batch size: 68, lr: 9.49e-03, grad_scale: 16.0 2023-09-29 12:42:14,044 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 12:42:14,044 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:42:14,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:42:15,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 12:42:15,684 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:42:15,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 12:42:17,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 12:42:18,761 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 12:42:21,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:42:21,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:42:25,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:42:25,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:42:25,127 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 12:42:27,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 12:42:28,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 12:42:28,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 12:42:29,257 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.46 vs. limit=15.0 2023-09-29 12:42:32,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:42:32,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:42:32,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 12:42:32,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 12:42:38,138 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 12:42:38,140 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 12:42:39,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:42:40,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:42:45,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:42:46,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 12:42:46,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:42:46,664 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 12:42:46,961 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=368273.3333333333, ans=0.0 2023-09-29 12:42:48,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 12:42:48,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:42:48,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 12:42:50,095 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 12:42:50,176 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 12:42:51,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:42:53,454 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:42:56,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 12:42:58,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 12:42:59,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:43:01,722 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:43:01,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 12:43:01,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:43:01,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:43:02,018 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=368340.0, ans=0.125 2023-09-29 12:43:03,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:43:03,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 12:43:04,913 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 12:43:06,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 12:43:09,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:43:10,318 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.41 vs. limit=12.0 2023-09-29 12:43:12,697 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:43:12,895 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=368340.0, ans=0.125 2023-09-29 12:43:14,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 12:43:19,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:43:21,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:43:22,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:43:22,944 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:43:22,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 12:43:23,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 12:43:25,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:43:25,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 12:43:26,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:43:27,378 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:43:28,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 12:43:30,679 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 12:43:30,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:43:32,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:43:32,333 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:43:34,231 INFO [train.py:1039] (2/4) Epoch 11, batch 2150, loss[loss=0.2113, simple_loss=0.2677, pruned_loss=0.07749, over 22777.00 frames. ], tot_loss[loss=0.2002, simple_loss=0.2703, pruned_loss=0.06507, over 4714403.99 frames. ], batch size: 322, lr: 9.48e-03, grad_scale: 16.0 2023-09-29 12:43:34,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:43:34,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:43:41,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 12:43:42,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:43:43,502 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.25 vs. limit=22.5 2023-09-29 12:43:44,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:43:44,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:43:44,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:43:45,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:43:51,011 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:43:51,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:43:51,122 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:43:56,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:43:56,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 12:44:02,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:44:04,033 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 12:44:05,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:44:05,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:44:05,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:44:05,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 12:44:07,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:44:07,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:44:07,275 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:44:09,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 12:44:12,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:44:12,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:44:13,378 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=368606.6666666667, ans=0.2 2023-09-29 12:44:14,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:44:14,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 12:44:16,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:44:17,921 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:44:17,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 12:44:19,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:44:19,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 12:44:19,527 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:44:22,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:44:24,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:44:25,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:44:27,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 12:44:29,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:44:29,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:44:29,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 12:44:32,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 12:44:32,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:44:32,953 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 12:44:33,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:44:33,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:44:33,452 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=368673.3333333333, ans=0.125 2023-09-29 12:44:34,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 12:44:34,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:44:34,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 12:44:34,677 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 12:44:34,678 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 12:44:34,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 12:44:37,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:44:39,155 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:44:39,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:44:40,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:44:42,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 12:44:45,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:44:45,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:44:52,250 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.938e+02 2.164e+02 2.545e+02 3.667e+02, threshold=4.328e+02, percent-clipped=0.0 2023-09-29 12:44:53,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:44:55,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 12:44:56,980 INFO [train.py:1039] (2/4) Epoch 11, batch 2200, loss[loss=0.182, simple_loss=0.2561, pruned_loss=0.0539, over 24321.00 frames. ], tot_loss[loss=0.1998, simple_loss=0.27, pruned_loss=0.06475, over 4724919.03 frames. ], batch size: 56, lr: 9.48e-03, grad_scale: 16.0 2023-09-29 12:44:57,219 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:44:59,070 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=368806.6666666667, ans=0.1 2023-09-29 12:45:01,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:45:01,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:45:01,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:45:05,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 12:45:06,074 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=368806.6666666667, ans=0.2 2023-09-29 12:45:07,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:45:08,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:45:08,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 12:45:10,335 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=368806.6666666667, ans=0.125 2023-09-29 12:45:13,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 12:45:15,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 12:45:20,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 12:45:20,485 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=368873.3333333333, ans=0.125 2023-09-29 12:45:22,404 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.11 vs. limit=15.0 2023-09-29 12:45:24,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:45:24,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:45:26,298 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:45:27,107 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.69 vs. limit=22.5 2023-09-29 12:45:28,356 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=368940.0, ans=0.0 2023-09-29 12:45:29,656 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:45:29,692 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 12:45:33,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 12:45:36,387 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:45:36,493 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 12:45:39,605 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.37 vs. limit=15.0 2023-09-29 12:45:40,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 12:45:41,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:45:43,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:45:45,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:45:49,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 12:45:50,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:45:51,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 12:45:55,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:45:55,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 12:45:55,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:45:55,762 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.max_abs, batch_count=369006.6666666667, ans=10.0 2023-09-29 12:45:57,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:45:58,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:45:58,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:46:00,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:46:01,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 12:46:01,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:46:01,957 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=369073.3333333333, ans=0.125 2023-09-29 12:46:04,755 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 12:46:06,958 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten.whitening_limit, batch_count=369073.3333333333, ans=22.5 2023-09-29 12:46:07,874 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 12:46:07,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:46:11,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 12:46:12,603 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 12:46:12,950 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=369073.3333333333, ans=0.125 2023-09-29 12:46:14,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 12:46:14,743 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 12:46:16,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 12:46:17,065 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 12:46:19,846 INFO [train.py:1039] (2/4) Epoch 11, batch 2250, loss[loss=0.2108, simple_loss=0.2804, pruned_loss=0.07064, over 23407.00 frames. ], tot_loss[loss=0.2009, simple_loss=0.271, pruned_loss=0.06539, over 4707436.96 frames. ], batch size: 93, lr: 9.48e-03, grad_scale: 16.0 2023-09-29 12:46:19,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:46:20,064 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 12:46:21,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:46:25,110 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 12:46:25,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:46:26,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 12:46:32,853 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.16 vs. limit=22.5 2023-09-29 12:46:33,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:46:33,592 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:46:36,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:46:38,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:46:38,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 12:46:41,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 12:46:42,741 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:46:42,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:46:44,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 12:46:46,532 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:46:46,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:46:48,116 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:46:52,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:46:53,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 12:46:53,679 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 12:46:55,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 12:46:57,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:46:59,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:47:05,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:47:06,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:47:07,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:47:07,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:47:10,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:47:12,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:47:16,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:47:20,557 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 12:47:20,828 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=369340.0, ans=0.125 2023-09-29 12:47:24,309 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.43 vs. limit=15.0 2023-09-29 12:47:25,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 12:47:25,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 12:47:25,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:47:32,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 12:47:35,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 12:47:35,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 12:47:35,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:47:36,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:47:38,350 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.418e+02 2.043e+02 2.273e+02 2.723e+02 4.405e+02, threshold=4.547e+02, percent-clipped=1.0 2023-09-29 12:47:39,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 12:47:42,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:47:43,469 INFO [train.py:1039] (2/4) Epoch 11, batch 2300, loss[loss=0.1928, simple_loss=0.2818, pruned_loss=0.0519, over 24254.00 frames. ], tot_loss[loss=0.2009, simple_loss=0.2712, pruned_loss=0.06525, over 4714437.39 frames. ], batch size: 74, lr: 9.47e-03, grad_scale: 16.0 2023-09-29 12:47:43,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:47:48,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:47:49,823 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:47:51,419 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 12:47:52,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:48:00,139 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:48:00,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 12:48:01,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:48:01,775 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=369540.0, ans=0.125 2023-09-29 12:48:03,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:48:03,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 12:48:03,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:48:05,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:48:05,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:48:07,069 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=369540.0, ans=0.125 2023-09-29 12:48:12,019 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:48:13,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 12:48:16,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:48:21,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:48:21,433 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:48:24,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:48:26,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:48:29,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:48:29,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 12:48:29,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:48:31,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 12:48:38,284 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 12:48:38,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:48:38,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:48:38,406 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:48:38,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:48:40,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 12:48:40,007 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 12:48:41,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 12:48:41,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:48:41,491 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:48:41,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 12:48:41,837 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=369673.3333333333, ans=0.125 2023-09-29 12:48:49,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:48:52,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:48:55,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:48:55,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:48:57,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 12:49:00,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 12:49:00,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:49:00,834 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=369740.0, ans=0.125 2023-09-29 12:49:02,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 12:49:04,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 12:49:05,529 INFO [train.py:1039] (2/4) Epoch 11, batch 2350, loss[loss=0.1982, simple_loss=0.2786, pruned_loss=0.05888, over 24433.00 frames. ], tot_loss[loss=0.2036, simple_loss=0.2736, pruned_loss=0.06681, over 4698261.29 frames. ], batch size: 66, lr: 9.47e-03, grad_scale: 16.0 2023-09-29 12:49:11,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:49:12,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 12:49:17,103 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.78 vs. limit=6.0 2023-09-29 12:49:17,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 12:49:22,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:49:25,861 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:49:27,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:49:27,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:49:27,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:49:28,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 12:49:32,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:49:36,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 12:49:38,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:49:42,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 12:49:42,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:49:45,974 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:49:47,487 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 12:49:48,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:49:50,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:49:50,472 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:49:50,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:49:55,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:49:57,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 12:49:58,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:50:00,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:50:00,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:50:04,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 12:50:05,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 12:50:07,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 12:50:08,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 12:50:11,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 12:50:15,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 12:50:17,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:50:17,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 12:50:17,968 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 12:50:18,012 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 12:50:19,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 12:50:22,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:50:23,826 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.674e+02 2.084e+02 2.454e+02 3.278e+02 4.890e+02, threshold=4.908e+02, percent-clipped=1.0 2023-09-29 12:50:25,718 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:50:29,135 INFO [train.py:1039] (2/4) Epoch 11, batch 2400, loss[loss=0.2098, simple_loss=0.2638, pruned_loss=0.07787, over 23812.00 frames. ], tot_loss[loss=0.2021, simple_loss=0.2728, pruned_loss=0.06572, over 4719897.79 frames. ], batch size: 212, lr: 9.46e-03, grad_scale: 32.0 2023-09-29 12:50:30,796 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:50:32,941 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:50:33,028 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 12:50:34,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 12:50:36,397 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=370140.0, ans=0.2 2023-09-29 12:50:39,259 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=370140.0, ans=0.2 2023-09-29 12:50:40,686 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 12:50:40,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:50:43,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 12:50:43,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:50:45,524 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:50:47,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 12:50:54,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:50:56,395 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 12:51:00,296 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.63 vs. limit=15.0 2023-09-29 12:51:02,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 12:51:07,984 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 12:51:12,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:51:13,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:51:17,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:51:18,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 12:51:18,939 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=370340.0, ans=0.0 2023-09-29 12:51:20,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 12:51:27,558 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:51:30,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:51:32,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:51:33,662 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:51:33,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 12:51:33,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:51:33,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:51:35,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:51:35,277 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 12:51:40,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:51:40,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 12:51:40,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 12:51:43,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 12:51:46,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:51:46,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:51:46,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 12:51:48,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 12:51:48,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 12:51:48,291 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 12:51:48,597 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=370406.6666666667, ans=0.2 2023-09-29 12:51:49,841 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 12:51:51,312 INFO [train.py:1039] (2/4) Epoch 11, batch 2450, loss[loss=0.1709, simple_loss=0.2451, pruned_loss=0.04831, over 24313.00 frames. ], tot_loss[loss=0.201, simple_loss=0.2714, pruned_loss=0.06528, over 4724562.39 frames. ], batch size: 56, lr: 9.46e-03, grad_scale: 16.0 2023-09-29 12:51:51,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:51:51,654 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:51:51,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:51:54,680 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 12:51:54,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:51:54,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 12:51:59,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 12:51:59,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:51:59,568 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=370473.3333333333, ans=0.1 2023-09-29 12:52:03,147 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:52:03,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:52:04,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 12:52:10,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:52:10,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:52:15,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:52:15,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:52:15,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:52:15,555 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=370540.0, ans=0.07 2023-09-29 12:52:16,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 12:52:19,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:52:21,289 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 12:52:21,575 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=370540.0, ans=0.05 2023-09-29 12:52:21,970 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten.whitening_limit, batch_count=370540.0, ans=15.0 2023-09-29 12:52:22,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:52:25,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 12:52:25,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:52:27,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:52:27,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:52:27,904 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=370606.6666666667, ans=0.0 2023-09-29 12:52:30,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 12:52:30,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:52:38,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:52:40,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:52:40,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:52:41,719 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:52:41,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:52:45,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:52:46,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 12:52:48,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:52:50,370 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:52:53,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:52:53,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:52:59,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 12:52:59,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 12:53:01,069 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:53:01,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:53:01,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 12:53:02,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:53:02,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:53:07,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:53:08,086 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=370740.0, ans=0.09899494936611666 2023-09-29 12:53:09,597 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=370740.0, ans=0.125 2023-09-29 12:53:11,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:53:11,498 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:53:12,850 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 2.006e+02 2.269e+02 2.537e+02 3.932e+02, threshold=4.538e+02, percent-clipped=0.0 2023-09-29 12:53:14,456 INFO [train.py:1039] (2/4) Epoch 11, batch 2500, loss[loss=0.1801, simple_loss=0.2566, pruned_loss=0.05177, over 24591.00 frames. ], tot_loss[loss=0.2001, simple_loss=0.2703, pruned_loss=0.06499, over 4725874.73 frames. ], batch size: 60, lr: 9.45e-03, grad_scale: 8.0 2023-09-29 12:53:16,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 12:53:16,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:53:22,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:53:32,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 12:53:32,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:53:35,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:53:35,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 12:53:40,462 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=370873.3333333333, ans=0.0 2023-09-29 12:53:45,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 12:53:45,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:53:47,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 12:53:47,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 12:53:48,835 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 12:53:49,214 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=370940.0, ans=0.5 2023-09-29 12:53:50,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:53:50,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:53:50,589 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 12:53:50,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:53:52,100 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 12:53:52,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:53:52,608 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=370940.0, ans=0.125 2023-09-29 12:53:56,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:53:56,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:54:01,014 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 12:54:01,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 12:54:01,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:54:04,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:54:07,333 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:54:10,582 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=371006.6666666667, ans=0.035 2023-09-29 12:54:11,920 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:54:15,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:54:20,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 12:54:23,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 12:54:23,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:54:25,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 12:54:26,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:54:26,925 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 12:54:28,456 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 12:54:28,457 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 12:54:28,466 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 12:54:31,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:54:31,880 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=371073.3333333333, ans=0.04949747468305833 2023-09-29 12:54:33,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 12:54:33,799 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 12:54:35,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:54:36,962 INFO [train.py:1039] (2/4) Epoch 11, batch 2550, loss[loss=0.2214, simple_loss=0.2776, pruned_loss=0.08259, over 23617.00 frames. ], tot_loss[loss=0.2002, simple_loss=0.2706, pruned_loss=0.06489, over 4728851.15 frames. ], batch size: 256, lr: 9.45e-03, grad_scale: 8.0 2023-09-29 12:54:37,093 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 12:54:40,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 12:54:42,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:54:45,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:54:45,186 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:54:48,810 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:54:48,918 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 12:54:50,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 12:54:54,322 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 12:54:55,869 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:54:57,479 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:55:00,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:55:00,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 12:55:00,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 12:55:00,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:55:00,822 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=371206.6666666667, ans=0.125 2023-09-29 12:55:02,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:55:05,185 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:55:05,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 12:55:05,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 12:55:06,602 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:55:06,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 12:55:19,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:55:24,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:55:24,659 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:55:24,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:55:26,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 12:55:28,070 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=371340.0, ans=0.04949747468305833 2023-09-29 12:55:31,644 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=371340.0, ans=0.09899494936611666 2023-09-29 12:55:34,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:55:37,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 12:55:37,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:55:37,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:55:37,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 12:55:37,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 12:55:41,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:55:41,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:55:48,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:55:48,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 12:55:48,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:55:49,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:55:49,779 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:55:51,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 12:55:51,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:55:57,878 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.868e+02 2.141e+02 2.517e+02 4.100e+02, threshold=4.283e+02, percent-clipped=0.0 2023-09-29 12:55:59,462 INFO [train.py:1039] (2/4) Epoch 11, batch 2600, loss[loss=0.2267, simple_loss=0.2888, pruned_loss=0.08232, over 23533.00 frames. ], tot_loss[loss=0.2018, simple_loss=0.2718, pruned_loss=0.06588, over 4712179.66 frames. ], batch size: 256, lr: 9.45e-03, grad_scale: 8.0 2023-09-29 12:55:59,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:56:01,300 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:56:04,873 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 12:56:06,512 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 12:56:06,552 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 12:56:08,043 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 12:56:08,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 12:56:08,201 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 12:56:11,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:56:11,265 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 12:56:12,778 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 12:56:16,362 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 12:56:18,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:56:18,709 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=371540.0, ans=0.125 2023-09-29 12:56:20,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 12:56:21,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 12:56:23,010 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 12:56:23,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 12:56:25,971 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 12:56:26,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 12:56:26,129 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 12:56:34,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:56:34,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:56:34,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:56:34,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 12:56:36,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:56:42,639 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 12:56:49,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:56:49,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:56:50,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 12:56:52,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:56:52,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:56:52,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 12:56:55,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 12:56:57,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:56:58,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:57:02,839 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=371673.3333333333, ans=0.125 2023-09-29 12:57:03,911 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 12:57:03,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:57:03,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:57:08,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:57:10,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:57:10,184 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 12:57:11,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:57:13,159 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:57:13,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:57:15,735 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=371740.0, ans=0.125 2023-09-29 12:57:19,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 12:57:20,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:57:21,942 INFO [train.py:1039] (2/4) Epoch 11, batch 2650, loss[loss=0.2254, simple_loss=0.289, pruned_loss=0.08088, over 23478.00 frames. ], tot_loss[loss=0.2031, simple_loss=0.2729, pruned_loss=0.06663, over 4698718.48 frames. ], batch size: 285, lr: 9.44e-03, grad_scale: 8.0 2023-09-29 12:57:23,587 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 12:57:28,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 12:57:28,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:57:29,418 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.44 vs. limit=22.5 2023-09-29 12:57:30,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 12:57:30,370 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 12:57:30,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:57:32,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:57:34,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 12:57:35,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:57:38,696 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:57:38,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 12:57:40,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 12:57:40,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:57:43,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 12:57:44,755 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 12:57:48,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:57:49,148 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=10.32 vs. limit=15.0 2023-09-29 12:57:49,807 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 12:57:49,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:57:49,927 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 12:57:55,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:57:55,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 12:57:55,237 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:57:56,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:58:00,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 12:58:00,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 12:58:03,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 12:58:05,849 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=371940.0, ans=0.125 2023-09-29 12:58:07,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 12:58:07,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:58:07,433 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=371940.0, ans=0.5 2023-09-29 12:58:08,678 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:58:10,462 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:58:10,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:58:10,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:58:13,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:58:13,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:58:15,149 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:58:15,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 12:58:16,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:58:20,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:58:20,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:58:21,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:58:23,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:58:23,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 12:58:27,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:58:27,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:58:27,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:58:29,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 12:58:36,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:58:38,186 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:58:38,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:58:39,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:58:39,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:58:41,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:58:41,744 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=372073.3333333333, ans=0.125 2023-09-29 12:58:43,261 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.976e+02 2.292e+02 2.609e+02 3.713e+02, threshold=4.584e+02, percent-clipped=0.0 2023-09-29 12:58:44,822 INFO [train.py:1039] (2/4) Epoch 11, batch 2700, loss[loss=0.3013, simple_loss=0.3459, pruned_loss=0.1283, over 19613.00 frames. ], tot_loss[loss=0.2045, simple_loss=0.2745, pruned_loss=0.06725, over 4704054.12 frames. ], batch size: 388, lr: 9.44e-03, grad_scale: 8.0 2023-09-29 12:58:44,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:58:44,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 12:58:46,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:58:48,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 12:58:49,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:58:49,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:58:49,868 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:58:50,099 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=372140.0, ans=0.125 2023-09-29 12:58:51,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:58:51,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:58:51,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:58:51,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 12:58:51,658 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer_na.min_abs, batch_count=372140.0, ans=0.02 2023-09-29 12:58:53,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 12:58:54,889 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:58:56,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 12:58:57,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 12:58:59,249 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:59:02,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 12:59:04,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 12:59:04,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:59:07,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:59:07,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:59:14,291 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:59:14,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:59:14,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:59:14,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 12:59:19,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:59:21,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:59:21,187 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 12:59:21,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:59:25,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:59:27,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:59:28,546 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.00 vs. limit=15.0 2023-09-29 12:59:37,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:59:38,911 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:59:41,904 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:59:41,919 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:59:45,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:59:45,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:59:47,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:59:50,791 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:59:52,259 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:59:52,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:59:55,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:59:55,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:59:56,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:59:59,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 13:00:01,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:00:03,676 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:00:03,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 13:00:05,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 13:00:06,670 INFO [train.py:1039] (2/4) Epoch 11, batch 2750, loss[loss=0.202, simple_loss=0.2708, pruned_loss=0.06659, over 23729.00 frames. ], tot_loss[loss=0.2041, simple_loss=0.274, pruned_loss=0.06705, over 4705912.58 frames. ], batch size: 149, lr: 9.43e-03, grad_scale: 8.0 2023-09-29 13:00:06,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:00:10,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:00:10,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:00:13,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:00:13,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 13:00:13,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:00:17,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:00:17,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 13:00:17,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:00:17,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:00:17,932 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 13:00:19,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:00:19,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:00:24,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 13:00:27,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:00:27,902 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:00:27,999 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:00:29,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 13:00:29,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:00:29,775 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=372540.0, ans=0.125 2023-09-29 13:00:31,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:00:32,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:00:32,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:00:37,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 13:00:37,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 13:00:37,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:00:39,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:00:41,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 13:00:46,537 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=372606.6666666667, ans=0.0 2023-09-29 13:00:47,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:00:49,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 13:00:50,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:00:57,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:00:57,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 13:00:57,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 13:01:02,366 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:01:03,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:01:03,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 13:01:08,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:01:09,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 13:01:15,156 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 13:01:18,650 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:01:19,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 13:01:20,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:01:23,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:01:23,092 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 13:01:24,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:01:25,972 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.991e+02 2.210e+02 2.554e+02 4.000e+02, threshold=4.420e+02, percent-clipped=0.0 2023-09-29 13:01:27,653 INFO [train.py:1039] (2/4) Epoch 11, batch 2800, loss[loss=0.2147, simple_loss=0.2746, pruned_loss=0.07743, over 23436.00 frames. ], tot_loss[loss=0.2024, simple_loss=0.272, pruned_loss=0.06644, over 4682291.92 frames. ], batch size: 134, lr: 9.43e-03, grad_scale: 16.0 2023-09-29 13:01:27,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 13:01:27,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:01:29,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:01:29,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 13:01:29,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:01:30,036 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:01:32,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:01:33,636 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 13:01:33,637 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 13:01:36,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:01:38,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 13:01:39,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:01:42,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:01:42,955 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=372873.3333333333, ans=0.125 2023-09-29 13:01:44,305 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 13:01:47,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 13:01:49,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 13:01:51,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:01:52,771 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:01:52,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:01:56,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:01:56,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:01:56,107 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 13:01:57,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:02:02,758 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=372940.0, ans=0.1 2023-09-29 13:02:04,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:02:07,953 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:02:10,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:02:12,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:02:13,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:02:18,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:02:18,560 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 13:02:19,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:02:20,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:02:20,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:02:25,907 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:02:25,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:02:30,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:02:32,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:02:33,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:02:33,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 13:02:33,516 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 13:02:34,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:02:35,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:02:35,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 13:02:36,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:02:38,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:02:38,055 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:02:38,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 13:02:40,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:02:40,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:02:40,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:02:41,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 13:02:42,154 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=373073.3333333333, ans=0.0 2023-09-29 13:02:45,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=373073.3333333333, ans=0.0 2023-09-29 13:02:49,515 INFO [train.py:1039] (2/4) Epoch 11, batch 2850, loss[loss=0.1953, simple_loss=0.2622, pruned_loss=0.0642, over 23163.00 frames. ], tot_loss[loss=0.2022, simple_loss=0.2713, pruned_loss=0.06653, over 4685040.37 frames. ], batch size: 105, lr: 9.43e-03, grad_scale: 16.0 2023-09-29 13:02:49,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:02:49,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 13:02:51,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:02:53,296 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:02:58,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:02:58,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:02:58,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:03:01,515 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:03:01,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:03:03,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 13:03:03,208 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 13:03:10,021 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 13:03:10,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:03:11,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 13:03:13,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:03:14,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 13:03:14,786 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 13:03:15,188 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=373206.6666666667, ans=0.125 2023-09-29 13:03:16,881 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:03:29,459 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.74 vs. limit=10.0 2023-09-29 13:03:30,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:03:32,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:03:32,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:03:33,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 13:03:33,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 13:03:35,281 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:03:36,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:03:41,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 13:03:42,188 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=373340.0, ans=0.0 2023-09-29 13:03:43,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 13:03:44,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:03:44,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:03:46,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:03:47,065 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.65 vs. limit=6.0 2023-09-29 13:03:48,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:03:48,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:03:50,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:03:53,647 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:03:55,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:03:55,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:03:55,497 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=373340.0, ans=0.0 2023-09-29 13:03:56,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:03:58,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:04:03,017 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=373406.6666666667, ans=0.0 2023-09-29 13:04:04,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:04:06,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 13:04:06,557 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 13:04:08,266 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=373406.6666666667, ans=0.1 2023-09-29 13:04:10,159 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 13:04:10,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:04:10,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 13:04:10,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:04:11,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:04:13,359 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:04:13,409 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:04:13,410 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 13:04:13,475 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 13:04:14,714 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.974e+02 2.166e+02 2.699e+02 4.540e+02, threshold=4.331e+02, percent-clipped=1.0 2023-09-29 13:04:14,805 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:04:14,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:04:16,395 INFO [train.py:1039] (2/4) Epoch 11, batch 2900, loss[loss=0.2119, simple_loss=0.2894, pruned_loss=0.06724, over 24039.00 frames. ], tot_loss[loss=0.2014, simple_loss=0.271, pruned_loss=0.06586, over 4695790.40 frames. ], batch size: 80, lr: 9.42e-03, grad_scale: 16.0 2023-09-29 13:04:18,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 13:04:18,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:04:18,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:04:18,689 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=373473.3333333333, ans=0.125 2023-09-29 13:04:19,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 13:04:25,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:04:26,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 13:04:26,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 13:04:26,849 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=373473.3333333333, ans=0.1 2023-09-29 13:04:28,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:04:28,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 13:04:30,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:04:31,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:04:35,799 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.30 vs. limit=15.0 2023-09-29 13:04:36,235 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:04:36,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:04:38,715 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten.whitening_limit, batch_count=373540.0, ans=15.0 2023-09-29 13:04:40,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 13:04:40,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 13:04:40,532 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=373540.0, ans=0.5 2023-09-29 13:04:41,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 13:04:43,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:04:45,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 13:04:45,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 13:04:48,382 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:04:48,386 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 13:04:48,413 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:04:49,938 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:04:49,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 13:04:52,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:04:53,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:04:54,768 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=373606.6666666667, ans=0.0 2023-09-29 13:04:58,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:05:01,809 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:05:04,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 13:05:04,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 13:05:04,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:05:07,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:05:10,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 13:05:13,006 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:05:19,531 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:05:26,146 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=373740.0, ans=0.1 2023-09-29 13:05:27,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:05:28,876 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 13:05:30,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 13:05:34,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:05:34,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 13:05:34,228 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:05:35,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 13:05:39,101 INFO [train.py:1039] (2/4) Epoch 11, batch 2950, loss[loss=0.1728, simple_loss=0.25, pruned_loss=0.04778, over 24625.00 frames. ], tot_loss[loss=0.2016, simple_loss=0.2715, pruned_loss=0.06589, over 4694416.49 frames. ], batch size: 60, lr: 9.42e-03, grad_scale: 16.0 2023-09-29 13:05:39,588 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=373806.6666666667, ans=0.2 2023-09-29 13:05:43,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:05:45,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 13:05:45,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:05:45,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:05:46,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:05:48,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:05:48,510 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 13:05:50,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 13:05:50,731 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=373806.6666666667, ans=0.015 2023-09-29 13:05:52,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 13:05:52,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:06:00,092 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:06:01,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:06:04,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:06:04,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:06:06,638 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=373873.3333333333, ans=0.0 2023-09-29 13:06:08,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:06:08,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:06:11,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:06:11,547 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 13:06:12,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:06:12,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:06:14,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 13:06:20,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 13:06:20,916 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 13:06:22,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:06:24,007 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 13:06:24,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 13:06:25,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:06:26,437 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.53 vs. limit=15.0 2023-09-29 13:06:27,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:06:27,045 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 13:06:27,052 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 13:06:29,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 13:06:30,302 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=374006.6666666667, ans=0.125 2023-09-29 13:06:31,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:06:32,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:06:34,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:06:34,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:06:36,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:06:36,089 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 13:06:36,149 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:06:37,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 13:06:45,539 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:06:45,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:06:47,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 13:06:47,136 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:06:48,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 13:06:50,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:06:52,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:06:53,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 13:06:55,457 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:06:55,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 13:06:58,228 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 2.184e+02 2.499e+02 3.162e+02 5.312e+02, threshold=4.998e+02, percent-clipped=4.0 2023-09-29 13:06:58,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:06:59,740 INFO [train.py:1039] (2/4) Epoch 11, batch 3000, loss[loss=0.2052, simple_loss=0.2859, pruned_loss=0.06223, over 24331.00 frames. ], tot_loss[loss=0.2029, simple_loss=0.2731, pruned_loss=0.06636, over 4703758.48 frames. ], batch size: 74, lr: 9.41e-03, grad_scale: 16.0 2023-09-29 13:06:59,740 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-29 13:07:13,669 INFO [train.py:1071] (2/4) Epoch 11, validation: loss=0.3146, simple_loss=0.2865, pruned_loss=0.1713, over 1125622.00 frames. 2023-09-29 13:07:13,670 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21046MB 2023-09-29 13:07:13,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:07:13,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 13:07:13,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:07:15,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:07:16,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:07:18,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:07:18,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 13:07:20,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:07:23,110 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:07:23,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 13:07:28,271 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 13:07:28,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 13:07:29,972 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:07:31,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:07:31,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 13:07:31,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:07:36,944 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 13:07:46,715 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:07:52,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 13:07:52,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:07:54,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:07:55,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:07:55,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:07:57,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:07:57,616 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 13:07:59,255 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=374273.3333333333, ans=0.125 2023-09-29 13:08:00,425 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 13:08:00,760 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=374273.3333333333, ans=10.0 2023-09-29 13:08:02,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:08:02,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 13:08:04,247 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:08:04,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:08:07,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:08:07,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:08:10,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 13:08:10,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:08:10,261 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:08:13,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:08:17,019 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 13:08:18,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:08:18,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:08:18,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:08:23,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:08:23,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:08:26,602 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 13:08:26,668 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 13:08:28,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:08:28,493 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 13:08:28,570 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:08:30,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 13:08:34,551 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 13:08:35,956 INFO [train.py:1039] (2/4) Epoch 11, batch 3050, loss[loss=0.187, simple_loss=0.257, pruned_loss=0.05853, over 20292.00 frames. ], tot_loss[loss=0.2038, simple_loss=0.2737, pruned_loss=0.06699, over 4697239.69 frames. ], batch size: 44, lr: 9.41e-03, grad_scale: 16.0 2023-09-29 13:08:36,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 13:08:36,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 13:08:36,453 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=374473.3333333333, ans=0.0 2023-09-29 13:08:37,542 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 13:08:37,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 13:08:39,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:08:39,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:08:39,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 13:08:39,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:08:39,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:08:44,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 13:08:45,857 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:08:47,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:08:48,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 13:08:52,496 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:08:52,685 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=374540.0, ans=0.125 2023-09-29 13:08:56,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 13:08:57,310 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=374540.0, ans=0.2 2023-09-29 13:09:02,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 13:09:02,245 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 13:09:02,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:09:05,330 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=8.68 vs. limit=15.0 2023-09-29 13:09:06,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:09:09,252 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:09:09,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:09:09,629 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=374606.6666666667, ans=0.125 2023-09-29 13:09:10,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:09:12,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:09:12,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 13:09:13,067 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.54 vs. limit=15.0 2023-09-29 13:09:14,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:09:14,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:09:14,404 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:09:14,632 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=374606.6666666667, ans=0.04949747468305833 2023-09-29 13:09:15,848 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:09:18,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:09:22,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:09:22,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 13:09:23,183 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=374606.6666666667, ans=15.0 2023-09-29 13:09:24,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:09:24,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:09:27,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:09:29,082 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 13:09:29,174 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:09:29,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:09:34,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:09:34,497 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:09:42,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:09:42,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:09:42,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:09:45,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:09:45,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 13:09:47,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:09:47,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 13:09:49,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:09:49,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:09:49,540 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=374740.0, ans=0.125 2023-09-29 13:09:50,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 13:09:52,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:09:52,861 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.60 vs. limit=15.0 2023-09-29 13:09:57,386 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 2.035e+02 2.257e+02 2.557e+02 3.814e+02, threshold=4.513e+02, percent-clipped=0.0 2023-09-29 13:09:58,902 INFO [train.py:1039] (2/4) Epoch 11, batch 3100, loss[loss=0.1715, simple_loss=0.2448, pruned_loss=0.04905, over 24309.00 frames. ], tot_loss[loss=0.2029, simple_loss=0.2726, pruned_loss=0.06663, over 4708910.85 frames. ], batch size: 61, lr: 9.41e-03, grad_scale: 16.0 2023-09-29 13:09:59,063 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:10:00,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:10:04,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 13:10:04,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 13:10:07,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 13:10:07,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 13:10:07,869 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=374806.6666666667, ans=0.125 2023-09-29 13:10:10,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:10:13,973 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:10:14,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:10:15,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 13:10:16,527 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.69 vs. limit=6.0 2023-09-29 13:10:20,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:10:26,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 13:10:30,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 13:10:31,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:10:31,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:10:31,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:10:33,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 13:10:37,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:10:37,044 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 13:10:37,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:10:37,268 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=374940.0, ans=0.125 2023-09-29 13:10:38,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:10:41,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 13:10:43,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:10:46,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 13:10:46,458 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=375006.6666666667, ans=0.0 2023-09-29 13:10:48,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 13:10:48,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 13:10:49,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:10:51,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:10:54,220 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:10:54,248 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:10:54,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:10:55,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 13:10:55,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:10:57,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:10:57,445 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:10:57,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:10:57,457 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 13:10:57,673 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=375006.6666666667, ans=0.1 2023-09-29 13:11:00,349 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.54 vs. limit=12.0 2023-09-29 13:11:01,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:11:03,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 13:11:06,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:11:06,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 13:11:07,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:11:07,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:11:07,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 13:11:18,398 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 13:11:19,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 13:11:21,152 INFO [train.py:1039] (2/4) Epoch 11, batch 3150, loss[loss=0.1947, simple_loss=0.2584, pruned_loss=0.06546, over 21311.00 frames. ], tot_loss[loss=0.2009, simple_loss=0.2707, pruned_loss=0.06553, over 4719385.58 frames. ], batch size: 47, lr: 9.40e-03, grad_scale: 16.0 2023-09-29 13:11:22,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:11:22,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:11:25,684 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:11:25,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:11:27,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 13:11:28,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:11:28,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 13:11:30,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 13:11:31,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:11:33,952 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 13:11:37,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 13:11:37,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:11:40,259 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 13:11:40,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 13:11:41,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 13:11:42,201 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=375206.6666666667, ans=0.0 2023-09-29 13:11:43,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 13:11:43,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 13:11:43,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:11:43,363 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:11:44,920 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:11:47,989 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 13:11:49,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:11:50,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:11:52,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:11:53,673 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 13:11:56,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 13:11:56,985 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:11:57,226 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=375273.3333333333, ans=0.125 2023-09-29 13:11:58,570 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 13:12:00,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:12:00,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 13:12:01,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 13:12:03,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:12:03,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 13:12:03,422 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 13:12:04,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:12:04,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 13:12:07,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 13:12:08,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 13:12:08,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 13:12:10,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 13:12:10,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:12:13,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:12:13,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:12:13,684 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 13:12:15,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:12:16,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 13:12:16,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:12:18,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 13:12:19,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 13:12:21,279 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:12:21,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:12:21,676 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=375340.0, ans=0.125 2023-09-29 13:12:23,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 13:12:24,945 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 13:12:26,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:12:29,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:12:31,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:12:31,113 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:12:33,436 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.26 vs. limit=6.0 2023-09-29 13:12:37,058 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:12:37,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:12:39,896 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.483e+02 1.896e+02 2.240e+02 2.702e+02 3.896e+02, threshold=4.479e+02, percent-clipped=0.0 2023-09-29 13:12:40,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 13:12:40,847 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=7.79 vs. limit=15.0 2023-09-29 13:12:41,470 INFO [train.py:1039] (2/4) Epoch 11, batch 3200, loss[loss=0.2094, simple_loss=0.272, pruned_loss=0.07339, over 23566.00 frames. ], tot_loss[loss=0.2, simple_loss=0.2703, pruned_loss=0.06484, over 4716503.27 frames. ], batch size: 149, lr: 9.40e-03, grad_scale: 32.0 2023-09-29 13:12:45,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:12:45,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 13:12:50,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:12:50,537 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:12:50,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 13:12:53,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:12:55,300 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=375473.3333333333, ans=0.0 2023-09-29 13:13:00,631 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 13:13:00,934 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=375540.0, ans=0.1 2023-09-29 13:13:03,873 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:13:11,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:13:21,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 13:13:23,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:13:25,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 13:13:26,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 13:13:29,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:13:30,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 13:13:30,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:13:33,184 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=375673.3333333333, ans=0.1 2023-09-29 13:13:34,346 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 13:13:35,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 13:13:38,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 13:13:41,980 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 13:13:43,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:13:48,997 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:13:49,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 13:13:50,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:13:50,469 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 13:13:50,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 13:13:56,082 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.15 vs. limit=15.0 2023-09-29 13:13:57,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:13:57,167 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 13:13:57,918 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.37 vs. limit=15.0 2023-09-29 13:13:58,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 13:14:00,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 13:14:01,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 13:14:04,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:14:06,015 INFO [train.py:1039] (2/4) Epoch 11, batch 3250, loss[loss=0.213, simple_loss=0.2875, pruned_loss=0.0692, over 23994.00 frames. ], tot_loss[loss=0.2002, simple_loss=0.2704, pruned_loss=0.065, over 4713712.90 frames. ], batch size: 80, lr: 9.39e-03, grad_scale: 32.0 2023-09-29 13:14:06,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 13:14:07,583 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 13:14:07,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:14:07,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:14:10,573 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 13:14:12,476 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=375806.6666666667, ans=0.0 2023-09-29 13:14:15,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:14:15,370 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=375806.6666666667, ans=0.125 2023-09-29 13:14:15,472 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=375806.6666666667, ans=0.125 2023-09-29 13:14:18,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:14:18,881 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=375806.6666666667, ans=0.125 2023-09-29 13:14:26,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:14:26,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 13:14:26,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:14:26,620 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:14:26,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:14:28,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:14:28,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:14:31,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:14:31,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 13:14:32,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:14:32,287 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=375873.3333333333, ans=0.0 2023-09-29 13:14:33,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:14:33,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:14:33,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:14:37,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:14:37,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:14:39,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:14:40,473 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:14:42,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:14:42,642 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:14:42,658 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:14:47,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 13:14:47,533 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=375940.0, ans=0.125 2023-09-29 13:14:48,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:14:48,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:14:50,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:14:50,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 13:14:57,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:14:57,609 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=376006.6666666667, ans=0.125 2023-09-29 13:15:03,232 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.23 vs. limit=15.0 2023-09-29 13:15:05,356 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:15:06,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:15:06,727 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 13:15:06,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:15:06,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 13:15:06,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:15:10,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 13:15:10,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 13:15:12,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:15:12,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:15:13,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:15:13,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 13:15:15,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:15:18,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:15:18,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:15:21,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 13:15:21,832 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:15:24,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:15:24,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 13:15:26,168 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.864e+02 2.159e+02 2.577e+02 4.318e+02, threshold=4.318e+02, percent-clipped=0.0 2023-09-29 13:15:27,695 INFO [train.py:1039] (2/4) Epoch 11, batch 3300, loss[loss=0.1968, simple_loss=0.2742, pruned_loss=0.05972, over 24640.00 frames. ], tot_loss[loss=0.2006, simple_loss=0.2709, pruned_loss=0.06514, over 4724507.36 frames. ], batch size: 65, lr: 9.39e-03, grad_scale: 32.0 2023-09-29 13:15:27,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:15:27,934 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 13:15:29,507 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 13:15:30,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 13:15:30,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:15:34,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:15:36,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:15:36,273 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:15:38,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 13:15:38,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 13:15:42,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:15:42,612 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=376140.0, ans=0.0 2023-09-29 13:15:42,715 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=376140.0, ans=0.125 2023-09-29 13:15:43,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:15:46,380 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.16 vs. limit=15.0 2023-09-29 13:15:48,555 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 13:15:48,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:15:48,700 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:15:50,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:15:51,685 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 13:15:53,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:15:53,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 13:15:55,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:15:55,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:15:55,400 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 13:15:58,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:15:58,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 13:16:01,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:16:01,486 WARNING [train.py:1197] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 13:16:03,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 13:16:03,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:16:05,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:16:06,834 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=376273.3333333333, ans=0.125 2023-09-29 13:16:08,183 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 13:16:09,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 13:16:09,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:16:12,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 13:16:14,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 13:16:19,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 13:16:19,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:16:21,677 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.18 vs. limit=15.0 2023-09-29 13:16:22,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:16:22,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:16:22,480 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:16:22,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 13:16:24,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:16:24,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:16:25,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:16:26,007 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=376340.0, ans=0.125 2023-09-29 13:16:27,900 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 13:16:28,046 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=376340.0, ans=0.1 2023-09-29 13:16:29,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 13:16:30,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 13:16:30,949 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:16:30,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:16:33,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:16:33,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:16:36,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 13:16:37,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:16:37,524 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 13:16:38,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:16:40,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 13:16:42,444 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=376406.6666666667, ans=0.1 2023-09-29 13:16:43,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 13:16:43,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:16:43,740 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=376406.6666666667, ans=0.025 2023-09-29 13:16:46,453 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:16:46,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:16:48,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 13:16:50,007 INFO [train.py:1039] (2/4) Epoch 11, batch 3350, loss[loss=0.2064, simple_loss=0.2766, pruned_loss=0.06811, over 23359.00 frames. ], tot_loss[loss=0.202, simple_loss=0.2723, pruned_loss=0.06589, over 4726504.14 frames. ], batch size: 93, lr: 9.38e-03, grad_scale: 32.0 2023-09-29 13:16:50,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:16:52,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:16:52,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:16:52,944 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=376473.3333333333, ans=0.2 2023-09-29 13:16:54,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:16:55,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:16:55,855 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=376473.3333333333, ans=0.125 2023-09-29 13:16:58,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:17:02,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:17:05,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:17:05,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:17:06,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:17:08,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 13:17:08,301 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 13:17:08,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:17:12,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 13:17:12,329 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=376540.0, ans=0.0 2023-09-29 13:17:13,350 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 13:17:14,753 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:17:14,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:17:16,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:17:16,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 13:17:16,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:17:17,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:17:19,484 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:17:22,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:17:22,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:17:24,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:17:28,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:17:31,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:17:32,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:17:37,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:17:37,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:17:40,395 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:17:40,409 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:17:41,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:17:44,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 13:17:44,190 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 13:17:44,233 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 13:17:45,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:17:47,286 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 13:17:48,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:17:50,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:17:57,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:17:57,128 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 13:17:59,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 13:18:01,191 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 13:18:01,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:18:01,963 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.44 vs. limit=15.0 2023-09-29 13:18:05,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:18:08,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 13:18:08,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 13:18:09,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:18:10,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:18:10,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 13:18:11,970 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.492e+02 1.933e+02 2.082e+02 2.375e+02 4.063e+02, threshold=4.164e+02, percent-clipped=0.0 2023-09-29 13:18:12,014 INFO [train.py:1039] (2/4) Epoch 11, batch 3400, loss[loss=0.2045, simple_loss=0.2663, pruned_loss=0.07136, over 23653.00 frames. ], tot_loss[loss=0.2033, simple_loss=0.2732, pruned_loss=0.06664, over 4708058.66 frames. ], batch size: 232, lr: 9.38e-03, grad_scale: 16.0 2023-09-29 13:18:12,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:18:12,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 13:18:13,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:18:15,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:18:15,243 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 13:18:17,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:18:17,241 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 13:18:22,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 13:18:22,649 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 13:18:22,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:18:27,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:18:27,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 13:18:27,374 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:18:28,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 13:18:35,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:18:37,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 13:18:40,537 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:18:42,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:18:42,198 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:18:43,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 13:18:50,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:18:50,608 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=376940.0, ans=0.125 2023-09-29 13:18:50,707 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 13:18:55,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 13:19:03,348 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:19:05,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:19:05,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 13:19:05,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:19:07,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:19:07,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:19:07,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:19:12,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:19:15,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 13:19:15,815 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:19:21,891 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:19:24,923 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 13:19:30,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 13:19:35,254 INFO [train.py:1039] (2/4) Epoch 11, batch 3450, loss[loss=0.207, simple_loss=0.2838, pruned_loss=0.06511, over 23989.00 frames. ], tot_loss[loss=0.2031, simple_loss=0.2729, pruned_loss=0.06663, over 4683781.36 frames. ], batch size: 80, lr: 9.38e-03, grad_scale: 16.0 2023-09-29 13:19:35,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 13:19:38,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 13:19:40,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:19:41,641 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:19:41,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 13:19:43,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:19:49,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 13:19:53,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:19:54,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:19:55,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:19:55,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:19:57,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:20:06,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 13:20:10,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 13:20:10,847 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 13:20:12,185 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:20:13,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:20:19,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 13:20:21,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 13:20:26,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:20:27,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:20:29,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 13:20:29,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:20:30,197 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.02 vs. limit=12.0 2023-09-29 13:20:31,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 13:20:31,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:20:32,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:20:36,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:20:37,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 13:20:41,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:20:44,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:20:47,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:20:51,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:20:56,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:20:56,750 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:20:58,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:20:58,205 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:20:58,718 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=377473.3333333333, ans=0.125 2023-09-29 13:20:59,828 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 2.022e+02 2.344e+02 2.789e+02 4.683e+02, threshold=4.688e+02, percent-clipped=2.0 2023-09-29 13:20:59,870 INFO [train.py:1039] (2/4) Epoch 11, batch 3500, loss[loss=0.2003, simple_loss=0.2676, pruned_loss=0.06651, over 23662.00 frames. ], tot_loss[loss=0.2019, simple_loss=0.2714, pruned_loss=0.06621, over 4683172.09 frames. ], batch size: 149, lr: 9.37e-03, grad_scale: 16.0 2023-09-29 13:21:00,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:21:04,671 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:21:04,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 13:21:08,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 13:21:11,295 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 13:21:12,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:21:12,923 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 13:21:20,707 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:21:20,857 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:21:22,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:21:22,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:21:23,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 13:21:23,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:21:23,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:21:24,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 13:21:27,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:21:27,826 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 13:21:29,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:21:33,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:21:35,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 13:21:35,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:21:35,639 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=377606.6666666667, ans=0.0 2023-09-29 13:21:38,400 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:21:40,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:21:42,121 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:21:45,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:21:45,109 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:21:46,689 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 13:21:46,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 13:21:48,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 13:21:48,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:21:48,904 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.60 vs. limit=10.0 2023-09-29 13:21:50,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:21:50,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:21:51,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 13:21:55,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 13:21:55,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:22:02,041 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:22:03,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 13:22:03,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 13:22:03,649 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:22:05,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:22:06,816 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:22:08,328 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:22:11,837 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 13:22:11,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:22:14,873 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:22:16,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 13:22:17,953 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 13:22:18,351 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=377740.0, ans=0.2 2023-09-29 13:22:19,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:22:19,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:22:21,005 INFO [train.py:1039] (2/4) Epoch 11, batch 3550, loss[loss=0.1826, simple_loss=0.2694, pruned_loss=0.04786, over 24417.00 frames. ], tot_loss[loss=0.2004, simple_loss=0.27, pruned_loss=0.06544, over 4693160.30 frames. ], batch size: 69, lr: 9.37e-03, grad_scale: 16.0 2023-09-29 13:22:21,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:22:21,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:22:23,603 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=14.47 vs. limit=15.0 2023-09-29 13:22:24,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:22:35,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:22:36,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 13:22:39,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:22:41,055 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:22:41,380 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=377873.3333333333, ans=0.125 2023-09-29 13:22:42,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:22:44,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:22:44,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 13:22:47,681 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:22:47,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:22:47,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:22:47,839 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 13:22:49,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 13:22:55,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:22:55,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:22:58,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:22:58,270 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:23:00,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:23:00,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 13:23:00,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:23:01,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:23:03,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 13:23:10,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:23:12,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:23:13,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:23:13,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 13:23:15,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:23:18,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 13:23:19,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:23:21,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:23:21,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:23:25,376 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 13:23:27,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:23:30,438 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=378073.3333333333, ans=0.0 2023-09-29 13:23:31,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:23:33,381 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 13:23:33,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:23:38,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:23:39,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 13:23:43,831 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.448e+02 1.951e+02 2.213e+02 2.629e+02 3.694e+02, threshold=4.426e+02, percent-clipped=0.0 2023-09-29 13:23:43,874 INFO [train.py:1039] (2/4) Epoch 11, batch 3600, loss[loss=0.2074, simple_loss=0.2895, pruned_loss=0.0626, over 24631.00 frames. ], tot_loss[loss=0.1995, simple_loss=0.2692, pruned_loss=0.06489, over 4678086.55 frames. ], batch size: 68, lr: 9.36e-03, grad_scale: 32.0 2023-09-29 13:23:45,584 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 13:23:45,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:23:46,704 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.04 vs. limit=10.0 2023-09-29 13:23:47,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:23:49,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:23:49,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:23:49,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=378140.0, ans=0.0 2023-09-29 13:23:50,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:23:53,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:23:55,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:23:57,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:23:57,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:23:59,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:23:59,335 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 13:24:03,777 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 13:24:05,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:24:08,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:24:09,404 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.62 vs. limit=15.0 2023-09-29 13:24:10,853 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=378206.6666666667, ans=0.125 2023-09-29 13:24:10,892 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=378206.6666666667, ans=0.0 2023-09-29 13:24:12,027 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:24:13,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:24:13,672 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:24:13,702 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 13:24:15,221 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:24:18,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:24:21,053 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 13:24:22,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:24:24,253 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:24:25,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:24:25,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 13:24:27,650 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=378273.3333333333, ans=0.125 2023-09-29 13:24:35,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:24:36,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 13:24:37,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 13:24:41,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:24:45,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:24:47,889 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.00 vs. limit=22.5 2023-09-29 13:24:48,729 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:24:56,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 13:24:56,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:24:56,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 13:24:57,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 13:24:59,149 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 13:25:02,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:25:02,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:25:03,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 13:25:03,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:25:03,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:25:03,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:25:04,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 13:25:06,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 13:25:07,545 INFO [train.py:1039] (2/4) Epoch 11, batch 3650, loss[loss=0.1857, simple_loss=0.2606, pruned_loss=0.05537, over 24615.00 frames. ], tot_loss[loss=0.2004, simple_loss=0.27, pruned_loss=0.06539, over 4687999.81 frames. ], batch size: 60, lr: 9.36e-03, grad_scale: 32.0 2023-09-29 13:25:07,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:25:08,001 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 13:25:14,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 13:25:16,155 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:25:20,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 13:25:23,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 13:25:24,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=378540.0, ans=0.125 2023-09-29 13:25:27,732 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:25:27,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 13:25:27,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:25:31,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 13:25:32,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:25:32,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 13:25:32,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 13:25:33,181 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=378540.0, ans=0.2 2023-09-29 13:25:34,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:25:34,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 13:25:36,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 13:25:37,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:25:37,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:25:37,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:25:39,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 13:25:41,548 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 13:25:43,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:25:44,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 13:25:46,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:25:46,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:25:54,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:25:56,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:25:56,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 13:25:57,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:25:59,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:26:02,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:26:04,781 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:26:06,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:26:06,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:26:06,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 13:26:08,347 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:26:09,851 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:26:15,142 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 13:26:18,320 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=378740.0, ans=0.0 2023-09-29 13:26:19,476 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:26:19,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:26:20,940 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 13:26:21,021 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:26:21,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:26:23,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:26:24,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 13:26:24,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:26:26,514 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 13:26:29,419 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:26:30,852 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 2.035e+02 2.257e+02 2.600e+02 3.794e+02, threshold=4.515e+02, percent-clipped=0.0 2023-09-29 13:26:30,897 INFO [train.py:1039] (2/4) Epoch 11, batch 3700, loss[loss=0.1908, simple_loss=0.2637, pruned_loss=0.05899, over 24673.00 frames. ], tot_loss[loss=0.2024, simple_loss=0.2719, pruned_loss=0.06642, over 4693855.52 frames. ], batch size: 65, lr: 9.36e-03, grad_scale: 32.0 2023-09-29 13:26:31,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:26:34,678 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:26:34,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 13:26:34,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:26:36,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 13:26:36,842 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 13:26:39,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 13:26:41,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:26:42,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:26:42,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:26:44,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:26:44,503 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 13:26:46,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:26:48,272 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 13:26:51,627 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=378873.3333333333, ans=0.125 2023-09-29 13:26:53,245 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.max_abs, batch_count=378873.3333333333, ans=10.0 2023-09-29 13:26:57,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:26:57,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 13:26:57,761 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=378873.3333333333, ans=0.1 2023-09-29 13:26:59,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 13:26:59,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 13:26:59,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:27:01,591 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=378873.3333333333, ans=0.0 2023-09-29 13:27:04,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:27:04,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 13:27:08,010 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:27:10,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:27:13,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:27:13,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 13:27:14,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 13:27:19,502 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:27:20,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 13:27:20,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:27:20,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 13:27:23,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:27:25,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:27:25,417 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=379006.6666666667, ans=0.2 2023-09-29 13:27:28,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:27:28,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 13:27:28,458 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=379006.6666666667, ans=0.0 2023-09-29 13:27:31,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:27:31,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 13:27:31,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:27:31,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:27:35,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:27:36,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 13:27:38,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 13:27:38,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:27:38,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:27:40,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:27:41,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:27:45,708 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=379073.3333333333, ans=0.125 2023-09-29 13:27:46,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:27:49,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:27:52,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:27:53,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 13:27:54,005 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 13:27:55,182 INFO [train.py:1039] (2/4) Epoch 11, batch 3750, loss[loss=0.1841, simple_loss=0.2562, pruned_loss=0.05603, over 20866.00 frames. ], tot_loss[loss=0.2032, simple_loss=0.2728, pruned_loss=0.06681, over 4693748.12 frames. ], batch size: 45, lr: 9.35e-03, grad_scale: 32.0 2023-09-29 13:27:55,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 13:27:56,272 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.57 vs. limit=15.0 2023-09-29 13:27:57,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 13:27:58,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 13:27:58,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:28:00,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:28:01,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:28:03,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:28:04,886 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=379140.0, ans=0.0 2023-09-29 13:28:06,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:28:09,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:28:11,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 13:28:13,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:28:16,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:28:18,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 13:28:20,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:28:20,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:28:21,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:28:23,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 13:28:28,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 13:28:30,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:28:31,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:28:33,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:28:39,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:28:42,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 13:28:44,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 13:28:48,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:28:53,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:28:53,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:28:58,336 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:29:02,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 13:29:03,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 13:29:05,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 13:29:07,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:29:08,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 13:29:16,635 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:29:18,120 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 2.004e+02 2.192e+02 2.441e+02 3.152e+02, threshold=4.385e+02, percent-clipped=0.0 2023-09-29 13:29:18,165 INFO [train.py:1039] (2/4) Epoch 11, batch 3800, loss[loss=0.1966, simple_loss=0.2676, pruned_loss=0.06273, over 24337.00 frames. ], tot_loss[loss=0.2035, simple_loss=0.2731, pruned_loss=0.06691, over 4692730.80 frames. ], batch size: 56, lr: 9.35e-03, grad_scale: 32.0 2023-09-29 13:29:19,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:29:19,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 13:29:21,329 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 13:29:23,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:29:23,650 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:29:25,614 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 13:29:27,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 13:29:27,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:29:29,349 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:29:30,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:29:32,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:29:32,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:29:34,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 13:29:39,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 13:29:39,108 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:29:40,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:29:45,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:29:45,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 13:29:47,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 13:29:47,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:29:49,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:29:51,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:29:56,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 13:29:56,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 13:29:58,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:30:04,370 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=379606.6666666667, ans=0.125 2023-09-29 13:30:05,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:30:10,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:30:12,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 13:30:14,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 13:30:15,452 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:30:17,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:30:17,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:30:18,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 13:30:23,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 13:30:23,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 13:30:24,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:30:25,396 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.43 vs. limit=15.0 2023-09-29 13:30:26,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:30:31,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:30:33,845 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 13:30:37,010 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=379740.0, ans=0.125 2023-09-29 13:30:37,457 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.23 vs. limit=15.0 2023-09-29 13:30:40,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:30:40,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 13:30:42,627 INFO [train.py:1039] (2/4) Epoch 11, batch 3850, loss[loss=0.1917, simple_loss=0.2665, pruned_loss=0.05845, over 24308.00 frames. ], tot_loss[loss=0.2024, simple_loss=0.2717, pruned_loss=0.06659, over 4708999.32 frames. ], batch size: 61, lr: 9.34e-03, grad_scale: 32.0 2023-09-29 13:30:42,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 13:30:42,875 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:30:46,051 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=379806.6666666667, ans=0.125 2023-09-29 13:30:48,667 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 13:30:51,666 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:30:53,579 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=379806.6666666667, ans=0.0 2023-09-29 13:30:54,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 13:30:56,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 13:30:57,521 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2.whitening_limit, batch_count=379873.3333333333, ans=15.0 2023-09-29 13:30:59,971 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=379873.3333333333, ans=0.09899494936611666 2023-09-29 13:31:01,367 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:31:05,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:31:06,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:31:06,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 13:31:10,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:31:12,082 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:31:12,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:31:12,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:31:13,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:31:16,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:31:18,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:31:18,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:31:18,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 13:31:18,326 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 13:31:19,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:31:19,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:31:21,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:31:21,689 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.min_positive, batch_count=379940.0, ans=0.025 2023-09-29 13:31:23,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:31:23,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 13:31:25,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 13:31:28,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:31:30,835 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 13:31:32,630 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=380006.6666666667, ans=0.125 2023-09-29 13:31:33,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 13:31:40,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:31:40,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:31:45,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:31:45,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 13:31:49,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 13:31:50,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:31:50,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:31:55,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 13:31:55,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:31:55,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:31:56,841 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:31:56,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:31:56,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 13:31:58,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:32:01,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 13:32:01,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:32:01,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:32:04,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:32:05,646 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.657e+02 2.057e+02 2.295e+02 2.790e+02 3.822e+02, threshold=4.589e+02, percent-clipped=0.0 2023-09-29 13:32:05,690 INFO [train.py:1039] (2/4) Epoch 11, batch 3900, loss[loss=0.2075, simple_loss=0.2664, pruned_loss=0.0743, over 22669.00 frames. ], tot_loss[loss=0.2013, simple_loss=0.2704, pruned_loss=0.06613, over 4704890.72 frames. ], batch size: 322, lr: 9.34e-03, grad_scale: 32.0 2023-09-29 13:32:05,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:32:07,445 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:32:07,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:32:07,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:32:09,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:32:10,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 13:32:10,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:32:15,220 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:32:16,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 13:32:16,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:32:16,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:32:20,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 13:32:20,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:32:23,379 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 13:32:23,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 13:32:25,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:32:27,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 13:32:27,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:32:29,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 13:32:29,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 13:32:35,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:32:37,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:32:37,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:32:37,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:32:40,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:32:41,195 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=380273.3333333333, ans=0.125 2023-09-29 13:32:42,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:32:44,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:32:44,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:32:45,615 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:32:51,547 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.97 vs. limit=15.0 2023-09-29 13:32:52,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:32:52,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:32:58,619 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=380340.0, ans=0.0 2023-09-29 13:33:02,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 13:33:04,070 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:33:13,915 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:33:17,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:33:17,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 13:33:17,304 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 13:33:17,325 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:33:20,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 13:33:21,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:33:23,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 13:33:28,104 INFO [train.py:1039] (2/4) Epoch 11, batch 3950, loss[loss=0.2051, simple_loss=0.2774, pruned_loss=0.06636, over 23346.00 frames. ], tot_loss[loss=0.2014, simple_loss=0.2706, pruned_loss=0.06609, over 4699510.93 frames. ], batch size: 93, lr: 9.34e-03, grad_scale: 16.0 2023-09-29 13:33:29,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:33:31,296 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 13:33:31,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:33:34,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:33:37,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:33:45,411 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 13:33:45,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 13:33:45,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 13:33:47,099 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 13:33:48,505 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:33:51,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:33:51,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 13:33:51,689 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:33:54,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 13:33:57,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:33:59,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 13:33:59,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:34:00,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:34:00,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:34:12,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:34:12,265 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=380606.6666666667, ans=0.125 2023-09-29 13:34:14,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:34:19,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 13:34:20,539 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.98 vs. limit=12.0 2023-09-29 13:34:25,860 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 13:34:25,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 13:34:25,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:34:27,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:34:32,520 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=380740.0, ans=0.5 2023-09-29 13:34:36,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:34:36,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 13:34:36,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:34:36,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:34:36,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 13:34:42,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:34:43,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:34:47,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 13:34:50,670 INFO [train.py:1039] (2/4) Epoch 11, batch 4000, loss[loss=0.18, simple_loss=0.2479, pruned_loss=0.05604, over 17313.00 frames. ], tot_loss[loss=0.2009, simple_loss=0.2706, pruned_loss=0.06563, over 4704238.15 frames. ], batch size: 37, lr: 9.33e-03, grad_scale: 32.0 2023-09-29 13:34:52,648 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.507e+02 1.905e+02 2.140e+02 2.457e+02 3.925e+02, threshold=4.280e+02, percent-clipped=0.0 2023-09-29 13:34:58,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:35:00,599 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=380806.6666666667, ans=0.1 2023-09-29 13:35:05,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:35:08,716 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=380873.3333333333, ans=0.125 2023-09-29 13:35:10,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:35:10,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:35:10,259 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:35:10,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 13:35:10,536 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=380873.3333333333, ans=0.0 2023-09-29 13:35:12,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 13:35:12,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 13:35:12,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:35:12,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 13:35:14,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:35:18,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:35:18,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:35:18,686 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:35:18,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:35:18,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 13:35:20,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 13:35:22,430 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 13:35:22,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:35:22,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:35:26,413 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 13:35:26,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 13:35:26,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:35:30,392 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=12.97 vs. limit=15.0 2023-09-29 13:35:34,180 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 13:35:34,280 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:35:36,040 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=380940.0, ans=0.125 2023-09-29 13:35:37,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:35:37,716 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=380940.0, ans=0.125 2023-09-29 13:35:38,857 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 13:35:40,436 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:35:40,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 13:35:40,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:35:42,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:35:42,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:35:45,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:35:45,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 13:35:46,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:35:47,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 13:35:48,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:35:50,580 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 13:35:54,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 13:35:54,835 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=381006.6666666667, ans=0.125 2023-09-29 13:35:57,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 13:35:59,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:35:59,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:36:00,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:36:02,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:36:06,798 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:36:09,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 13:36:11,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 13:36:11,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=381140.0, ans=0.125 2023-09-29 13:36:13,377 INFO [train.py:1039] (2/4) Epoch 11, batch 4050, loss[loss=0.2029, simple_loss=0.2806, pruned_loss=0.06255, over 24045.00 frames. ], tot_loss[loss=0.2015, simple_loss=0.2713, pruned_loss=0.0659, over 4704025.88 frames. ], batch size: 80, lr: 9.33e-03, grad_scale: 32.0 2023-09-29 13:36:13,449 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 13:36:13,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:36:14,997 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:36:16,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 13:36:19,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:36:22,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:36:25,584 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:36:25,794 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=381140.0, ans=0.1 2023-09-29 13:36:26,969 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 13:36:28,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:36:29,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:36:32,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:36:35,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 13:36:38,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 13:36:40,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 13:36:40,122 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 13:36:41,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:36:51,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 13:36:52,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:36:54,959 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=381273.3333333333, ans=0.2 2023-09-29 13:36:56,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:36:59,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:36:59,647 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:36:59,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:37:03,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:37:07,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 13:37:07,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 13:37:09,395 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:37:11,166 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=381340.0, ans=0.125 2023-09-29 13:37:12,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 13:37:16,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:37:26,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 13:37:26,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:37:26,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 13:37:29,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 13:37:29,809 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 13:37:29,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:37:30,280 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=381406.6666666667, ans=0.05 2023-09-29 13:37:32,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:37:34,404 INFO [train.py:1039] (2/4) Epoch 11, batch 4100, loss[loss=0.176, simple_loss=0.2433, pruned_loss=0.0543, over 24291.00 frames. ], tot_loss[loss=0.2022, simple_loss=0.2723, pruned_loss=0.06608, over 4705558.70 frames. ], batch size: 56, lr: 9.32e-03, grad_scale: 32.0 2023-09-29 13:37:34,554 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:37:34,579 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:37:35,984 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.675e+02 2.063e+02 2.315e+02 3.202e+02 5.550e+02, threshold=4.630e+02, percent-clipped=7.0 2023-09-29 13:37:42,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 13:37:43,430 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 13:37:45,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 13:37:46,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 13:37:46,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:37:48,317 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:37:48,361 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:37:48,381 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:37:49,937 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 13:37:52,923 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:37:53,073 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=381540.0, ans=0.125 2023-09-29 13:37:54,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:37:54,479 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:37:55,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 13:37:59,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 13:38:01,380 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:38:01,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:38:01,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 13:38:02,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:38:02,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:38:02,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:38:02,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:38:04,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 13:38:04,565 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=381540.0, ans=0.1 2023-09-29 13:38:08,784 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:38:08,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 13:38:10,476 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:38:14,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:38:14,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 13:38:16,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:38:17,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:38:17,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 13:38:19,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 13:38:21,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 13:38:21,746 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:38:24,651 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 13:38:24,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:38:24,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:38:27,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:38:34,466 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:38:37,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:38:39,099 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:38:43,935 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=381740.0, ans=0.0 2023-09-29 13:38:47,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:38:47,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:38:50,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:38:53,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:38:57,282 INFO [train.py:1039] (2/4) Epoch 11, batch 4150, loss[loss=0.2021, simple_loss=0.277, pruned_loss=0.06359, over 23308.00 frames. ], tot_loss[loss=0.2027, simple_loss=0.2729, pruned_loss=0.0662, over 4706321.03 frames. ], batch size: 93, lr: 9.32e-03, grad_scale: 32.0 2023-09-29 13:38:58,957 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:39:00,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 13:39:01,460 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.18 vs. limit=12.0 2023-09-29 13:39:02,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:39:02,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:39:05,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 13:39:06,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:39:06,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 13:39:06,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 13:39:06,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 13:39:08,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:39:10,599 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=381806.6666666667, ans=0.1 2023-09-29 13:39:14,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:39:14,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:39:18,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:39:19,679 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:39:21,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 13:39:24,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 13:39:24,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:39:25,299 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 13:39:30,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:39:31,802 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=381940.0, ans=0.125 2023-09-29 13:39:33,278 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=381940.0, ans=0.125 2023-09-29 13:39:34,572 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:39:34,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 13:39:38,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 13:39:38,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:39:39,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 13:39:39,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:39:39,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:39:44,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:39:44,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:39:48,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 13:39:51,154 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 13:39:51,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:39:52,909 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 13:39:54,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:39:56,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 13:39:57,050 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.79 vs. limit=15.0 2023-09-29 13:39:59,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:40:00,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:40:02,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:40:03,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 13:40:03,939 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:40:03,943 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 13:40:04,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 13:40:04,850 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.72 vs. limit=12.0 2023-09-29 13:40:07,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 13:40:07,259 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:40:07,266 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:40:08,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 13:40:08,803 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 13:40:08,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:40:08,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 13:40:10,804 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:40:12,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:40:12,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 13:40:14,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 13:40:18,766 INFO [train.py:1039] (2/4) Epoch 11, batch 4200, loss[loss=0.2175, simple_loss=0.2832, pruned_loss=0.07594, over 23295.00 frames. ], tot_loss[loss=0.2016, simple_loss=0.2714, pruned_loss=0.06595, over 4703527.62 frames. ], batch size: 93, lr: 9.32e-03, grad_scale: 32.0 2023-09-29 13:40:20,254 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.458e+02 1.930e+02 2.193e+02 2.587e+02 4.330e+02, threshold=4.386e+02, percent-clipped=0.0 2023-09-29 13:40:20,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:40:21,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 13:40:24,920 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 13:40:26,527 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:40:26,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:40:28,089 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:40:28,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:40:29,116 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=382140.0, ans=0.0 2023-09-29 13:40:30,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 13:40:33,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 13:40:35,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:40:36,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:40:38,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:40:41,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 13:40:44,729 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:40:44,775 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:40:44,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 13:40:44,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:40:46,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:40:47,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:40:48,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 13:40:48,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 13:40:51,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 13:40:51,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:40:55,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 13:40:56,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:41:00,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:41:00,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:41:00,731 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=382273.3333333333, ans=0.0 2023-09-29 13:41:05,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:41:05,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 13:41:05,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:41:05,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:41:11,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 13:41:13,364 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=382340.0, ans=0.0 2023-09-29 13:41:14,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:41:22,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:41:24,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 13:41:27,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:41:30,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 13:41:32,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:41:34,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 13:41:36,810 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=382406.6666666667, ans=0.0 2023-09-29 13:41:38,636 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 13:41:41,372 INFO [train.py:1039] (2/4) Epoch 11, batch 4250, loss[loss=0.1878, simple_loss=0.2539, pruned_loss=0.06084, over 23709.00 frames. ], tot_loss[loss=0.1998, simple_loss=0.2684, pruned_loss=0.06561, over 4686192.22 frames. ], batch size: 232, lr: 9.31e-03, grad_scale: 32.0 2023-09-29 13:41:44,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:41:44,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:41:48,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:41:49,645 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.28 vs. limit=15.0 2023-09-29 13:41:53,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 13:41:53,567 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 13:41:55,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:41:57,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:42:01,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:42:01,419 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=382540.0, ans=0.0 2023-09-29 13:42:05,002 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=382540.0, ans=0.125 2023-09-29 13:42:06,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:42:07,509 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:42:07,757 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:42:07,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:42:10,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:42:10,825 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:42:10,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:42:14,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:42:14,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:42:15,686 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.54 vs. limit=22.5 2023-09-29 13:42:16,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 13:42:20,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 13:42:20,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:42:22,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:42:22,240 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:42:23,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 13:42:25,263 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:42:25,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:42:28,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 13:42:31,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:42:36,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:42:38,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:42:38,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 13:42:39,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:42:41,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 13:42:42,840 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:42:44,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 13:42:44,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:42:44,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:42:45,431 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.21 vs. limit=10.0 2023-09-29 13:42:48,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 13:42:49,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 13:42:51,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 13:42:54,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:42:56,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:42:57,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:42:59,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:43:00,703 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:43:02,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:43:03,990 INFO [train.py:1039] (2/4) Epoch 11, batch 4300, loss[loss=0.1858, simple_loss=0.2586, pruned_loss=0.05648, over 24602.00 frames. ], tot_loss[loss=0.1991, simple_loss=0.2682, pruned_loss=0.06504, over 4694201.29 frames. ], batch size: 60, lr: 9.31e-03, grad_scale: 16.0 2023-09-29 13:43:04,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:43:04,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 13:43:05,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:43:07,086 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.989e+02 2.378e+02 2.757e+02 5.301e+02, threshold=4.756e+02, percent-clipped=4.0 2023-09-29 13:43:12,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:43:12,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:43:15,182 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.40 vs. limit=22.5 2023-09-29 13:43:17,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:43:23,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:43:23,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 13:43:25,885 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:43:27,475 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 13:43:27,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:43:28,814 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 13:43:33,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 13:43:33,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 13:43:35,138 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 13:43:37,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:43:37,212 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 13:43:40,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 13:43:42,452 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:43:42,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=382940.0, ans=0.1 2023-09-29 13:43:44,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:43:44,844 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:43:46,102 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 13:43:47,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:43:47,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:43:47,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 13:43:49,331 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 13:43:52,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:43:54,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:43:54,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 13:43:54,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:43:56,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:43:56,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 13:43:56,158 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 13:43:56,262 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 13:43:57,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:43:57,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 13:43:58,135 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=383006.6666666667, ans=0.2 2023-09-29 13:43:59,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 13:44:02,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:44:05,273 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 13:44:06,724 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:44:08,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:44:09,636 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:44:11,986 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=383073.3333333333, ans=0.1 2023-09-29 13:44:13,176 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 13:44:13,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 13:44:13,289 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:44:14,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:44:14,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:44:14,882 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:44:18,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:44:21,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:44:23,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:44:23,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:44:26,387 INFO [train.py:1039] (2/4) Epoch 11, batch 4350, loss[loss=0.2148, simple_loss=0.2925, pruned_loss=0.0686, over 24294.00 frames. ], tot_loss[loss=0.1998, simple_loss=0.2694, pruned_loss=0.06507, over 4706181.88 frames. ], batch size: 77, lr: 9.30e-03, grad_scale: 16.0 2023-09-29 13:44:30,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 13:44:31,525 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 13:44:34,764 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:44:37,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:44:40,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:44:40,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:44:45,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 13:44:49,361 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:44:52,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:44:52,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:44:54,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 13:44:57,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:44:59,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 13:44:59,593 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=383273.3333333333, ans=0.125 2023-09-29 13:45:03,197 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=383273.3333333333, ans=0.0 2023-09-29 13:45:07,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 13:45:08,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:45:08,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:45:13,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:45:13,980 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 13:45:16,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 13:45:19,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:45:21,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 13:45:26,545 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 13:45:28,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:45:28,186 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:45:29,730 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 13:45:29,835 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 13:45:29,843 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:45:29,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:45:31,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:45:31,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:45:34,003 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:45:34,072 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:45:37,640 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 13:45:37,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:45:37,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:45:37,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:45:37,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 13:45:39,313 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 13:45:39,320 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 13:45:40,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 13:45:43,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:45:43,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 13:45:43,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:45:45,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:45:45,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 13:45:47,214 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 13:45:47,225 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:45:47,586 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=383473.3333333333, ans=0.125 2023-09-29 13:45:48,652 INFO [train.py:1039] (2/4) Epoch 11, batch 4400, loss[loss=0.2445, simple_loss=0.2938, pruned_loss=0.09759, over 23680.00 frames. ], tot_loss[loss=0.2007, simple_loss=0.2704, pruned_loss=0.0655, over 4699069.17 frames. ], batch size: 232, lr: 9.30e-03, grad_scale: 32.0 2023-09-29 13:45:49,078 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=383473.3333333333, ans=0.125 2023-09-29 13:45:50,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:45:50,471 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:45:51,787 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.907e+02 2.226e+02 2.866e+02 4.775e+02, threshold=4.452e+02, percent-clipped=1.0 2023-09-29 13:45:52,129 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:45:55,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 13:45:55,769 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 13:45:55,834 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 13:45:57,228 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 13:45:57,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 13:45:57,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:46:00,937 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 13:46:02,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:46:04,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:46:04,216 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 13:46:08,537 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:46:08,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 13:46:09,933 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 13:46:14,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 13:46:14,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 13:46:15,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 13:46:15,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:46:17,552 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:46:18,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:46:20,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:46:20,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 13:46:20,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 13:46:22,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:46:23,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:46:23,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:46:25,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:46:27,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:46:27,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 13:46:28,481 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 13:46:33,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:46:40,221 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.31 vs. limit=15.0 2023-09-29 13:46:40,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:46:41,051 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 13:46:46,792 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 13:46:48,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:46:51,410 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=383673.3333333333, ans=0.015 2023-09-29 13:46:52,798 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:46:52,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 13:46:53,060 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=383673.3333333333, ans=0.1 2023-09-29 13:46:54,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:46:54,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 13:46:54,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:46:54,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 13:46:59,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 13:47:02,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 13:47:03,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 13:47:03,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:47:03,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 13:47:03,798 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:47:07,619 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:47:09,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 13:47:12,112 INFO [train.py:1039] (2/4) Epoch 11, batch 4450, loss[loss=0.2259, simple_loss=0.2789, pruned_loss=0.08639, over 23862.00 frames. ], tot_loss[loss=0.2024, simple_loss=0.2723, pruned_loss=0.06628, over 4698093.70 frames. ], batch size: 179, lr: 9.30e-03, grad_scale: 32.0 2023-09-29 13:47:12,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:47:16,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:47:17,702 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:47:22,041 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=383806.6666666667, ans=0.1 2023-09-29 13:47:23,301 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:47:24,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:47:26,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:47:29,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:47:31,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:47:31,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:47:31,508 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=383873.3333333333, ans=0.125 2023-09-29 13:47:32,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 13:47:32,798 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:47:34,257 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:47:34,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:47:35,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 13:47:37,651 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=383873.3333333333, ans=0.0 2023-09-29 13:47:38,828 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 13:47:45,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:47:45,973 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:47:48,128 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:47:49,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:47:49,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:47:57,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 13:47:57,238 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 13:47:58,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 13:47:58,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:48:02,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:48:02,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 13:48:02,363 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=384006.6666666667, ans=0.0 2023-09-29 13:48:06,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 13:48:09,849 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:48:11,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 13:48:11,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:48:11,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:48:11,515 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:48:12,889 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:48:13,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:48:16,752 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 13:48:16,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 13:48:19,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 13:48:21,321 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:48:23,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:48:24,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:48:25,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 13:48:27,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 13:48:30,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 13:48:32,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 13:48:36,511 INFO [train.py:1039] (2/4) Epoch 11, batch 4500, loss[loss=0.1871, simple_loss=0.2515, pruned_loss=0.06138, over 23638.00 frames. ], tot_loss[loss=0.2023, simple_loss=0.2723, pruned_loss=0.06611, over 4703849.73 frames. ], batch size: 135, lr: 9.29e-03, grad_scale: 16.0 2023-09-29 13:48:38,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:48:39,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 13:48:39,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 13:48:41,283 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 2.027e+02 2.276e+02 2.770e+02 4.229e+02, threshold=4.551e+02, percent-clipped=0.0 2023-09-29 13:48:41,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:48:47,504 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:48:48,946 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:48:49,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 13:48:50,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:48:50,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:48:50,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:49:03,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:49:05,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:49:08,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:49:09,781 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:49:09,906 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 13:49:17,497 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 13:49:20,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 13:49:25,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 13:49:26,171 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.70 vs. limit=15.0 2023-09-29 13:49:28,536 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:49:28,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 13:49:30,605 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:49:30,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:49:34,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:49:34,328 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:49:35,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:49:37,984 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 13:49:37,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 13:49:37,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:49:42,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:49:42,699 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 13:49:44,387 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:49:47,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 13:49:47,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:49:50,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 13:49:51,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 13:49:51,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 13:49:56,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 13:49:58,089 INFO [train.py:1039] (2/4) Epoch 11, batch 4550, loss[loss=0.203, simple_loss=0.2851, pruned_loss=0.06048, over 24552.00 frames. ], tot_loss[loss=0.2014, simple_loss=0.2709, pruned_loss=0.06594, over 4705152.00 frames. ], batch size: 71, lr: 9.29e-03, grad_scale: 16.0 2023-09-29 13:49:58,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 13:49:59,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:50:03,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:50:04,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:50:07,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:50:10,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:50:14,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:50:17,226 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:50:17,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:50:17,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:50:20,273 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:50:21,635 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:50:24,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:50:27,733 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 13:50:27,830 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 13:50:29,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:50:32,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 13:50:35,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 13:50:35,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:50:40,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 13:50:42,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 13:50:45,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:50:45,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:50:45,956 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:50:49,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 13:50:51,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:50:52,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:50:52,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:50:55,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:50:57,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 13:50:57,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 13:50:58,528 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:50:58,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 13:51:00,535 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=384673.3333333333, ans=0.125 2023-09-29 13:51:01,679 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 13:51:01,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:51:03,398 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:51:03,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:51:04,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:51:04,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:51:05,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 13:51:06,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 13:51:08,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:51:08,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 13:51:08,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 13:51:08,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:51:08,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 13:51:12,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:51:12,773 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:51:14,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:51:14,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:51:14,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 13:51:18,016 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:51:18,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 13:51:21,625 INFO [train.py:1039] (2/4) Epoch 11, batch 4600, loss[loss=0.1767, simple_loss=0.2134, pruned_loss=0.06997, over 19279.00 frames. ], tot_loss[loss=0.1994, simple_loss=0.2684, pruned_loss=0.06526, over 4681612.89 frames. ], batch size: 389, lr: 9.28e-03, grad_scale: 16.0 2023-09-29 13:51:21,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:51:23,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:51:25,952 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 2.102e+02 2.367e+02 2.907e+02 4.657e+02, threshold=4.735e+02, percent-clipped=1.0 2023-09-29 13:51:26,206 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:51:26,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 13:51:27,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:51:29,361 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 13:51:30,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:51:35,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 13:51:36,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:51:40,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:51:46,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 13:51:48,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:51:51,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:51:55,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:51:55,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:52:01,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 13:52:01,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 13:52:02,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:52:07,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:52:07,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 13:52:08,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:52:13,572 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 13:52:13,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 13:52:20,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:52:20,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:52:21,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:52:21,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 13:52:22,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:52:23,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 13:52:23,562 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:52:25,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:52:27,724 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:52:27,825 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:52:29,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:52:30,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 13:52:30,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 13:52:32,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 13:52:32,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:52:33,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:52:35,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:52:35,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:52:42,916 INFO [train.py:1039] (2/4) Epoch 11, batch 4650, loss[loss=0.2023, simple_loss=0.2658, pruned_loss=0.06942, over 23741.00 frames. ], tot_loss[loss=0.1986, simple_loss=0.2676, pruned_loss=0.06481, over 4676689.81 frames. ], batch size: 179, lr: 9.28e-03, grad_scale: 16.0 2023-09-29 13:52:46,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 13:52:47,945 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=385140.0, ans=0.125 2023-09-29 13:52:49,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:52:50,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:52:50,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:52:52,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:52:52,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:52:52,968 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:52:58,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 13:53:01,149 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.72 vs. limit=15.0 2023-09-29 13:53:01,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:53:05,016 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 13:53:05,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:53:05,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 13:53:05,221 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:53:06,725 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 13:53:06,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 13:53:06,781 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:53:06,888 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:53:09,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 13:53:11,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:53:11,477 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 13:53:14,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:53:16,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 13:53:19,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:53:19,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:53:20,702 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 13:53:22,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:53:25,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:53:29,158 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:53:34,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:53:37,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:53:40,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:53:40,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 13:53:41,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 13:53:41,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 13:53:43,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 13:53:43,561 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 13:53:45,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:53:51,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:53:51,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:53:51,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 13:53:51,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:53:52,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:53:52,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:53:54,295 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:53:57,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:53:57,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:53:59,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:54:02,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:54:02,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:54:04,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 13:54:04,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 13:54:06,314 INFO [train.py:1039] (2/4) Epoch 11, batch 4700, loss[loss=0.2056, simple_loss=0.2724, pruned_loss=0.06936, over 23694.00 frames. ], tot_loss[loss=0.2003, simple_loss=0.2692, pruned_loss=0.06567, over 4677837.99 frames. ], batch size: 256, lr: 9.28e-03, grad_scale: 16.0 2023-09-29 13:54:06,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:54:06,631 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 13:54:11,822 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.872e+02 2.042e+02 2.233e+02 3.363e+02, threshold=4.084e+02, percent-clipped=0.0 2023-09-29 13:54:13,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:54:15,120 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:54:16,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:54:16,930 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 13:54:17,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:54:20,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 13:54:25,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 13:54:25,810 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=385540.0, ans=0.0 2023-09-29 13:54:27,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 13:54:30,866 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:54:31,007 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:54:31,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:54:37,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:54:44,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:54:45,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 13:54:47,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:54:54,347 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=385673.3333333333, ans=0.125 2023-09-29 13:54:55,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 13:54:55,774 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:54:58,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:55:00,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 13:55:03,426 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:55:06,067 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=385673.3333333333, ans=0.07 2023-09-29 13:55:07,198 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:55:07,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 13:55:08,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:55:08,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:55:12,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:55:12,775 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 13:55:12,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 13:55:12,927 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 13:55:14,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:55:14,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:55:14,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:55:14,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 13:55:16,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:55:22,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 13:55:26,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:55:27,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:55:27,967 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=385806.6666666667, ans=0.1 2023-09-29 13:55:28,544 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.78 vs. limit=15.0 2023-09-29 13:55:29,197 INFO [train.py:1039] (2/4) Epoch 11, batch 4750, loss[loss=0.1865, simple_loss=0.2595, pruned_loss=0.05674, over 24481.00 frames. ], tot_loss[loss=0.202, simple_loss=0.271, pruned_loss=0.06647, over 4685039.29 frames. ], batch size: 63, lr: 9.27e-03, grad_scale: 16.0 2023-09-29 13:55:32,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:55:33,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:55:35,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 13:55:35,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:55:39,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 13:55:40,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:55:40,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:55:43,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:55:47,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 13:55:50,182 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.97 vs. limit=22.5 2023-09-29 13:55:51,067 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 13:55:53,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 13:55:53,543 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=385873.3333333333, ans=0.125 2023-09-29 13:55:54,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:55:56,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:55:56,386 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:55:56,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:55:57,918 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 13:55:57,922 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 13:55:59,780 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=385873.3333333333, ans=0.125 2023-09-29 13:56:04,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 13:56:09,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:56:12,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:56:14,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:56:14,538 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 13:56:14,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:56:18,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:56:20,268 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=386006.6666666667, ans=0.2 2023-09-29 13:56:21,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:56:21,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 13:56:23,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 13:56:23,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:56:23,170 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:56:24,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:56:24,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 13:56:26,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 13:56:27,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 13:56:31,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:56:33,183 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:56:33,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 13:56:35,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:56:36,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:56:38,284 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 13:56:39,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:56:39,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:56:42,983 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:56:44,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 13:56:44,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 13:56:46,856 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 13:56:48,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 13:56:49,848 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:56:50,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 13:56:51,832 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=386140.0, ans=0.125 2023-09-29 13:56:52,869 INFO [train.py:1039] (2/4) Epoch 11, batch 4800, loss[loss=0.2218, simple_loss=0.2914, pruned_loss=0.07607, over 23396.00 frames. ], tot_loss[loss=0.2029, simple_loss=0.2726, pruned_loss=0.06664, over 4689688.14 frames. ], batch size: 93, lr: 9.27e-03, grad_scale: 32.0 2023-09-29 13:56:54,668 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:56:54,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:56:57,598 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.978e+02 2.285e+02 2.567e+02 3.711e+02, threshold=4.569e+02, percent-clipped=0.0 2023-09-29 13:56:59,307 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:57:01,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:57:01,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:57:01,776 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=386140.0, ans=0.125 2023-09-29 13:57:02,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 13:57:03,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:57:04,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:57:05,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:57:09,675 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:57:12,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:57:12,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 13:57:14,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:57:14,092 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 13:57:15,535 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:57:17,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:57:18,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:57:22,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:57:25,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:57:25,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 13:57:26,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 13:57:28,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:57:30,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 13:57:30,101 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 13:57:31,608 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:57:31,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:57:33,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:57:33,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:57:33,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:57:36,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 13:57:36,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:57:41,250 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:57:41,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:57:43,768 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:57:45,531 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=386340.0, ans=0.125 2023-09-29 13:57:48,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 13:57:48,340 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:57:49,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:57:49,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:57:49,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:57:54,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:57:56,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 13:57:56,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:57:56,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:57:57,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:57:58,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:58:02,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:58:02,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:58:02,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:58:04,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 13:58:07,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 13:58:07,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:58:07,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:58:08,909 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:58:08,910 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:58:11,959 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.73 vs. limit=15.0 2023-09-29 13:58:12,598 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:58:13,013 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=386473.3333333333, ans=0.125 2023-09-29 13:58:14,015 INFO [train.py:1039] (2/4) Epoch 11, batch 4850, loss[loss=0.2261, simple_loss=0.2825, pruned_loss=0.08481, over 23806.00 frames. ], tot_loss[loss=0.203, simple_loss=0.2728, pruned_loss=0.06657, over 4706159.60 frames. ], batch size: 179, lr: 9.26e-03, grad_scale: 16.0 2023-09-29 13:58:22,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 13:58:24,043 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:58:27,407 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:58:27,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 13:58:27,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:58:30,304 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=386540.0, ans=0.125 2023-09-29 13:58:33,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:58:33,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:58:34,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:58:34,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 13:58:40,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:58:40,798 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=386540.0, ans=0.2 2023-09-29 13:58:43,731 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:58:43,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 13:58:45,839 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:58:45,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 13:58:47,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:58:47,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:58:51,122 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=386606.6666666667, ans=0.5 2023-09-29 13:58:52,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:58:52,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 13:58:52,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 13:58:53,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 13:59:00,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:59:00,766 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 13:59:01,547 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.77 vs. limit=15.0 2023-09-29 13:59:02,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:59:02,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:59:06,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 13:59:08,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 13:59:08,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:59:09,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 13:59:09,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:59:11,104 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:59:12,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 13:59:12,921 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=386673.3333333333, ans=0.125 2023-09-29 13:59:14,522 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=386673.3333333333, ans=0.125 2023-09-29 13:59:19,920 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.43 vs. limit=22.5 2023-09-29 13:59:22,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:59:30,082 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:59:30,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:59:30,794 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.79 vs. limit=15.0 2023-09-29 13:59:36,689 INFO [train.py:1039] (2/4) Epoch 11, batch 4900, loss[loss=0.2078, simple_loss=0.2774, pruned_loss=0.06914, over 23342.00 frames. ], tot_loss[loss=0.2013, simple_loss=0.2711, pruned_loss=0.06571, over 4707752.43 frames. ], batch size: 119, lr: 9.26e-03, grad_scale: 16.0 2023-09-29 13:59:36,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 13:59:36,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:59:37,251 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=386806.6666666667, ans=0.2 2023-09-29 13:59:42,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:59:43,966 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.984e+02 2.247e+02 2.564e+02 4.606e+02, threshold=4.494e+02, percent-clipped=1.0 2023-09-29 13:59:44,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:59:44,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:59:47,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 13:59:51,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 13:59:56,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 13:59:57,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 13:59:58,547 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 13:59:58,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:59:58,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:59:58,649 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:59:58,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:59:58,790 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=386873.3333333333, ans=0.125 2023-09-29 14:00:00,119 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 14:00:00,306 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=386873.3333333333, ans=0.035 2023-09-29 14:00:03,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 14:00:04,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 14:00:06,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:00:08,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 14:00:09,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:00:09,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:00:11,419 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:00:11,445 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 14:00:13,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:00:14,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:00:16,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 14:00:16,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 14:00:19,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 14:00:21,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:00:21,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:00:21,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:00:23,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:00:23,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 14:00:23,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:00:24,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 14:00:27,654 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:00:29,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 14:00:31,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:00:33,303 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=387006.6666666667, ans=0.125 2023-09-29 14:00:33,327 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=387006.6666666667, ans=0.125 2023-09-29 14:00:34,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 14:00:34,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:00:36,213 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 14:00:37,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 14:00:45,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:00:47,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 14:00:49,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 14:00:49,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 14:00:49,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:00:54,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:00:57,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:00:57,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:00:59,087 INFO [train.py:1039] (2/4) Epoch 11, batch 4950, loss[loss=0.1825, simple_loss=0.2658, pruned_loss=0.04954, over 24614.00 frames. ], tot_loss[loss=0.1999, simple_loss=0.2697, pruned_loss=0.06507, over 4707936.02 frames. ], batch size: 68, lr: 9.26e-03, grad_scale: 16.0 2023-09-29 14:00:59,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:00:59,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 14:01:00,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 14:01:03,878 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:01:03,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 14:01:04,194 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=387140.0, ans=0.0 2023-09-29 14:01:08,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 14:01:08,833 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 14:01:10,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 14:01:10,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 14:01:10,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:01:10,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:01:11,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:01:11,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:01:12,217 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=387140.0, ans=0.125 2023-09-29 14:01:14,859 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:01:14,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:01:16,505 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:01:17,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:01:20,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:01:20,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:01:20,313 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=387206.6666666667, ans=0.0 2023-09-29 14:01:24,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 14:01:29,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:01:32,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 14:01:33,649 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:01:33,724 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:01:36,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:01:36,957 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 14:01:37,202 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=387273.3333333333, ans=0.0 2023-09-29 14:01:38,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 14:01:41,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:01:43,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:01:43,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 14:01:43,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:01:45,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:01:45,406 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 14:01:48,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:01:50,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 14:01:51,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:01:53,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:01:53,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:01:55,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 14:01:55,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:01:56,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 14:02:01,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:02:03,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:02:03,593 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:02:03,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:02:05,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:02:05,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:02:08,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:02:09,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:02:09,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:02:10,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 14:02:10,841 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.37 vs. limit=15.0 2023-09-29 14:02:16,572 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:02:21,060 INFO [train.py:1039] (2/4) Epoch 11, batch 5000, loss[loss=0.2154, simple_loss=0.2782, pruned_loss=0.07634, over 23780.00 frames. ], tot_loss[loss=0.1989, simple_loss=0.269, pruned_loss=0.06442, over 4713129.48 frames. ], batch size: 212, lr: 9.25e-03, grad_scale: 8.0 2023-09-29 14:02:21,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 14:02:21,335 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 14:02:27,599 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:02:27,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:02:29,466 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.930e+02 2.192e+02 2.539e+02 4.135e+02, threshold=4.383e+02, percent-clipped=0.0 2023-09-29 14:02:29,592 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 14:02:29,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 14:02:31,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:02:34,470 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=387473.3333333333, ans=0.0 2023-09-29 14:02:35,212 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=10.30 vs. limit=15.0 2023-09-29 14:02:35,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 14:02:37,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:02:37,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 14:02:37,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 14:02:37,246 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:02:37,564 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=387540.0, ans=0.0 2023-09-29 14:02:38,656 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:02:38,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 14:02:38,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:02:40,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:02:40,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 14:02:41,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 14:02:41,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:02:43,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 14:02:43,369 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 14:02:43,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:02:43,493 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 14:02:43,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 14:02:43,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 14:02:47,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 14:02:47,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:02:47,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:02:50,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 14:02:50,202 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:02:50,383 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:02:51,787 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:02:53,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 14:02:54,230 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.whiten.whitening_limit, batch_count=387606.6666666667, ans=12.0 2023-09-29 14:02:55,008 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 14:02:56,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:02:57,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:03:01,838 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 14:03:05,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:03:06,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:03:06,994 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:03:09,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 14:03:10,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:03:10,706 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:03:10,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:03:13,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 14:03:15,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:03:19,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:03:19,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:03:25,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 14:03:28,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:03:39,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:03:40,603 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:03:40,614 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:03:40,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:03:40,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 14:03:40,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:03:42,902 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:03:45,666 INFO [train.py:1039] (2/4) Epoch 11, batch 5050, loss[loss=0.2038, simple_loss=0.2903, pruned_loss=0.05865, over 24676.00 frames. ], tot_loss[loss=0.1999, simple_loss=0.27, pruned_loss=0.06485, over 4704570.36 frames. ], batch size: 73, lr: 9.25e-03, grad_scale: 8.0 2023-09-29 14:03:47,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:03:47,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 14:03:48,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:03:50,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:03:50,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 14:03:52,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 14:03:54,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:03:54,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:03:57,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 14:03:58,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 14:03:59,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 14:04:08,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 14:04:08,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 14:04:09,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:04:11,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 14:04:11,932 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:04:14,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:04:14,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:04:16,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:04:16,277 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 14:04:16,405 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 14:04:17,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:04:19,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:04:22,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:04:24,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 14:04:25,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:04:29,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 14:04:30,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:04:32,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:04:32,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:04:33,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:04:35,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:04:37,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:04:37,607 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.07 vs. limit=6.0 2023-09-29 14:04:38,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:04:38,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:04:38,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:04:39,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 14:04:40,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:04:41,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:04:46,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:04:46,679 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 14:04:46,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 14:04:48,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:04:50,439 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:04:50,486 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 14:04:54,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:04:54,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 14:04:54,154 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:04:54,267 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=388073.3333333333, ans=0.125 2023-09-29 14:04:58,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:04:58,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:04:58,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 14:05:02,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 14:05:05,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:05:05,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:05:06,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:05:08,083 INFO [train.py:1039] (2/4) Epoch 11, batch 5100, loss[loss=0.203, simple_loss=0.2815, pruned_loss=0.06221, over 23207.00 frames. ], tot_loss[loss=0.201, simple_loss=0.2715, pruned_loss=0.06526, over 4704815.11 frames. ], batch size: 105, lr: 9.24e-03, grad_scale: 8.0 2023-09-29 14:05:08,253 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 14:05:08,513 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=388140.0, ans=0.0 2023-09-29 14:05:11,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:05:14,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 14:05:14,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 14:05:15,740 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.929e+02 2.127e+02 2.436e+02 3.285e+02, threshold=4.254e+02, percent-clipped=0.0 2023-09-29 14:05:15,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:05:17,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:05:20,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:05:22,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 14:05:22,707 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 14:05:29,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:05:30,000 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:05:33,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:05:33,463 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=388206.6666666667, ans=0.0 2023-09-29 14:05:35,307 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=388206.6666666667, ans=0.125 2023-09-29 14:05:36,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 14:05:36,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:05:38,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:05:38,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 14:05:41,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:05:41,759 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:05:41,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 14:05:43,837 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.98 vs. limit=15.0 2023-09-29 14:05:44,721 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 14:05:46,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:05:46,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 14:05:46,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 14:05:49,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:05:51,311 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=388273.3333333333, ans=0.0 2023-09-29 14:05:59,761 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:06:02,067 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.08 vs. limit=22.5 2023-09-29 14:06:02,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 14:06:02,846 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 14:06:02,871 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 14:06:05,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 14:06:05,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:06:08,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 14:06:12,728 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 14:06:16,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 14:06:17,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 14:06:19,622 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 14:06:22,569 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 14:06:22,803 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=388406.6666666667, ans=0.0 2023-09-29 14:06:23,975 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 14:06:27,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:06:28,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:06:28,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:06:30,041 INFO [train.py:1039] (2/4) Epoch 11, batch 5150, loss[loss=0.2082, simple_loss=0.281, pruned_loss=0.06767, over 24058.00 frames. ], tot_loss[loss=0.2019, simple_loss=0.2722, pruned_loss=0.06582, over 4698011.16 frames. ], batch size: 80, lr: 9.24e-03, grad_scale: 8.0 2023-09-29 14:06:30,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:06:30,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 14:06:30,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:06:32,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 14:06:32,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 14:06:33,053 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 14:06:34,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:06:34,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 14:06:35,895 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:06:37,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 14:06:38,874 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:06:40,441 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:06:45,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 14:06:45,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 14:06:47,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:06:47,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:06:48,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 14:06:48,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:06:48,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:06:49,046 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=388540.0, ans=0.125 2023-09-29 14:06:50,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:06:50,977 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 14:06:52,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 14:06:53,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:06:54,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 14:06:55,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 14:06:58,576 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 14:06:58,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:07:04,325 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.80 vs. limit=10.0 2023-09-29 14:07:05,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:07:05,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 14:07:12,085 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:07:17,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:07:17,746 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=388606.6666666667, ans=0.1 2023-09-29 14:07:17,845 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=388606.6666666667, ans=0.1 2023-09-29 14:07:19,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:07:19,427 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=388673.3333333333, ans=0.125 2023-09-29 14:07:23,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:07:23,710 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:07:25,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 14:07:29,291 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:07:29,528 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=388673.3333333333, ans=0.05 2023-09-29 14:07:30,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:07:30,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 14:07:33,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:07:34,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:07:35,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 14:07:37,487 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=388740.0, ans=0.2 2023-09-29 14:07:40,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:07:42,913 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 14:07:46,372 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:07:47,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:07:49,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 14:07:49,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 14:07:49,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:07:49,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:07:53,711 INFO [train.py:1039] (2/4) Epoch 11, batch 5200, loss[loss=0.1779, simple_loss=0.2639, pruned_loss=0.04595, over 24494.00 frames. ], tot_loss[loss=0.2014, simple_loss=0.2722, pruned_loss=0.06536, over 4704882.86 frames. ], batch size: 66, lr: 9.24e-03, grad_scale: 16.0 2023-09-29 14:07:53,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:07:55,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 14:07:57,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:08:01,201 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=388806.6666666667, ans=0.1 2023-09-29 14:08:02,211 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.626e+02 2.017e+02 2.564e+02 3.234e+02 5.917e+02, threshold=5.129e+02, percent-clipped=10.0 2023-09-29 14:08:02,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 14:08:02,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:08:02,842 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=388806.6666666667, ans=0.0 2023-09-29 14:08:04,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:08:08,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:08:08,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:08:08,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:08:11,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 14:08:13,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 14:08:13,310 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=388873.3333333333, ans=0.125 2023-09-29 14:08:15,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:08:16,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 14:08:18,873 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.05 vs. limit=6.0 2023-09-29 14:08:19,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 14:08:20,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 14:08:22,035 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 14:08:22,121 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 14:08:25,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 14:08:25,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:08:25,152 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 14:08:26,554 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:08:28,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:08:28,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:08:28,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 14:08:28,557 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=388940.0, ans=0.125 2023-09-29 14:08:29,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:08:32,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:08:32,722 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.68 vs. limit=15.0 2023-09-29 14:08:35,253 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 14:08:36,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 14:08:36,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 14:08:41,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 14:08:41,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 14:08:46,601 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.33 vs. limit=15.0 2023-09-29 14:08:48,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:08:49,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:08:51,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 14:08:51,113 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:08:52,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 14:08:52,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:08:52,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 14:08:57,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:08:57,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:09:01,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:09:02,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:09:02,843 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:09:09,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:09:10,962 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 14:09:11,239 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=389073.3333333333, ans=0.125 2023-09-29 14:09:12,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:09:12,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:09:14,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:09:14,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 14:09:15,396 INFO [train.py:1039] (2/4) Epoch 11, batch 5250, loss[loss=0.2061, simple_loss=0.279, pruned_loss=0.06656, over 24039.00 frames. ], tot_loss[loss=0.2004, simple_loss=0.2712, pruned_loss=0.06483, over 4709201.54 frames. ], batch size: 80, lr: 9.23e-03, grad_scale: 16.0 2023-09-29 14:09:17,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 14:09:17,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:09:20,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:09:20,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:09:22,222 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:09:26,453 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.05 vs. limit=10.0 2023-09-29 14:09:27,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:09:29,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:09:32,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:09:35,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 14:09:37,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 14:09:37,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:09:39,526 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:09:54,711 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=389273.3333333333, ans=0.125 2023-09-29 14:09:59,113 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.36 vs. limit=12.0 2023-09-29 14:10:01,567 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=389340.0, ans=0.2 2023-09-29 14:10:29,779 INFO [train.py:1039] (2/4) Epoch 11, batch 5300, loss[loss=0.1762, simple_loss=0.25, pruned_loss=0.05119, over 24301.00 frames. ], tot_loss[loss=0.1997, simple_loss=0.2696, pruned_loss=0.06488, over 4702555.59 frames. ], batch size: 56, lr: 9.23e-03, grad_scale: 16.0 2023-09-29 14:10:36,666 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.614e+02 2.030e+02 2.219e+02 2.602e+02 3.750e+02, threshold=4.437e+02, percent-clipped=0.0 2023-09-29 14:10:38,506 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=389473.3333333333, ans=0.125 2023-09-29 14:10:41,017 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=389473.3333333333, ans=0.04949747468305833 2023-09-29 14:10:41,084 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=389473.3333333333, ans=0.125 2023-09-29 14:10:42,292 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=389540.0, ans=0.2 2023-09-29 14:10:44,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:10:45,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 14:10:45,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 14:10:45,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:10:45,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:10:45,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:10:45,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:10:45,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:10:45,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:10:45,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:10:45,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 14:10:46,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:10:46,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 14:10:46,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 14:10:46,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 14:10:46,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 14:10:46,942 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 14:10:47,072 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 14:10:47,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:10:48,159 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:10:48,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:10:48,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:10:48,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:10:48,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:10:49,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:10:49,104 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:10:49,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:10:49,291 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:10:49,299 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:10:49,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:10:49,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:10:50,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 14:10:50,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:10:50,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:10:50,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 14:10:50,900 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 14:10:51,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:10:51,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:10:51,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 14:10:51,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 14:10:51,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 14:10:52,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:10:52,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:10:52,927 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 14:10:53,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 14:10:53,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 14:10:53,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:10:53,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 14:10:53,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 14:10:53,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 14:10:53,811 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 14:11:03,749 INFO [train.py:1039] (2/4) Epoch 12, batch 0, loss[loss=0.2012, simple_loss=0.2931, pruned_loss=0.05463, over 24281.00 frames. ], tot_loss[loss=0.2012, simple_loss=0.2931, pruned_loss=0.05463, over 24281.00 frames. ], batch size: 74, lr: 8.84e-03, grad_scale: 32.0 2023-09-29 14:11:03,749 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-29 14:11:19,086 INFO [train.py:1071] (2/4) Epoch 12, validation: loss=0.305, simple_loss=0.2807, pruned_loss=0.1647, over 1125622.00 frames. 2023-09-29 14:11:19,087 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21046MB 2023-09-29 14:11:23,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 14:11:25,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:11:26,646 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:11:30,075 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:11:30,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:11:30,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:11:31,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 14:11:33,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 14:11:34,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:11:36,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:11:40,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:11:40,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:11:40,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:11:40,271 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:11:41,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 14:11:43,459 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:11:51,852 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 14:11:51,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:11:53,992 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 14:11:56,217 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=389686.6666666667, ans=0.95 2023-09-29 14:11:57,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:11:57,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 14:12:00,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:12:05,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:12:08,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:12:15,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 14:12:18,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 14:12:18,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:12:18,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:12:19,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:12:19,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:12:20,143 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=389753.3333333333, ans=0.125 2023-09-29 14:12:22,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 14:12:24,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:12:27,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:12:28,858 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=389820.0, ans=0.125 2023-09-29 14:12:31,531 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 14:12:33,287 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 14:12:36,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:12:39,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:12:40,547 INFO [train.py:1039] (2/4) Epoch 12, batch 50, loss[loss=0.1978, simple_loss=0.2676, pruned_loss=0.06396, over 23684.00 frames. ], tot_loss[loss=0.2006, simple_loss=0.2716, pruned_loss=0.06484, over 1073331.97 frames. ], batch size: 149, lr: 8.84e-03, grad_scale: 16.0 2023-09-29 14:12:42,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:12:42,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 14:12:42,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 14:12:42,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:12:42,546 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=389886.6666666667, ans=0.125 2023-09-29 14:12:44,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:12:46,145 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:12:47,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:12:50,901 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=389886.6666666667, ans=0.0 2023-09-29 14:12:52,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 14:12:52,223 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:12:54,076 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=389886.6666666667, ans=0.2 2023-09-29 14:12:57,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 14:12:58,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 14:13:01,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 14:13:02,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:13:03,075 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=389953.3333333333, ans=0.125 2023-09-29 14:13:04,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:13:04,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:13:05,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:13:07,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 14:13:07,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 14:13:07,265 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:13:13,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:13:15,982 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 14:13:16,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 14:13:17,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 14:13:20,349 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 14:13:20,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 14:13:20,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 14:13:22,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:13:23,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 14:13:33,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:13:33,262 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:13:33,576 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=390086.6666666667, ans=0.0 2023-09-29 14:13:35,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:13:36,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:13:36,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:13:40,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 14:13:40,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 14:13:42,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:13:42,153 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:13:43,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:13:43,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:13:45,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 14:13:45,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 14:13:48,095 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 14:13:49,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:13:49,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 14:13:49,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 14:13:49,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 14:13:51,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:13:53,162 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 2.076e+02 2.460e+02 3.514e+02 7.647e+02, threshold=4.919e+02, percent-clipped=15.0 2023-09-29 14:13:53,315 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 14:13:54,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 14:13:54,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:13:58,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:14:01,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:14:02,862 INFO [train.py:1039] (2/4) Epoch 12, batch 100, loss[loss=0.2808, simple_loss=0.3247, pruned_loss=0.1184, over 19940.00 frames. ], tot_loss[loss=0.2037, simple_loss=0.2738, pruned_loss=0.06684, over 1877488.89 frames. ], batch size: 388, lr: 8.83e-03, grad_scale: 16.0 2023-09-29 14:14:05,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:14:05,440 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=390220.0, ans=0.2 2023-09-29 14:14:06,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 14:14:06,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:14:10,303 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:14:11,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:14:11,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 14:14:11,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:14:11,812 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:14:12,008 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=390220.0, ans=0.125 2023-09-29 14:14:15,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 14:14:16,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 14:14:18,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:14:18,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:14:18,459 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:14:20,375 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=390286.6666666667, ans=0.125 2023-09-29 14:14:23,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 14:14:24,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:14:26,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:14:26,170 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 14:14:28,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 14:14:30,061 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=390286.6666666667, ans=0.125 2023-09-29 14:14:31,487 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 14:14:31,512 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 14:14:34,461 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:14:34,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:14:36,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 14:14:39,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:14:41,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:14:45,904 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=14.73 vs. limit=22.5 2023-09-29 14:14:49,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:14:50,963 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 14:14:53,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 14:14:57,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:14:57,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:15:00,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:15:02,573 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:15:07,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:15:08,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:15:10,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:15:12,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:15:13,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:15:13,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:15:13,625 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:15:13,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 14:15:13,760 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 14:15:13,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:15:14,107 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=390486.6666666667, ans=0.125 2023-09-29 14:15:15,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:15:15,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:15:15,981 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:15:17,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 14:15:17,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 14:15:17,302 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 14:15:17,313 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:15:18,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:15:20,815 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:15:22,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:15:22,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:15:25,206 INFO [train.py:1039] (2/4) Epoch 12, batch 150, loss[loss=0.2237, simple_loss=0.2789, pruned_loss=0.08426, over 23798.00 frames. ], tot_loss[loss=0.2032, simple_loss=0.2731, pruned_loss=0.06668, over 2485482.80 frames. ], batch size: 179, lr: 8.83e-03, grad_scale: 16.0 2023-09-29 14:15:25,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:15:30,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:15:30,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:15:30,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:15:34,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:15:35,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:15:36,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:15:38,231 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:15:43,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 14:15:43,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 14:15:43,423 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 14:15:46,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:15:46,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 14:15:48,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:15:49,578 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:15:49,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:15:49,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:15:49,724 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:15:51,818 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-29 14:15:54,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:16:00,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:16:05,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:16:06,454 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 14:16:09,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:16:09,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:16:09,719 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:16:12,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:16:14,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:16:15,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:16:18,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:16:19,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 14:16:24,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:16:25,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:16:25,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:16:25,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 14:16:29,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:16:30,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 14:16:34,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:16:34,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:16:36,346 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:16:37,589 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.686e+02 1.933e+02 2.162e+02 2.482e+02 3.211e+02, threshold=4.324e+02, percent-clipped=0.0 2023-09-29 14:16:39,756 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:16:41,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 14:16:41,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:16:41,289 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 14:16:44,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:16:44,709 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=390820.0, ans=0.125 2023-09-29 14:16:47,332 INFO [train.py:1039] (2/4) Epoch 12, batch 200, loss[loss=0.2373, simple_loss=0.2928, pruned_loss=0.09092, over 23837.00 frames. ], tot_loss[loss=0.2041, simple_loss=0.274, pruned_loss=0.06711, over 2972601.62 frames. ], batch size: 195, lr: 8.83e-03, grad_scale: 16.0 2023-09-29 14:16:50,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:16:50,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:16:52,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 14:16:53,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:16:53,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:16:57,239 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 14:16:57,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 14:16:58,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:17:00,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:17:02,927 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=390953.3333333333, ans=0.0 2023-09-29 14:17:04,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:17:04,241 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:17:04,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:17:25,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:17:26,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:17:26,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:17:28,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:17:28,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 14:17:30,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 14:17:32,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:17:33,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 14:17:34,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:17:35,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:17:37,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 14:17:39,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 14:17:39,398 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:17:44,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:17:50,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:18:00,188 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:18:00,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:18:07,854 INFO [train.py:1039] (2/4) Epoch 12, batch 250, loss[loss=0.1968, simple_loss=0.2656, pruned_loss=0.06399, over 23571.00 frames. ], tot_loss[loss=0.2027, simple_loss=0.273, pruned_loss=0.06624, over 3369617.25 frames. ], batch size: 149, lr: 8.82e-03, grad_scale: 16.0 2023-09-29 14:18:07,972 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:18:10,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 14:18:11,654 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:18:11,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:18:11,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:18:11,821 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:18:13,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 14:18:13,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:18:15,568 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 14:18:15,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:18:17,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:18:17,394 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=391220.0, ans=0.125 2023-09-29 14:18:17,524 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=391220.0, ans=0.0 2023-09-29 14:18:18,771 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:18:20,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:18:21,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:18:21,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:18:23,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:18:26,083 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=5.25 vs. limit=12.0 2023-09-29 14:18:30,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:18:40,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:18:42,385 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:18:43,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:18:49,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 14:18:50,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 14:18:50,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:18:52,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:18:52,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 14:18:52,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:18:54,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:18:54,536 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 14:18:57,224 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:19:00,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 14:19:00,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:19:04,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:19:04,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:19:04,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:19:05,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:19:05,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 14:19:05,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:19:09,051 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:19:09,217 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:19:10,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:19:14,314 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:19:16,181 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=391486.6666666667, ans=0.07 2023-09-29 14:19:17,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:19:19,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:19:22,364 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.950e+02 2.182e+02 2.665e+02 5.527e+02, threshold=4.363e+02, percent-clipped=2.0 2023-09-29 14:19:24,559 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.23 vs. limit=15.0 2023-09-29 14:19:25,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:19:26,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:19:29,998 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 14:19:31,303 INFO [train.py:1039] (2/4) Epoch 12, batch 300, loss[loss=0.1681, simple_loss=0.2416, pruned_loss=0.04728, over 24317.00 frames. ], tot_loss[loss=0.2011, simple_loss=0.2707, pruned_loss=0.06579, over 3651118.35 frames. ], batch size: 56, lr: 8.82e-03, grad_scale: 16.0 2023-09-29 14:19:31,460 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:19:33,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:19:34,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 14:19:34,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 14:19:35,353 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=391553.3333333333, ans=0.1 2023-09-29 14:19:36,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:19:36,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 14:19:36,790 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=391553.3333333333, ans=0.2 2023-09-29 14:19:36,876 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=391553.3333333333, ans=0.125 2023-09-29 14:19:41,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:19:42,078 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 14:19:43,223 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:19:46,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:19:48,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 14:19:49,898 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:19:52,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 14:19:52,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 14:19:52,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:19:56,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 14:19:59,739 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 14:20:01,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 14:20:02,996 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 14:20:04,378 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:20:04,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:20:09,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:20:09,640 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 14:20:09,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:20:12,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:20:14,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:20:14,762 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:20:19,922 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 14:20:19,929 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 14:20:21,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:20:24,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:20:26,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 14:20:27,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:20:32,698 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:20:34,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:20:34,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 14:20:38,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:20:38,837 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 14:20:40,518 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:20:42,576 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:20:43,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 14:20:44,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 14:20:44,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:20:45,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 14:20:47,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:20:47,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:20:49,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:20:49,474 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=391820.0, ans=0.0 2023-09-29 14:20:49,789 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.05 vs. limit=6.0 2023-09-29 14:20:50,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:20:50,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:20:54,297 INFO [train.py:1039] (2/4) Epoch 12, batch 350, loss[loss=0.1975, simple_loss=0.2673, pruned_loss=0.06381, over 23375.00 frames. ], tot_loss[loss=0.1995, simple_loss=0.269, pruned_loss=0.06503, over 3888006.49 frames. ], batch size: 105, lr: 8.82e-03, grad_scale: 16.0 2023-09-29 14:20:55,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:20:55,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 14:20:59,548 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:21:03,412 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=16.41 vs. limit=15.0 2023-09-29 14:21:07,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:21:07,459 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=391886.6666666667, ans=0.125 2023-09-29 14:21:10,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:21:10,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:21:12,083 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 14:21:13,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:21:15,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 14:21:17,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:21:17,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 14:21:18,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:21:22,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 14:21:23,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:21:24,160 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.73 vs. limit=22.5 2023-09-29 14:21:27,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:21:29,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:21:29,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:21:29,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:21:31,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:21:31,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:21:31,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 14:21:32,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:21:32,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:21:40,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:21:40,665 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:21:40,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:21:42,170 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:21:46,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 14:21:46,902 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:21:52,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:21:52,419 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:21:53,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:21:55,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 14:21:57,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:21:57,141 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 14:22:00,658 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 14:22:00,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:22:05,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:22:05,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 14:22:07,472 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.628e+02 1.933e+02 2.139e+02 2.473e+02 3.749e+02, threshold=4.278e+02, percent-clipped=0.0 2023-09-29 14:22:07,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:22:09,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:22:09,618 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=392153.3333333333, ans=0.04949747468305833 2023-09-29 14:22:10,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:22:10,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:22:10,898 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:22:13,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:22:14,687 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.02 vs. limit=15.0 2023-09-29 14:22:16,742 INFO [train.py:1039] (2/4) Epoch 12, batch 400, loss[loss=0.2074, simple_loss=0.2504, pruned_loss=0.08219, over 19188.00 frames. ], tot_loss[loss=0.1987, simple_loss=0.2681, pruned_loss=0.06465, over 4059016.50 frames. ], batch size: 388, lr: 8.81e-03, grad_scale: 32.0 2023-09-29 14:22:16,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:22:18,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:22:20,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 14:22:20,183 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:22:20,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:22:23,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:22:23,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:22:24,085 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=392220.0, ans=0.5 2023-09-29 14:22:26,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:22:29,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:22:31,322 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 14:22:32,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 14:22:32,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:22:37,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 14:22:37,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:22:40,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:22:40,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:22:40,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 14:22:41,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:22:41,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:22:41,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:22:43,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:22:45,005 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 14:22:45,102 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=392286.6666666667, ans=0.015 2023-09-29 14:22:46,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 14:22:51,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:22:52,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:22:54,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 14:22:55,493 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 14:22:57,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:23:00,777 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:23:01,497 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.66 vs. limit=22.5 2023-09-29 14:23:08,760 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 14:23:12,441 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 14:23:13,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 14:23:15,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:23:18,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:23:18,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 14:23:21,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:23:24,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 14:23:26,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:23:27,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:23:27,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 14:23:29,511 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 14:23:30,303 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=9.05 vs. limit=15.0 2023-09-29 14:23:30,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 14:23:34,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 14:23:34,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:23:34,808 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=392486.6666666667, ans=0.1 2023-09-29 14:23:36,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 14:23:36,281 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=392553.3333333333, ans=0.125 2023-09-29 14:23:37,533 INFO [train.py:1039] (2/4) Epoch 12, batch 450, loss[loss=0.1954, simple_loss=0.2819, pruned_loss=0.05447, over 24616.00 frames. ], tot_loss[loss=0.1979, simple_loss=0.268, pruned_loss=0.06386, over 4218537.80 frames. ], batch size: 73, lr: 8.81e-03, grad_scale: 32.0 2023-09-29 14:23:39,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 14:23:39,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:23:39,236 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 14:23:43,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 14:23:43,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:23:44,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:23:46,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:23:46,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 14:23:47,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:23:48,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 14:23:51,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 14:24:00,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:24:00,649 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:24:02,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 14:24:03,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 14:24:08,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 14:24:11,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:24:12,923 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.38 vs. limit=15.0 2023-09-29 14:24:13,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:24:17,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:24:17,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:24:20,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 14:24:22,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 14:24:23,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 14:24:23,958 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:24:25,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:24:25,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:24:27,086 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 14:24:27,100 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 14:24:27,287 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=392753.3333333333, ans=0.2 2023-09-29 14:24:28,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:24:30,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:24:31,505 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 14:24:34,720 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 14:24:34,789 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:24:36,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 14:24:36,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 14:24:39,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:24:40,834 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 14:24:40,878 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 14:24:42,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 14:24:48,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:24:48,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 14:24:50,217 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.855e+02 2.154e+02 2.453e+02 3.354e+02, threshold=4.308e+02, percent-clipped=0.0 2023-09-29 14:24:50,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 14:24:53,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:24:58,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:24:59,752 INFO [train.py:1039] (2/4) Epoch 12, batch 500, loss[loss=0.2085, simple_loss=0.2748, pruned_loss=0.0711, over 23518.00 frames. ], tot_loss[loss=0.199, simple_loss=0.2693, pruned_loss=0.06432, over 4329700.62 frames. ], batch size: 285, lr: 8.80e-03, grad_scale: 16.0 2023-09-29 14:24:59,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:25:00,535 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.34 vs. limit=15.0 2023-09-29 14:25:01,459 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:25:02,770 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 14:25:07,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:25:07,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:25:08,812 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:25:08,830 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 14:25:10,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 14:25:10,491 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:25:13,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 14:25:13,849 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=392953.3333333333, ans=0.1 2023-09-29 14:25:18,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 14:25:20,069 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 14:25:20,317 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:25:20,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:25:21,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:25:27,532 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.41 vs. limit=22.5 2023-09-29 14:25:29,760 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=393020.0, ans=0.125 2023-09-29 14:25:29,872 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=393020.0, ans=0.125 2023-09-29 14:25:31,814 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=393020.0, ans=0.2 2023-09-29 14:25:33,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:25:33,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 14:25:34,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 14:25:34,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:25:34,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 14:25:34,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 14:25:35,110 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 14:25:37,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:25:38,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:25:38,380 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=393020.0, ans=0.1 2023-09-29 14:25:39,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:25:39,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:25:39,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 14:25:39,853 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=393020.0, ans=0.0 2023-09-29 14:25:43,904 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 14:25:45,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:25:47,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:25:47,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:25:47,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:25:49,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 14:25:50,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 14:25:55,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:25:56,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:26:01,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:26:04,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:26:09,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:26:12,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 14:26:12,605 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:26:12,635 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:26:14,570 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=393153.3333333333, ans=0.125 2023-09-29 14:26:15,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 14:26:17,332 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 14:26:18,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:26:20,267 INFO [train.py:1039] (2/4) Epoch 12, batch 550, loss[loss=0.2829, simple_loss=0.3281, pruned_loss=0.1188, over 19645.00 frames. ], tot_loss[loss=0.2008, simple_loss=0.2711, pruned_loss=0.06528, over 4414762.74 frames. ], batch size: 389, lr: 8.80e-03, grad_scale: 16.0 2023-09-29 14:26:22,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 14:26:25,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 14:26:25,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:26:25,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 14:26:27,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:26:27,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:26:28,663 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:26:28,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:26:28,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:26:28,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:26:32,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:26:32,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 14:26:34,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:26:39,586 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:26:39,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:26:42,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:26:44,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:26:49,208 WARNING [train.py:1197] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 14:26:50,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 14:26:52,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:26:56,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:26:56,922 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:26:59,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 14:27:03,522 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:27:03,531 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 14:27:03,664 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:27:05,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 14:27:07,410 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:27:08,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 14:27:08,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 14:27:10,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:27:11,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 14:27:12,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 14:27:13,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:27:13,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:27:13,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:27:13,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:27:16,809 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=393420.0, ans=0.1 2023-09-29 14:27:17,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:27:19,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:27:21,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:27:22,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:27:22,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 14:27:24,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 14:27:25,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:27:27,529 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 14:27:29,744 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:27:29,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 14:27:31,314 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 14:27:34,358 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 2.077e+02 2.338e+02 2.752e+02 4.124e+02, threshold=4.676e+02, percent-clipped=0.0 2023-09-29 14:27:34,854 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=393486.6666666667, ans=0.125 2023-09-29 14:27:37,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 14:27:41,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 14:27:43,115 INFO [train.py:1039] (2/4) Epoch 12, batch 600, loss[loss=0.182, simple_loss=0.2664, pruned_loss=0.04876, over 24655.00 frames. ], tot_loss[loss=0.1999, simple_loss=0.2706, pruned_loss=0.06458, over 4493474.17 frames. ], batch size: 68, lr: 8.80e-03, grad_scale: 16.0 2023-09-29 14:27:43,268 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:27:44,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 14:27:44,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:27:51,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:27:54,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 14:27:55,664 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 14:27:58,574 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 14:28:00,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:28:01,928 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:28:05,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 14:28:05,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:28:10,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 14:28:13,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:28:13,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:28:13,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:28:19,165 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=393686.6666666667, ans=0.0 2023-09-29 14:28:19,289 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=393686.6666666667, ans=0.0 2023-09-29 14:28:20,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:28:20,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:28:22,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:28:22,461 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=393686.6666666667, ans=0.0 2023-09-29 14:28:24,051 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=393686.6666666667, ans=0.125 2023-09-29 14:28:28,293 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 14:28:32,999 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:28:33,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:28:33,020 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:28:35,498 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.32 vs. limit=15.0 2023-09-29 14:28:41,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 14:28:42,281 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=14.39 vs. limit=22.5 2023-09-29 14:28:46,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 14:28:46,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:28:52,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 14:28:52,369 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=393820.0, ans=0.125 2023-09-29 14:28:53,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:28:56,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 14:28:56,885 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:28:58,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:29:02,099 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.17 vs. limit=10.0 2023-09-29 14:29:03,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 14:29:04,489 INFO [train.py:1039] (2/4) Epoch 12, batch 650, loss[loss=0.1915, simple_loss=0.2323, pruned_loss=0.07532, over 19517.00 frames. ], tot_loss[loss=0.1993, simple_loss=0.2696, pruned_loss=0.06448, over 4536684.20 frames. ], batch size: 388, lr: 8.79e-03, grad_scale: 8.0 2023-09-29 14:29:04,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 14:29:07,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:29:10,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:29:12,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:29:13,074 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.80 vs. limit=15.0 2023-09-29 14:29:15,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 14:29:16,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:29:22,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:29:22,821 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:29:23,281 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=393953.3333333333, ans=0.125 2023-09-29 14:29:26,658 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:29:29,700 WARNING [train.py:1197] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 14:29:31,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:29:32,757 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:29:35,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:29:35,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 14:29:39,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:29:39,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:29:40,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 14:29:40,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:29:42,503 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 14:29:44,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 14:29:45,575 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 14:29:45,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:29:45,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:29:49,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:29:49,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:29:49,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:29:51,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:29:51,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 14:29:53,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:29:54,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:29:55,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 14:29:57,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:29:58,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 14:30:00,485 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 14:30:00,799 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=394086.6666666667, ans=0.0 2023-09-29 14:30:02,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 14:30:02,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:30:02,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:30:03,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:30:03,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:30:05,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:30:12,554 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:30:12,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:30:14,056 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:30:17,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:30:17,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 14:30:17,325 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:30:20,381 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.997e+02 2.199e+02 2.485e+02 3.515e+02, threshold=4.397e+02, percent-clipped=0.0 2023-09-29 14:30:26,960 INFO [train.py:1039] (2/4) Epoch 12, batch 700, loss[loss=0.2089, simple_loss=0.3, pruned_loss=0.05891, over 24661.00 frames. ], tot_loss[loss=0.1989, simple_loss=0.2693, pruned_loss=0.06424, over 4576500.89 frames. ], batch size: 73, lr: 8.79e-03, grad_scale: 8.0 2023-09-29 14:30:27,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 14:30:27,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:30:27,149 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:30:27,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:30:32,663 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 14:30:34,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 14:30:36,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 14:30:37,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:30:39,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:30:42,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 14:30:46,936 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:30:49,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:30:51,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:30:52,011 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=394286.6666666667, ans=0.2 2023-09-29 14:30:53,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 14:30:53,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:30:57,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:30:59,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 14:30:59,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:31:03,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 14:31:05,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 14:31:07,574 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=394353.3333333333, ans=0.09899494936611666 2023-09-29 14:31:08,795 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 14:31:09,050 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=394353.3333333333, ans=0.125 2023-09-29 14:31:10,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:31:12,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:31:16,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:31:18,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 14:31:21,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:31:21,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:31:21,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 14:31:25,831 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.01 vs. limit=6.0 2023-09-29 14:31:26,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:31:26,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:31:29,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:31:33,477 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=394486.6666666667, ans=0.1 2023-09-29 14:31:36,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:31:36,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 14:31:40,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 14:31:41,537 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 14:31:43,231 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=394486.6666666667, ans=0.0 2023-09-29 14:31:44,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:31:46,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:31:46,928 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:31:48,555 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:31:48,565 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 14:31:49,996 INFO [train.py:1039] (2/4) Epoch 12, batch 750, loss[loss=0.1914, simple_loss=0.2723, pruned_loss=0.0553, over 24290.00 frames. ], tot_loss[loss=0.1986, simple_loss=0.2687, pruned_loss=0.06432, over 4608113.32 frames. ], batch size: 74, lr: 8.79e-03, grad_scale: 8.0 2023-09-29 14:31:53,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 14:31:53,297 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 14:31:53,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 14:31:54,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 14:31:56,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 14:31:56,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:31:56,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 14:31:57,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:31:59,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 14:32:00,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:32:02,437 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:32:03,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 14:32:05,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:32:08,130 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:32:10,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 14:32:11,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:32:13,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:32:13,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:32:15,136 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 14:32:16,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:32:18,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:32:20,713 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:32:23,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 14:32:25,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 14:32:25,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:32:25,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 14:32:25,643 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 14:32:27,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 14:32:27,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:32:27,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 14:32:28,906 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=394686.6666666667, ans=0.2 2023-09-29 14:32:30,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:32:30,545 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=394686.6666666667, ans=0.125 2023-09-29 14:32:34,131 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys.whitening_limit, batch_count=394686.6666666667, ans=6.0 2023-09-29 14:32:37,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 14:32:37,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:32:37,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 14:32:40,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:32:41,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:32:41,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 14:32:43,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:32:45,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 14:32:47,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:32:49,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:32:50,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 14:32:50,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:32:51,403 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=13.84 vs. limit=15.0 2023-09-29 14:32:55,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:32:57,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:32:57,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:33:00,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:33:02,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 14:33:02,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:33:03,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:33:05,550 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:33:05,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:33:06,846 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 2.038e+02 2.318e+02 2.858e+02 4.234e+02, threshold=4.635e+02, percent-clipped=0.0 2023-09-29 14:33:09,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:33:09,962 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 14:33:13,485 INFO [train.py:1039] (2/4) Epoch 12, batch 800, loss[loss=0.2039, simple_loss=0.2774, pruned_loss=0.06518, over 23454.00 frames. ], tot_loss[loss=0.1986, simple_loss=0.2692, pruned_loss=0.06404, over 4631402.11 frames. ], batch size: 120, lr: 8.78e-03, grad_scale: 16.0 2023-09-29 14:33:22,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:33:22,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:33:23,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:33:23,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:33:24,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=394886.6666666667, ans=0.0 2023-09-29 14:33:25,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:33:26,969 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:33:27,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:33:32,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:33:32,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 14:33:32,579 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=394953.3333333333, ans=0.125 2023-09-29 14:33:34,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 14:33:35,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:33:36,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:33:36,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:33:37,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:33:38,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 14:33:38,534 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:33:38,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 14:33:40,276 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=394953.3333333333, ans=0.0 2023-09-29 14:33:43,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:33:46,209 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:33:46,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:33:48,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:33:52,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:33:52,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:33:57,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:33:57,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:33:58,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 14:34:00,355 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 14:34:00,388 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 14:34:00,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 14:34:00,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:34:03,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:34:03,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:34:07,232 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 14:34:08,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 14:34:08,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 14:34:11,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:34:15,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:34:15,408 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=395086.6666666667, ans=0.0 2023-09-29 14:34:18,265 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:34:18,450 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=395153.3333333333, ans=0.125 2023-09-29 14:34:18,472 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=395153.3333333333, ans=0.95 2023-09-29 14:34:21,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 14:34:21,941 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:34:23,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=395153.3333333333, ans=0.125 2023-09-29 14:34:25,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 14:34:33,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 14:34:36,928 INFO [train.py:1039] (2/4) Epoch 12, batch 850, loss[loss=0.1847, simple_loss=0.2531, pruned_loss=0.05815, over 18739.00 frames. ], tot_loss[loss=0.2002, simple_loss=0.271, pruned_loss=0.06471, over 4649720.64 frames. ], batch size: 40, lr: 8.78e-03, grad_scale: 16.0 2023-09-29 14:34:37,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:34:37,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 14:34:37,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:34:37,261 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:34:40,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 14:34:40,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:34:40,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:34:42,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:34:45,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 14:34:47,020 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:34:48,469 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 14:34:48,534 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 14:34:48,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 14:34:50,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 14:34:50,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:34:53,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:34:53,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:34:53,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:35:00,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:35:00,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:35:00,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 14:35:06,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 14:35:09,234 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:35:10,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 14:35:15,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 14:35:16,735 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 14:35:18,879 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 14:35:18,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:35:18,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:35:20,297 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 14:35:21,959 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:35:23,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:35:24,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 14:35:26,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:35:26,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:35:28,073 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:35:28,105 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 14:35:31,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:35:32,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 14:35:32,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 14:35:32,907 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=395420.0, ans=0.125 2023-09-29 14:35:37,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:35:37,774 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:35:38,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:35:38,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:35:39,140 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.35 vs. limit=15.0 2023-09-29 14:35:40,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:35:40,436 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=395420.0, ans=0.0 2023-09-29 14:35:40,549 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=395420.0, ans=0.125 2023-09-29 14:35:43,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:35:44,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:35:46,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 14:35:48,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:35:48,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 14:35:53,429 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 1.869e+02 2.098e+02 2.387e+02 5.753e+02, threshold=4.196e+02, percent-clipped=1.0 2023-09-29 14:35:53,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 14:35:56,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:35:56,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 14:35:57,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:35:57,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:35:59,362 INFO [train.py:1039] (2/4) Epoch 12, batch 900, loss[loss=0.2641, simple_loss=0.3164, pruned_loss=0.1059, over 19417.00 frames. ], tot_loss[loss=0.2018, simple_loss=0.2727, pruned_loss=0.06547, over 4654531.74 frames. ], batch size: 389, lr: 8.77e-03, grad_scale: 16.0 2023-09-29 14:36:00,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 14:36:07,853 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:36:08,201 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=395553.3333333333, ans=0.125 2023-09-29 14:36:11,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:36:11,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 14:36:13,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:36:14,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 14:36:16,254 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 14:36:16,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:36:16,432 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:36:16,496 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 14:36:17,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:36:24,693 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=395620.0, ans=0.0 2023-09-29 14:36:27,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:36:27,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:36:27,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 14:36:31,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:36:36,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 14:36:40,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:36:41,351 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.35 vs. limit=15.0 2023-09-29 14:36:43,017 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=395686.6666666667, ans=0.125 2023-09-29 14:36:44,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 14:36:46,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 14:36:47,922 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 14:36:49,390 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 14:36:54,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 14:36:55,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:36:55,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 14:36:55,963 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=395753.3333333333, ans=0.125 2023-09-29 14:37:00,594 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:37:00,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:37:03,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 14:37:04,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:37:04,440 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=395820.0, ans=0.125 2023-09-29 14:37:07,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 14:37:10,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:37:10,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:37:10,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:37:11,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:37:15,720 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 14:37:15,768 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 14:37:18,003 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 14:37:18,176 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=395820.0, ans=0.05 2023-09-29 14:37:19,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 14:37:20,764 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:37:22,865 INFO [train.py:1039] (2/4) Epoch 12, batch 950, loss[loss=0.2137, simple_loss=0.2753, pruned_loss=0.07602, over 23758.00 frames. ], tot_loss[loss=0.2021, simple_loss=0.2733, pruned_loss=0.06548, over 4661789.36 frames. ], batch size: 164, lr: 8.77e-03, grad_scale: 16.0 2023-09-29 14:37:24,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 14:37:29,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:37:32,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:37:32,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:37:32,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 14:37:35,477 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 14:37:39,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:37:39,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:37:39,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:37:40,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:37:40,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 14:37:42,416 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 14:37:44,049 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=395953.3333333333, ans=0.015 2023-09-29 14:37:46,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:37:48,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 14:37:49,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:37:52,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:37:53,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:37:53,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:37:54,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 14:37:56,792 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 14:37:58,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:37:59,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:38:04,635 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:38:04,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:38:09,147 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 14:38:12,042 WARNING [train.py:1197] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 14:38:12,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 14:38:12,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:38:13,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:38:13,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:38:17,711 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=396086.6666666667, ans=0.0 2023-09-29 14:38:19,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 14:38:19,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:38:22,580 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:38:22,681 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:38:24,720 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 14:38:24,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:38:24,765 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:38:26,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 14:38:26,454 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=396086.6666666667, ans=0.125 2023-09-29 14:38:27,927 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=396153.3333333333, ans=0.125 2023-09-29 14:38:30,081 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 14:38:32,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:38:34,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:38:34,565 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=396153.3333333333, ans=0.0 2023-09-29 14:38:35,191 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.22 vs. limit=15.0 2023-09-29 14:38:38,827 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.985e+02 2.175e+02 2.387e+02 3.582e+02, threshold=4.351e+02, percent-clipped=0.0 2023-09-29 14:38:39,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:38:42,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 14:38:42,071 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 14:38:44,930 INFO [train.py:1039] (2/4) Epoch 12, batch 1000, loss[loss=0.1992, simple_loss=0.2836, pruned_loss=0.0574, over 24524.00 frames. ], tot_loss[loss=0.2012, simple_loss=0.2721, pruned_loss=0.06514, over 4681132.58 frames. ], batch size: 71, lr: 8.77e-03, grad_scale: 16.0 2023-09-29 14:38:46,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:38:48,382 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 14:38:49,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:38:55,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:38:57,205 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 14:38:57,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 14:39:04,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:39:04,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:39:04,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:39:09,129 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 14:39:09,554 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=396286.6666666667, ans=0.1 2023-09-29 14:39:10,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 14:39:13,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 14:39:14,481 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.36 vs. limit=15.0 2023-09-29 14:39:15,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:39:15,448 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 14:39:17,045 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 14:39:17,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 14:39:17,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:39:18,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:39:28,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:39:30,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:39:31,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:39:32,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:39:32,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 14:39:32,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:39:33,984 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:39:34,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:39:35,642 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 14:39:40,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 14:39:40,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 14:39:40,696 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=396420.0, ans=0.125 2023-09-29 14:39:42,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 14:39:44,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:39:52,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:39:52,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:39:52,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:39:52,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:39:55,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 14:39:56,797 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:39:56,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 14:39:58,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 14:40:00,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:40:00,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:40:02,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:40:02,478 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=396486.6666666667, ans=0.125 2023-09-29 14:40:04,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 14:40:07,494 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:40:08,957 INFO [train.py:1039] (2/4) Epoch 12, batch 1050, loss[loss=0.1923, simple_loss=0.2484, pruned_loss=0.06812, over 22884.00 frames. ], tot_loss[loss=0.1996, simple_loss=0.27, pruned_loss=0.06462, over 4683265.71 frames. ], batch size: 322, lr: 8.76e-03, grad_scale: 16.0 2023-09-29 14:40:10,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:40:12,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:40:13,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 14:40:15,879 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:40:16,197 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=396553.3333333333, ans=0.125 2023-09-29 14:40:17,599 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=396553.3333333333, ans=0.1 2023-09-29 14:40:18,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:40:21,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 14:40:22,113 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=396553.3333333333, ans=0.125 2023-09-29 14:40:23,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 14:40:24,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:40:26,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:40:26,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 14:40:28,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:40:29,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 14:40:29,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:40:30,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 14:40:34,646 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:40:34,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 14:40:34,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 14:40:43,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:40:43,776 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.79 vs. limit=15.0 2023-09-29 14:40:44,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 14:40:44,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:40:46,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 14:40:47,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 14:40:47,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:40:51,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 14:40:55,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 14:40:55,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:40:58,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 14:41:00,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 14:41:00,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:41:00,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:41:05,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:41:08,873 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 14:41:09,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 14:41:10,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 14:41:10,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:41:10,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:41:14,073 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 14:41:17,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:41:20,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:41:20,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:41:22,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:41:22,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:41:24,160 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.513e+02 1.921e+02 2.123e+02 2.375e+02 5.047e+02, threshold=4.247e+02, percent-clipped=1.0 2023-09-29 14:41:26,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:41:26,031 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 14:41:28,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:41:28,993 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 14:41:29,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 14:41:29,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:41:30,456 INFO [train.py:1039] (2/4) Epoch 12, batch 1100, loss[loss=0.2098, simple_loss=0.2901, pruned_loss=0.06471, over 24536.00 frames. ], tot_loss[loss=0.1973, simple_loss=0.2682, pruned_loss=0.06322, over 4700037.89 frames. ], batch size: 71, lr: 8.76e-03, grad_scale: 16.0 2023-09-29 14:41:33,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:41:38,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:41:43,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:41:43,868 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=396886.6666666667, ans=0.0 2023-09-29 14:41:45,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:41:46,409 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:41:46,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 14:41:48,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:41:50,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 14:41:54,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:41:57,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:41:57,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 14:41:59,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 14:41:59,204 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:41:59,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:42:02,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:42:04,027 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 14:42:08,650 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:42:11,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 14:42:11,989 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 14:42:13,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:42:15,097 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=397020.0, ans=0.2 2023-09-29 14:42:16,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:42:16,622 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=397020.0, ans=0.1 2023-09-29 14:42:17,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 14:42:17,966 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:42:19,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 14:42:21,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:42:21,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:42:21,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:42:21,919 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=397086.6666666667, ans=0.0 2023-09-29 14:42:23,159 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:42:23,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 14:42:30,073 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:42:30,394 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 14:42:31,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 14:42:33,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:42:34,239 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.11 vs. limit=15.0 2023-09-29 14:42:38,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 14:42:41,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 14:42:41,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 14:42:41,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:42:42,371 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.65 vs. limit=15.0 2023-09-29 14:42:44,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:42:44,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:42:44,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 14:42:44,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:42:44,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:42:46,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 14:42:46,413 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:42:47,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 14:42:48,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:42:48,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:42:48,173 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=397153.3333333333, ans=0.2 2023-09-29 14:42:49,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 14:42:52,847 INFO [train.py:1039] (2/4) Epoch 12, batch 1150, loss[loss=0.2681, simple_loss=0.3112, pruned_loss=0.1125, over 19520.00 frames. ], tot_loss[loss=0.1971, simple_loss=0.2682, pruned_loss=0.06298, over 4705913.64 frames. ], batch size: 388, lr: 8.76e-03, grad_scale: 16.0 2023-09-29 14:42:57,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:43:01,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:43:03,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:43:03,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:43:03,303 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 14:43:03,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:43:06,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 14:43:08,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:43:08,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 14:43:14,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 14:43:15,749 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:43:20,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:43:21,836 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:43:21,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 14:43:21,924 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 14:43:21,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:43:25,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 14:43:27,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:43:29,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:43:39,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:43:41,577 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=397420.0, ans=0.125 2023-09-29 14:43:43,140 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=397420.0, ans=0.1 2023-09-29 14:43:44,563 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=397420.0, ans=0.1 2023-09-29 14:43:45,767 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:43:45,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 14:43:47,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:43:47,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:43:52,089 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=397420.0, ans=0.1 2023-09-29 14:43:53,706 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=397420.0, ans=0.0 2023-09-29 14:43:54,852 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 14:43:56,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:44:03,156 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 14:44:08,546 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.949e+02 2.168e+02 2.522e+02 3.297e+02, threshold=4.336e+02, percent-clipped=0.0 2023-09-29 14:44:08,698 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:44:08,838 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:44:08,875 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 14:44:10,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:44:14,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:44:14,393 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=397553.3333333333, ans=0.125 2023-09-29 14:44:15,309 INFO [train.py:1039] (2/4) Epoch 12, batch 1200, loss[loss=0.223, simple_loss=0.2968, pruned_loss=0.07459, over 23989.00 frames. ], tot_loss[loss=0.1979, simple_loss=0.2688, pruned_loss=0.06345, over 4722486.83 frames. ], batch size: 80, lr: 8.75e-03, grad_scale: 32.0 2023-09-29 14:44:19,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:44:19,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 14:44:21,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:44:21,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:44:23,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:44:24,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:44:26,245 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 14:44:29,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:44:29,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:44:32,231 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 14:44:32,711 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=397620.0, ans=0.04949747468305833 2023-09-29 14:44:34,279 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=397620.0, ans=0.2 2023-09-29 14:44:35,909 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 14:44:39,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 14:44:42,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:44:44,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:44:46,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:44:46,545 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 14:44:46,804 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=397686.6666666667, ans=0.125 2023-09-29 14:44:48,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:44:54,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 14:44:54,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:44:55,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 14:44:55,923 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:45:00,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 14:45:03,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 14:45:05,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:45:05,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:45:06,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:45:06,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 14:45:08,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:45:09,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:45:10,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:45:11,804 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 14:45:11,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 14:45:11,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:45:11,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 14:45:14,125 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:45:14,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:45:21,095 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 14:45:22,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:45:25,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 14:45:28,248 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.92 vs. limit=22.5 2023-09-29 14:45:30,627 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 14:45:30,803 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:45:33,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:45:35,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:45:36,978 INFO [train.py:1039] (2/4) Epoch 12, batch 1250, loss[loss=0.2189, simple_loss=0.2781, pruned_loss=0.07986, over 23567.00 frames. ], tot_loss[loss=0.1981, simple_loss=0.2695, pruned_loss=0.06338, over 4731082.79 frames. ], batch size: 256, lr: 8.75e-03, grad_scale: 32.0 2023-09-29 14:45:37,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:45:40,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 14:45:43,398 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=397886.6666666667, ans=0.125 2023-09-29 14:45:46,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:45:48,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:45:48,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 14:45:51,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:45:52,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 14:45:57,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 14:45:58,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:45:58,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:45:58,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:46:01,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 14:46:05,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 14:46:05,058 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 14:46:05,075 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:46:07,843 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:46:07,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:46:11,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:46:12,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 14:46:12,764 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=398020.0, ans=0.125 2023-09-29 14:46:17,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 14:46:19,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:46:22,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:46:22,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 14:46:22,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:46:22,570 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 14:46:23,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:46:23,978 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:46:29,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:46:32,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:46:32,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:46:34,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 14:46:34,533 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 14:46:34,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 14:46:39,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:46:39,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 14:46:39,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:46:42,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 14:46:42,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:46:43,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 14:46:45,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 14:46:45,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:46:46,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 14:46:46,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:46:48,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 14:46:50,074 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:46:52,057 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 2.018e+02 2.304e+02 2.594e+02 4.435e+02, threshold=4.607e+02, percent-clipped=1.0 2023-09-29 14:46:52,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:46:53,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:46:55,867 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.74 vs. limit=10.0 2023-09-29 14:46:57,304 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 14:46:58,642 INFO [train.py:1039] (2/4) Epoch 12, batch 1300, loss[loss=0.1983, simple_loss=0.2524, pruned_loss=0.07209, over 22663.00 frames. ], tot_loss[loss=0.199, simple_loss=0.2701, pruned_loss=0.0639, over 4723825.16 frames. ], batch size: 322, lr: 8.75e-03, grad_scale: 32.0 2023-09-29 14:47:02,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:47:02,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 14:47:05,650 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:47:07,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 14:47:08,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:47:10,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:47:11,762 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:47:13,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 14:47:19,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 14:47:20,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 14:47:22,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 14:47:27,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 14:47:27,543 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=398286.6666666667, ans=0.125 2023-09-29 14:47:30,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:47:30,663 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:47:31,439 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.43 vs. limit=15.0 2023-09-29 14:47:33,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:47:36,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:47:38,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 14:47:38,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 14:47:38,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 14:47:43,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:47:43,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 14:47:44,757 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 14:47:46,234 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 14:47:47,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:47:50,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:47:50,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 14:47:52,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:47:52,341 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 14:47:53,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:47:55,688 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=398420.0, ans=0.0 2023-09-29 14:47:59,074 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:47:59,078 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:48:02,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 14:48:03,715 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 14:48:05,281 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 14:48:12,034 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:48:14,180 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 14:48:15,767 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:48:20,327 INFO [train.py:1039] (2/4) Epoch 12, batch 1350, loss[loss=0.2291, simple_loss=0.3022, pruned_loss=0.078, over 23671.00 frames. ], tot_loss[loss=0.1983, simple_loss=0.2694, pruned_loss=0.0636, over 4732613.11 frames. ], batch size: 85, lr: 8.74e-03, grad_scale: 32.0 2023-09-29 14:48:20,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 14:48:23,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:48:25,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:48:30,515 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:48:30,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:48:32,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:48:33,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:48:36,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:48:38,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 14:48:40,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 14:48:42,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:48:45,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 14:48:45,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:48:47,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:48:47,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 14:48:47,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 14:48:48,312 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=398620.0, ans=0.125 2023-09-29 14:48:49,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 14:48:52,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:48:52,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 14:49:05,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:49:16,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:49:16,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:49:16,823 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 14:49:21,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:49:21,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 14:49:21,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 14:49:23,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:49:25,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:49:27,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 14:49:28,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:49:33,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 14:49:35,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 14:49:36,889 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 1.916e+02 2.100e+02 2.366e+02 3.252e+02, threshold=4.201e+02, percent-clipped=0.0 2023-09-29 14:49:38,822 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=398820.0, ans=0.04949747468305833 2023-09-29 14:49:42,904 INFO [train.py:1039] (2/4) Epoch 12, batch 1400, loss[loss=0.179, simple_loss=0.2192, pruned_loss=0.06941, over 19331.00 frames. ], tot_loss[loss=0.1973, simple_loss=0.2676, pruned_loss=0.06352, over 4715431.82 frames. ], batch size: 388, lr: 8.74e-03, grad_scale: 32.0 2023-09-29 14:49:43,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 14:49:45,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:49:48,152 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:49:49,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:49:55,478 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 14:49:55,935 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=398886.6666666667, ans=0.2 2023-09-29 14:49:56,196 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.38 vs. limit=22.5 2023-09-29 14:49:57,048 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 14:50:07,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 14:50:08,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:50:11,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:50:11,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 14:50:14,950 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=399020.0, ans=0.125 2023-09-29 14:50:15,577 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=7.00 vs. limit=15.0 2023-09-29 14:50:16,120 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:50:18,048 WARNING [train.py:1197] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 14:50:27,399 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:50:29,609 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:50:33,485 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=399086.6666666667, ans=0.0 2023-09-29 14:50:34,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 14:50:34,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:50:34,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:50:36,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:50:37,640 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:50:39,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:50:39,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:50:39,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:50:41,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 14:50:41,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:50:41,704 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=399086.6666666667, ans=0.0 2023-09-29 14:50:47,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:50:50,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:50:57,452 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 14:50:58,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 14:51:00,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:51:02,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 14:51:02,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:51:04,523 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=399220.0, ans=0.125 2023-09-29 14:51:05,571 INFO [train.py:1039] (2/4) Epoch 12, batch 1450, loss[loss=0.187, simple_loss=0.2717, pruned_loss=0.05113, over 24634.00 frames. ], tot_loss[loss=0.197, simple_loss=0.2674, pruned_loss=0.06331, over 4723113.24 frames. ], batch size: 68, lr: 8.74e-03, grad_scale: 32.0 2023-09-29 14:51:05,712 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:51:09,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 14:51:12,230 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:51:12,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:51:12,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 14:51:17,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:51:18,862 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 14:51:20,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:51:20,471 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 14:51:22,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 14:51:22,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 14:51:23,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:51:23,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:51:23,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 14:51:24,100 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=399286.6666666667, ans=0.0 2023-09-29 14:51:27,598 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:51:28,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 14:51:29,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 14:51:29,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:51:29,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:51:30,792 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:51:33,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:51:39,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:51:39,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:51:42,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:51:43,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:51:44,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:51:45,908 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:51:45,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:51:46,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:51:49,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 14:51:53,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:51:56,338 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 14:51:57,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:51:59,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 14:52:01,665 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:52:02,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 14:52:07,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:52:09,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 14:52:10,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 14:52:12,311 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:52:14,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:52:14,838 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:52:17,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 14:52:18,750 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.44 vs. limit=15.0 2023-09-29 14:52:19,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 14:52:19,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 14:52:21,410 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:52:22,741 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 2.049e+02 2.244e+02 2.638e+02 4.746e+02, threshold=4.488e+02, percent-clipped=1.0 2023-09-29 14:52:24,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 14:52:29,522 INFO [train.py:1039] (2/4) Epoch 12, batch 1500, loss[loss=0.1868, simple_loss=0.2686, pruned_loss=0.0525, over 24650.00 frames. ], tot_loss[loss=0.1974, simple_loss=0.2681, pruned_loss=0.0634, over 4729807.72 frames. ], batch size: 65, lr: 8.73e-03, grad_scale: 32.0 2023-09-29 14:52:37,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 14:52:37,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:52:37,908 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:52:39,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:52:40,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:52:40,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:52:41,732 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=7.31 vs. limit=15.0 2023-09-29 14:52:42,257 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 14:52:42,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:52:43,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 14:52:43,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:52:43,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:52:45,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:52:47,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:52:53,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:52:53,960 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 14:52:54,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 14:52:56,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:52:57,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:53:02,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 14:53:05,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 14:53:07,381 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:53:07,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 14:53:10,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 14:53:13,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:53:13,846 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:53:13,878 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:53:15,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 14:53:16,785 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:53:16,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:53:16,905 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 14:53:17,395 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.36 vs. limit=6.0 2023-09-29 14:53:18,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:53:23,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:53:23,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 14:53:28,399 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:53:30,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 14:53:35,691 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 14:53:35,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:53:37,070 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 14:53:38,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:53:38,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:53:40,143 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 14:53:41,787 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 14:53:44,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 14:53:45,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:53:47,351 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=399820.0, ans=0.2 2023-09-29 14:53:47,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=399820.0, ans=0.0 2023-09-29 14:53:50,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:53:50,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:53:50,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:53:51,394 INFO [train.py:1039] (2/4) Epoch 12, batch 1550, loss[loss=0.2069, simple_loss=0.2676, pruned_loss=0.0731, over 23721.00 frames. ], tot_loss[loss=0.1974, simple_loss=0.2685, pruned_loss=0.06321, over 4724130.83 frames. ], batch size: 232, lr: 8.73e-03, grad_scale: 16.0 2023-09-29 14:53:51,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:53:51,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:53:53,839 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 14:53:54,150 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=399886.6666666667, ans=0.0 2023-09-29 14:53:55,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 14:53:55,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:53:56,944 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 14:53:57,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 14:54:00,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:54:00,314 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:54:01,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:54:01,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:54:03,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:54:03,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:54:07,894 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 14:54:07,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:54:07,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 14:54:10,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:54:12,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 14:54:12,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 14:54:15,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:54:15,243 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 14:54:15,543 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=399953.3333333333, ans=0.1 2023-09-29 14:54:16,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 14:54:16,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 14:54:16,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:54:16,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:54:26,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:54:28,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 14:54:28,077 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 14:54:31,498 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=400020.0, ans=0.1 2023-09-29 14:54:36,540 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=400020.0, ans=0.0 2023-09-29 14:54:37,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:54:38,032 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=400020.0, ans=0.2 2023-09-29 14:54:38,066 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=400020.0, ans=0.125 2023-09-29 14:54:40,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:54:42,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:54:42,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:54:42,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 14:54:49,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 14:54:50,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:54:53,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:54:55,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:54:55,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:54:55,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 14:54:57,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 14:54:59,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:54:59,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:54:59,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 14:54:59,217 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 14:55:02,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:55:02,758 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=400153.3333333333, ans=0.125 2023-09-29 14:55:09,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 14:55:11,985 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.422e+02 2.047e+02 2.446e+02 2.941e+02 5.003e+02, threshold=4.892e+02, percent-clipped=3.0 2023-09-29 14:55:12,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:55:13,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:55:13,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 14:55:16,841 INFO [train.py:1039] (2/4) Epoch 12, batch 1600, loss[loss=0.2446, simple_loss=0.3012, pruned_loss=0.09396, over 22747.00 frames. ], tot_loss[loss=0.1993, simple_loss=0.2701, pruned_loss=0.06422, over 4719190.18 frames. ], batch size: 322, lr: 8.72e-03, grad_scale: 32.0 2023-09-29 14:55:16,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 14:55:18,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:55:18,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:55:18,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:55:19,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:55:24,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:55:24,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 14:55:26,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 14:55:29,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 14:55:31,129 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:55:33,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 14:55:34,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:55:35,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:55:41,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:55:44,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 14:55:48,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:55:49,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 14:55:50,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:55:50,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 14:55:53,055 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=13.22 vs. limit=22.5 2023-09-29 14:55:57,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 14:56:04,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:56:04,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 14:56:06,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:56:06,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:56:06,269 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:56:09,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 14:56:13,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 14:56:13,315 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:56:13,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:56:14,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:56:16,321 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:56:16,618 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=400420.0, ans=0.0 2023-09-29 14:56:17,964 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:56:18,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:56:19,671 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:56:22,783 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 14:56:27,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:56:27,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:56:30,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 14:56:30,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:56:30,998 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 14:56:33,764 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.35 vs. limit=10.0 2023-09-29 14:56:36,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:56:39,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:56:39,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:56:41,162 INFO [train.py:1039] (2/4) Epoch 12, batch 1650, loss[loss=0.2539, simple_loss=0.3083, pruned_loss=0.09979, over 19470.00 frames. ], tot_loss[loss=0.1998, simple_loss=0.2705, pruned_loss=0.06451, over 4720508.12 frames. ], batch size: 389, lr: 8.72e-03, grad_scale: 16.0 2023-09-29 14:56:41,221 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 14:56:41,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 14:56:41,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 14:56:41,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 14:56:43,923 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=400553.3333333333, ans=0.1 2023-09-29 14:56:46,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:56:48,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:56:48,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:56:48,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 14:56:51,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:56:55,425 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 14:56:57,080 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:56:57,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:56:57,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:56:57,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:56:58,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 14:56:58,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 14:56:59,480 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=6.71 vs. limit=12.0 2023-09-29 14:57:06,808 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 14:57:07,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 14:57:08,022 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.36 vs. limit=12.0 2023-09-29 14:57:16,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 14:57:19,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:57:22,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 14:57:25,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:57:28,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:57:29,006 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.98 vs. limit=10.0 2023-09-29 14:57:29,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:57:29,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:57:31,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:57:31,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:57:34,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:57:36,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:57:36,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:57:36,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:57:36,647 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=400753.3333333333, ans=0.125 2023-09-29 14:57:37,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:57:39,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 14:57:42,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:57:44,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 14:57:46,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:57:47,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 14:57:47,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 14:57:47,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 14:57:47,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:57:49,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:57:50,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:57:51,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:57:51,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 14:57:51,851 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=400820.0, ans=0.125 2023-09-29 14:57:56,068 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.80 vs. limit=15.0 2023-09-29 14:57:56,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:57:58,391 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:57:58,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:57:59,740 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.460e+02 1.879e+02 2.089e+02 2.451e+02 3.406e+02, threshold=4.179e+02, percent-clipped=0.0 2023-09-29 14:58:00,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 14:58:01,918 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_na.min_abs, batch_count=400886.6666666667, ans=0.02 2023-09-29 14:58:03,038 INFO [train.py:1039] (2/4) Epoch 12, batch 1700, loss[loss=0.1929, simple_loss=0.2494, pruned_loss=0.06817, over 23576.00 frames. ], tot_loss[loss=0.1991, simple_loss=0.269, pruned_loss=0.06457, over 4707928.52 frames. ], batch size: 256, lr: 8.72e-03, grad_scale: 16.0 2023-09-29 14:58:04,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:58:04,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:58:06,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 14:58:06,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:58:06,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:58:06,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:58:09,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:58:10,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:58:10,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 14:58:13,149 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 14:58:18,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:58:21,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:58:22,478 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=11.98 vs. limit=15.0 2023-09-29 14:58:24,150 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=400953.3333333333, ans=0.2 2023-09-29 14:58:24,269 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=400953.3333333333, ans=0.2 2023-09-29 14:58:27,875 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=400953.3333333333, ans=0.1 2023-09-29 14:58:29,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 14:58:29,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:58:29,221 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:58:29,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:58:33,686 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 14:58:35,441 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:58:35,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:58:37,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 14:58:38,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 14:58:41,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 14:58:41,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 14:58:43,846 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:58:45,972 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.00 vs. limit=22.5 2023-09-29 14:58:46,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 14:58:46,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:58:51,224 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.21 vs. limit=12.0 2023-09-29 14:58:52,486 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=401086.6666666667, ans=0.125 2023-09-29 14:58:55,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:58:57,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:58:58,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:59:00,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 14:59:00,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 14:59:00,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:59:02,652 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:59:02,653 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 14:59:02,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:59:02,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:59:02,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:59:02,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:59:07,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:59:07,243 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:59:08,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:59:08,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:59:08,836 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:59:12,087 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:59:13,596 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 14:59:17,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:59:18,775 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:59:19,031 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=401153.3333333333, ans=0.125 2023-09-29 14:59:19,056 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=401153.3333333333, ans=0.0 2023-09-29 14:59:20,774 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=16.03 vs. limit=15.0 2023-09-29 14:59:21,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 14:59:27,051 INFO [train.py:1039] (2/4) Epoch 12, batch 1750, loss[loss=0.2079, simple_loss=0.2958, pruned_loss=0.05999, over 24531.00 frames. ], tot_loss[loss=0.1978, simple_loss=0.2674, pruned_loss=0.06409, over 4689170.26 frames. ], batch size: 71, lr: 8.71e-03, grad_scale: 16.0 2023-09-29 14:59:28,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:59:31,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:59:31,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 14:59:31,893 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=401220.0, ans=0.2 2023-09-29 14:59:33,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 14:59:33,191 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:59:36,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:59:36,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:59:37,670 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.63 vs. limit=6.0 2023-09-29 14:59:41,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 14:59:43,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:59:46,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 14:59:46,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:59:47,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:59:48,178 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=401286.6666666667, ans=0.125 2023-09-29 14:59:50,161 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=401286.6666666667, ans=0.125 2023-09-29 14:59:51,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 14:59:53,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 14:59:54,736 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:59:56,097 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 15:00:03,779 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.53 vs. limit=15.0 2023-09-29 15:00:04,338 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 15:00:06,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:00:06,110 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:00:13,182 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:00:13,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:00:13,842 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.49 vs. limit=10.0 2023-09-29 15:00:14,951 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=401420.0, ans=0.125 2023-09-29 15:00:16,090 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:00:17,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:00:19,402 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:00:19,606 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=401420.0, ans=0.125 2023-09-29 15:00:20,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:00:21,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 15:00:24,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:00:26,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 15:00:28,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:00:29,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:00:29,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:00:33,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 15:00:34,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 15:00:34,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:00:37,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:00:42,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:00:43,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:00:45,915 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.394e+02 1.915e+02 2.238e+02 2.656e+02 3.754e+02, threshold=4.475e+02, percent-clipped=0.0 2023-09-29 15:00:46,113 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:00:46,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 15:00:46,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:00:49,089 INFO [train.py:1039] (2/4) Epoch 12, batch 1800, loss[loss=0.1784, simple_loss=0.2488, pruned_loss=0.05406, over 24482.00 frames. ], tot_loss[loss=0.1977, simple_loss=0.2672, pruned_loss=0.06412, over 4691358.78 frames. ], batch size: 58, lr: 8.71e-03, grad_scale: 16.0 2023-09-29 15:00:49,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 15:00:49,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:00:49,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:00:49,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:00:50,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 15:00:53,163 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=401553.3333333333, ans=0.125 2023-09-29 15:00:54,297 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:00:55,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:00:58,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 15:00:59,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:01:00,508 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.21 vs. limit=6.0 2023-09-29 15:01:02,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 15:01:03,867 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=401553.3333333333, ans=0.125 2023-09-29 15:01:04,762 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:01:07,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:01:10,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:01:11,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:01:12,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:01:14,218 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:01:14,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 15:01:15,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:01:19,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:01:24,137 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 15:01:25,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 15:01:25,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 15:01:27,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:01:29,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:01:29,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:01:29,397 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:01:29,540 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=401686.6666666667, ans=0.125 2023-09-29 15:01:39,677 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 15:01:39,833 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 15:01:41,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:01:42,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 15:01:43,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 15:01:44,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:01:46,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:01:46,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 15:01:50,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 15:01:59,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:01:59,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 15:01:59,343 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:01:59,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:01:59,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:02:01,004 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 15:02:04,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:02:04,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:02:08,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 15:02:08,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:02:11,079 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:02:11,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 15:02:11,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:02:13,211 INFO [train.py:1039] (2/4) Epoch 12, batch 1850, loss[loss=0.1729, simple_loss=0.25, pruned_loss=0.04787, over 17153.00 frames. ], tot_loss[loss=0.1983, simple_loss=0.2678, pruned_loss=0.06437, over 4692176.78 frames. ], batch size: 37, lr: 8.71e-03, grad_scale: 16.0 2023-09-29 15:02:13,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:02:14,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 15:02:16,584 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=401886.6666666667, ans=0.0 2023-09-29 15:02:17,861 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:02:17,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:02:19,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:02:20,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:02:27,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:02:27,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 15:02:32,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 15:02:35,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 15:02:40,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:02:40,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 15:02:40,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 15:02:51,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:02:51,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 15:02:54,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:02:54,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:02:55,613 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1.whitening_limit, batch_count=402020.0, ans=10.0 2023-09-29 15:02:57,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 15:02:57,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:02:57,898 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 15:03:00,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:03:03,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:03:07,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:03:07,764 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=402086.6666666667, ans=0.125 2023-09-29 15:03:11,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:03:11,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:03:11,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 15:03:11,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:03:15,026 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:03:16,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:03:21,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 15:03:23,308 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:03:23,799 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=402153.3333333333, ans=0.125 2023-09-29 15:03:25,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 15:03:26,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 15:03:26,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 15:03:26,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 15:03:28,126 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 15:03:28,256 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 15:03:29,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 15:03:29,985 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:03:29,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:03:30,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:03:31,391 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 15:03:31,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:03:31,464 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:03:32,704 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.599e+02 1.981e+02 2.242e+02 2.724e+02 5.145e+02, threshold=4.485e+02, percent-clipped=1.0 2023-09-29 15:03:32,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 15:03:34,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 15:03:34,725 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=402220.0, ans=0.0 2023-09-29 15:03:35,181 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.53 vs. limit=10.0 2023-09-29 15:03:35,681 INFO [train.py:1039] (2/4) Epoch 12, batch 1900, loss[loss=0.2085, simple_loss=0.273, pruned_loss=0.07199, over 23484.00 frames. ], tot_loss[loss=0.1994, simple_loss=0.2693, pruned_loss=0.06471, over 4696469.26 frames. ], batch size: 285, lr: 8.70e-03, grad_scale: 16.0 2023-09-29 15:03:37,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:03:37,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 15:03:40,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:03:40,402 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 15:03:42,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:03:43,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:03:48,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:03:50,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:03:50,990 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 15:03:53,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 15:03:55,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:03:55,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:03:55,421 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 15:03:56,697 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 15:03:58,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 15:04:01,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:04:03,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 15:04:05,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 15:04:15,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 15:04:17,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 15:04:17,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:04:18,203 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten.whitening_limit, batch_count=402353.3333333333, ans=22.5 2023-09-29 15:04:19,103 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 15:04:19,110 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 15:04:19,167 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 15:04:19,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 15:04:19,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:04:24,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 15:04:29,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:04:33,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:04:33,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 15:04:37,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:04:38,066 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.84 vs. limit=15.0 2023-09-29 15:04:40,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 15:04:40,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 15:04:46,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:04:46,601 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:04:46,643 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:04:48,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:04:49,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 15:04:49,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 15:04:49,953 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=402486.6666666667, ans=0.2 2023-09-29 15:04:51,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:04:54,787 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:04:54,789 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:04:56,460 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:04:56,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:04:56,529 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 15:04:58,338 INFO [train.py:1039] (2/4) Epoch 12, batch 1950, loss[loss=0.2085, simple_loss=0.2697, pruned_loss=0.0736, over 23775.00 frames. ], tot_loss[loss=0.2003, simple_loss=0.2701, pruned_loss=0.06525, over 4698214.32 frames. ], batch size: 150, lr: 8.70e-03, grad_scale: 16.0 2023-09-29 15:04:58,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:05:04,243 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:05:05,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:05:05,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:05:05,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 15:05:10,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 15:05:10,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 15:05:10,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:05:13,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:05:14,395 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=7.65 vs. limit=15.0 2023-09-29 15:05:16,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:05:16,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:05:17,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:05:19,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:05:21,266 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:05:21,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 15:05:22,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:05:22,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:05:28,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:05:28,584 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=402620.0, ans=0.0 2023-09-29 15:05:31,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:05:31,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:05:31,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 15:05:31,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 15:05:33,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 15:05:33,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:05:33,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:05:38,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:05:41,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:05:45,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 15:05:48,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:05:49,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 15:05:49,573 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 15:05:49,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:05:54,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:05:54,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:05:55,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 15:06:02,537 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:06:02,655 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:06:02,945 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=402820.0, ans=0.035 2023-09-29 15:06:06,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:06:08,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:06:10,291 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=402820.0, ans=0.2 2023-09-29 15:06:11,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:06:12,006 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:06:13,411 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 15:06:13,419 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 15:06:13,749 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=402820.0, ans=0.0 2023-09-29 15:06:14,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:06:16,393 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 15:06:17,820 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 2.002e+02 2.294e+02 2.547e+02 3.463e+02, threshold=4.587e+02, percent-clipped=0.0 2023-09-29 15:06:19,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:06:20,843 INFO [train.py:1039] (2/4) Epoch 12, batch 2000, loss[loss=0.1722, simple_loss=0.2447, pruned_loss=0.04987, over 24305.00 frames. ], tot_loss[loss=0.2009, simple_loss=0.2711, pruned_loss=0.06536, over 4697592.89 frames. ], batch size: 56, lr: 8.70e-03, grad_scale: 32.0 2023-09-29 15:06:23,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 15:06:24,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:06:25,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:06:25,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:06:27,412 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:06:31,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 15:06:32,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 15:06:34,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:06:35,972 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 15:06:38,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 15:06:38,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:06:41,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:06:42,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 15:06:44,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:06:46,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:06:48,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:06:48,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 15:06:49,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 15:06:51,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 15:06:51,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:06:54,429 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:06:55,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 15:06:55,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:06:57,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:07:00,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:07:01,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 15:07:03,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 15:07:03,186 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:07:03,198 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:07:04,942 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=403020.0, ans=0.1 2023-09-29 15:07:06,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:07:10,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:07:10,165 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:07:11,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:07:11,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=403086.6666666667, ans=0.125 2023-09-29 15:07:13,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:07:13,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:07:13,914 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:07:13,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:07:14,214 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=403086.6666666667, ans=0.1 2023-09-29 15:07:15,458 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:07:19,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:07:19,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 15:07:19,958 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=403086.6666666667, ans=0.125 2023-09-29 15:07:24,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 15:07:24,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:07:27,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:07:27,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:07:32,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:07:33,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:07:33,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:07:33,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 15:07:33,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 15:07:35,955 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.96 vs. limit=6.0 2023-09-29 15:07:37,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:07:38,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:07:42,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:07:44,185 INFO [train.py:1039] (2/4) Epoch 12, batch 2050, loss[loss=0.1717, simple_loss=0.2508, pruned_loss=0.04635, over 22761.00 frames. ], tot_loss[loss=0.1995, simple_loss=0.2694, pruned_loss=0.06482, over 4688804.58 frames. ], batch size: 50, lr: 8.69e-03, grad_scale: 16.0 2023-09-29 15:07:45,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:07:52,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:07:55,978 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:07:56,072 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:07:57,548 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:08:00,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 15:08:00,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:08:01,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:08:02,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 15:08:08,779 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=403286.6666666667, ans=0.125 2023-09-29 15:08:10,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 15:08:10,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:08:13,101 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 15:08:14,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:08:18,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 15:08:18,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 15:08:20,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:08:22,428 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.61 vs. limit=15.0 2023-09-29 15:08:23,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:08:24,189 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 15:08:25,617 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:08:27,064 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:08:29,042 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:08:29,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:08:33,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:08:34,063 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=403420.0, ans=0.125 2023-09-29 15:08:35,148 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 15:08:36,804 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:08:37,053 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=403420.0, ans=0.2 2023-09-29 15:08:39,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:08:42,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:08:47,658 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:08:49,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 15:08:55,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:08:56,488 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:08:59,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:09:01,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 15:09:04,761 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.918e+02 2.083e+02 2.421e+02 3.715e+02, threshold=4.167e+02, percent-clipped=0.0 2023-09-29 15:09:06,253 INFO [train.py:1039] (2/4) Epoch 12, batch 2100, loss[loss=0.2114, simple_loss=0.2836, pruned_loss=0.06963, over 23437.00 frames. ], tot_loss[loss=0.198, simple_loss=0.2681, pruned_loss=0.06401, over 4689385.52 frames. ], batch size: 93, lr: 8.69e-03, grad_scale: 16.0 2023-09-29 15:09:06,482 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 15:09:06,482 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:09:07,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:09:07,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 15:09:09,431 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:09:09,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 15:09:09,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 15:09:12,420 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:09:15,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:09:15,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:09:18,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:09:20,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:09:20,156 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 15:09:20,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:09:21,848 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 15:09:21,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 15:09:23,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:09:23,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:09:23,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 15:09:23,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 15:09:29,556 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 15:09:29,558 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 15:09:34,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:09:36,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:09:39,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:09:39,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 15:09:39,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:09:39,524 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 15:09:41,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 15:09:43,169 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:09:43,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 15:09:43,226 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 15:09:43,296 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 15:09:46,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 15:09:47,825 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:09:48,105 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=403686.6666666667, ans=0.125 2023-09-29 15:09:50,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 15:09:51,377 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=403686.6666666667, ans=0.2 2023-09-29 15:09:52,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 15:09:54,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:09:55,796 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:09:55,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 15:09:55,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:09:55,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:09:57,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:09:59,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 15:10:00,830 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 15:10:02,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 15:10:06,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:10:09,211 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:10:09,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 15:10:15,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:10:19,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:10:19,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:10:19,522 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:10:19,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 15:10:19,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 15:10:21,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:10:21,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 15:10:21,896 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.61 vs. limit=6.0 2023-09-29 15:10:22,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:10:22,722 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:10:25,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 15:10:27,122 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 15:10:27,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:10:28,568 INFO [train.py:1039] (2/4) Epoch 12, batch 2150, loss[loss=0.2003, simple_loss=0.2774, pruned_loss=0.06163, over 23997.00 frames. ], tot_loss[loss=0.1965, simple_loss=0.2669, pruned_loss=0.0631, over 4696033.74 frames. ], batch size: 86, lr: 8.69e-03, grad_scale: 16.0 2023-09-29 15:10:28,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:10:28,839 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:10:30,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:10:30,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:10:37,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 15:10:37,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:10:39,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:10:40,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:10:40,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:10:40,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:10:46,062 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:10:47,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:10:47,374 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:10:51,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:10:51,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 15:10:55,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:10:57,594 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:10:59,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:10:59,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:10:59,319 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=403953.3333333333, ans=0.125 2023-09-29 15:11:00,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:11:00,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 15:11:01,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:11:01,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:11:02,030 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:11:02,292 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=404020.0, ans=0.125 2023-09-29 15:11:03,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 15:11:05,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 15:11:07,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:11:07,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:11:08,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 15:11:12,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:11:14,038 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:11:14,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:11:15,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:11:15,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 15:11:15,733 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 15:11:18,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:11:19,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:11:21,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:11:22,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 15:11:22,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:11:22,885 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=404086.6666666667, ans=0.125 2023-09-29 15:11:24,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:11:24,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 15:11:26,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 15:11:27,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:11:27,802 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 15:11:27,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:11:29,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:11:29,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 15:11:29,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:11:29,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 15:11:30,751 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 15:11:30,751 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 15:11:30,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 15:11:33,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:11:33,768 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:11:33,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:11:33,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:11:35,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 15:11:38,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:11:38,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:11:38,844 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 15:11:38,881 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=404153.3333333333, ans=0.0 2023-09-29 15:11:48,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:11:49,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 15:11:50,277 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.936e+02 2.140e+02 2.613e+02 4.157e+02, threshold=4.280e+02, percent-clipped=0.0 2023-09-29 15:11:51,873 INFO [train.py:1039] (2/4) Epoch 12, batch 2200, loss[loss=0.1834, simple_loss=0.2625, pruned_loss=0.05216, over 24658.00 frames. ], tot_loss[loss=0.1966, simple_loss=0.2671, pruned_loss=0.06303, over 4706560.36 frames. ], batch size: 68, lr: 8.68e-03, grad_scale: 16.0 2023-09-29 15:11:52,108 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:11:59,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:11:59,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:12:01,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:12:02,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 15:12:04,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:12:04,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:12:04,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 15:12:09,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 15:12:11,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 15:12:18,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 15:12:22,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:12:24,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:12:24,127 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:12:27,261 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:12:28,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 15:12:29,180 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=404353.3333333333, ans=0.05 2023-09-29 15:12:32,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 15:12:33,908 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:12:34,016 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 15:12:37,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:12:37,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:12:42,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:12:43,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:12:45,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 15:12:46,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:12:47,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 15:12:48,899 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=404420.0, ans=0.0 2023-09-29 15:12:50,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:12:50,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 15:12:51,004 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=404420.0, ans=0.025 2023-09-29 15:12:52,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:12:54,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:12:55,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:12:55,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:12:55,840 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:12:56,891 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.74 vs. limit=15.0 2023-09-29 15:12:57,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 15:12:57,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:13:00,632 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 15:13:03,568 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 15:13:04,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:13:07,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 15:13:08,836 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 15:13:10,542 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=404486.6666666667, ans=0.125 2023-09-29 15:13:12,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 15:13:12,420 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 15:13:13,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 15:13:14,009 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 15:13:14,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:13:15,596 INFO [train.py:1039] (2/4) Epoch 12, batch 2250, loss[loss=0.2196, simple_loss=0.283, pruned_loss=0.07806, over 23650.00 frames. ], tot_loss[loss=0.1981, simple_loss=0.2687, pruned_loss=0.06375, over 4699884.39 frames. ], batch size: 232, lr: 8.68e-03, grad_scale: 16.0 2023-09-29 15:13:15,706 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 15:13:17,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:13:18,901 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 15:13:19,712 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.61 vs. limit=15.0 2023-09-29 15:13:20,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:13:23,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:13:23,938 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=404553.3333333333, ans=0.1 2023-09-29 15:13:24,359 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.28 vs. limit=22.5 2023-09-29 15:13:29,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:13:29,777 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:13:32,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:13:34,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 15:13:35,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:13:39,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 15:13:39,375 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:13:39,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:13:40,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 15:13:40,886 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:13:42,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:13:43,834 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 15:13:50,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:13:51,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 15:13:52,042 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 15:13:52,359 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=404686.6666666667, ans=0.0 2023-09-29 15:13:54,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 15:13:56,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:13:56,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:13:58,721 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=404686.6666666667, ans=0.125 2023-09-29 15:14:03,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:14:05,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:14:06,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:14:06,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:14:09,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:14:11,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:14:16,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:14:16,547 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=404753.3333333333, ans=0.125 2023-09-29 15:14:18,072 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 15:14:23,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 15:14:23,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 15:14:23,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:14:29,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 15:14:31,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 15:14:31,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 15:14:31,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:14:31,401 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=404820.0, ans=0.125 2023-09-29 15:14:32,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:14:32,971 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=404820.0, ans=0.1 2023-09-29 15:14:33,062 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=404820.0, ans=0.2 2023-09-29 15:14:34,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 15:14:36,005 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 2.000e+02 2.393e+02 2.961e+02 4.518e+02, threshold=4.786e+02, percent-clipped=2.0 2023-09-29 15:14:38,328 INFO [train.py:1039] (2/4) Epoch 12, batch 2300, loss[loss=0.1883, simple_loss=0.2654, pruned_loss=0.05558, over 23968.00 frames. ], tot_loss[loss=0.1995, simple_loss=0.2699, pruned_loss=0.06452, over 4701709.83 frames. ], batch size: 86, lr: 8.67e-03, grad_scale: 16.0 2023-09-29 15:14:38,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:14:38,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:14:43,371 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=404886.6666666667, ans=0.1 2023-09-29 15:14:45,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:14:46,024 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:14:49,634 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 15:14:51,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:14:55,881 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=404953.3333333333, ans=0.1 2023-09-29 15:15:00,753 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:15:00,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 15:15:00,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:15:02,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:15:02,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 15:15:02,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:15:04,352 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=404953.3333333333, ans=0.0 2023-09-29 15:15:05,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:15:05,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:15:08,619 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:15:13,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:15:15,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:15:20,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:15:20,455 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:15:21,579 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.11 vs. limit=22.5 2023-09-29 15:15:23,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:15:26,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:15:30,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:15:32,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 15:15:32,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:15:32,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 15:15:36,767 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 15:15:36,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:15:36,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:15:36,885 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:15:37,086 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=405086.6666666667, ans=0.0 2023-09-29 15:15:38,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:15:38,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 15:15:38,302 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 15:15:39,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 15:15:39,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:15:39,818 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:15:41,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 15:15:50,149 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:15:51,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:15:57,184 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:15:58,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:15:58,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 15:15:58,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 15:16:00,272 INFO [train.py:1039] (2/4) Epoch 12, batch 2350, loss[loss=0.1969, simple_loss=0.2559, pruned_loss=0.06892, over 23741.00 frames. ], tot_loss[loss=0.1999, simple_loss=0.2706, pruned_loss=0.06461, over 4711767.61 frames. ], batch size: 164, lr: 8.67e-03, grad_scale: 8.0 2023-09-29 15:16:00,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:16:00,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 15:16:01,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 15:16:07,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:16:07,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 15:16:13,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 15:16:16,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:16:21,933 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:16:21,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:16:21,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:16:23,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:16:24,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 15:16:28,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:16:35,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 15:16:35,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:16:38,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 15:16:38,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:16:39,889 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:16:42,783 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 15:16:42,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:16:44,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:16:44,910 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:16:44,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:16:46,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:16:49,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 15:16:50,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:16:55,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:16:55,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:16:55,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 15:16:55,663 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=405420.0, ans=0.125 2023-09-29 15:16:56,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:16:59,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 15:17:00,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:17:00,233 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=405420.0, ans=0.025 2023-09-29 15:17:06,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 15:17:11,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 15:17:12,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:17:12,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 15:17:14,154 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 15:17:14,197 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 15:17:17,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 15:17:19,211 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=405486.6666666667, ans=0.2 2023-09-29 15:17:20,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:17:22,272 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.501e+02 1.923e+02 2.128e+02 2.368e+02 3.787e+02, threshold=4.256e+02, percent-clipped=0.0 2023-09-29 15:17:22,315 INFO [train.py:1039] (2/4) Epoch 12, batch 2400, loss[loss=0.2085, simple_loss=0.2711, pruned_loss=0.07295, over 23746.00 frames. ], tot_loss[loss=0.1994, simple_loss=0.2698, pruned_loss=0.06456, over 4717573.29 frames. ], batch size: 212, lr: 8.67e-03, grad_scale: 16.0 2023-09-29 15:17:25,413 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:17:27,229 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:17:30,829 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:17:30,921 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 15:17:30,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 15:17:40,651 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 15:17:40,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:17:40,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 15:17:42,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:17:43,893 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:17:43,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 15:17:47,570 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=405620.0, ans=0.0 2023-09-29 15:17:48,677 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:17:50,438 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 15:17:52,250 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=405620.0, ans=0.0 2023-09-29 15:17:56,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 15:18:02,170 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 15:18:05,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:18:07,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:18:12,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:18:12,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 15:18:12,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 15:18:20,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:18:23,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:18:26,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:18:26,743 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:18:26,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 15:18:26,965 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=405820.0, ans=0.125 2023-09-29 15:18:28,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:18:28,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:18:28,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:18:28,310 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 15:18:28,409 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=405820.0, ans=0.1 2023-09-29 15:18:33,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:18:35,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 15:18:35,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 15:18:37,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 15:18:39,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:18:39,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:18:39,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 15:18:41,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 15:18:41,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 15:18:41,417 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 15:18:43,452 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 15:18:44,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:18:46,496 INFO [train.py:1039] (2/4) Epoch 12, batch 2450, loss[loss=0.219, simple_loss=0.2874, pruned_loss=0.07531, over 23312.00 frames. ], tot_loss[loss=0.1988, simple_loss=0.2685, pruned_loss=0.0645, over 4699631.86 frames. ], batch size: 105, lr: 8.66e-03, grad_scale: 16.0 2023-09-29 15:18:46,557 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:18:46,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:18:46,712 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 15:18:48,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:18:49,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 15:18:54,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:18:54,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:18:57,551 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:18:57,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:18:59,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 15:19:01,013 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=405953.3333333333, ans=0.125 2023-09-29 15:19:05,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:19:05,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:19:08,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:19:08,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:19:08,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:19:08,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 15:19:13,100 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.03 vs. limit=22.5 2023-09-29 15:19:14,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:19:17,799 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 15:19:19,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:19:21,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 15:19:21,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:19:24,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:19:24,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:19:26,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 15:19:27,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:19:32,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:19:34,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:19:35,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:19:35,674 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:19:35,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:19:37,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:19:38,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 15:19:40,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:19:40,883 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:19:45,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:19:45,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:19:51,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:19:51,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 15:19:53,225 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:19:53,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:19:54,152 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.17 vs. limit=15.0 2023-09-29 15:19:54,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 15:19:54,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:19:56,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:20:00,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:20:03,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:20:03,890 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:20:07,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 15:20:08,463 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 1.958e+02 2.224e+02 2.647e+02 3.379e+02, threshold=4.447e+02, percent-clipped=0.0 2023-09-29 15:20:08,506 INFO [train.py:1039] (2/4) Epoch 12, batch 2500, loss[loss=0.2126, simple_loss=0.2892, pruned_loss=0.06794, over 23683.00 frames. ], tot_loss[loss=0.1972, simple_loss=0.2672, pruned_loss=0.06361, over 4700391.79 frames. ], batch size: 85, lr: 8.66e-03, grad_scale: 16.0 2023-09-29 15:20:08,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:20:15,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:20:25,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:20:25,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:20:26,327 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.89 vs. limit=15.0 2023-09-29 15:20:26,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:20:26,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 15:20:33,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 15:20:34,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:20:36,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 15:20:36,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 15:20:36,334 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 15:20:39,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:20:39,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:20:39,418 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 15:20:39,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:20:39,640 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=406353.3333333333, ans=0.125 2023-09-29 15:20:40,994 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 15:20:41,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:20:46,073 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=406353.3333333333, ans=0.5 2023-09-29 15:20:47,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:20:47,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:20:51,531 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 15:20:51,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 15:20:51,989 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=406353.3333333333, ans=0.0 2023-09-29 15:20:53,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:20:55,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:20:58,995 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:21:03,666 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:21:07,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:21:13,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 15:21:15,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 15:21:15,386 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=406486.6666666667, ans=0.0 2023-09-29 15:21:16,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:21:16,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 15:21:19,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:21:19,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 15:21:21,170 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 15:21:21,170 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 15:21:21,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 15:21:24,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:21:26,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 15:21:26,436 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 15:21:29,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:21:29,838 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 15:21:31,827 INFO [train.py:1039] (2/4) Epoch 12, batch 2550, loss[loss=0.1916, simple_loss=0.2607, pruned_loss=0.06126, over 23415.00 frames. ], tot_loss[loss=0.1977, simple_loss=0.2679, pruned_loss=0.06376, over 4705415.07 frames. ], batch size: 119, lr: 8.66e-03, grad_scale: 16.0 2023-09-29 15:21:33,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 15:21:37,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:21:38,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:21:38,906 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:21:41,976 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:21:43,445 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 15:21:43,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 15:21:43,886 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=406553.3333333333, ans=0.125 2023-09-29 15:21:46,699 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 15:21:48,314 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:21:50,064 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=406620.0, ans=0.125 2023-09-29 15:21:51,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:21:52,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:21:52,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 15:21:52,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 15:21:52,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:21:54,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:21:56,437 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=406620.0, ans=0.125 2023-09-29 15:21:56,976 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.93 vs. limit=15.0 2023-09-29 15:21:57,477 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:21:57,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 15:21:57,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 15:21:57,572 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:21:57,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 15:21:59,840 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=406620.0, ans=0.125 2023-09-29 15:22:14,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:22:18,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:22:18,850 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:22:18,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:22:19,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 15:22:26,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:22:30,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 15:22:30,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 15:22:30,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:22:31,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 15:22:31,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 15:22:37,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:22:37,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:22:41,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:22:41,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 15:22:41,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:22:43,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:22:44,577 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 15:22:46,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 15:22:46,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:22:52,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:22:53,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=406886.6666666667, ans=0.2 2023-09-29 15:22:54,379 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.975e+02 2.307e+02 2.705e+02 3.697e+02, threshold=4.615e+02, percent-clipped=0.0 2023-09-29 15:22:54,422 INFO [train.py:1039] (2/4) Epoch 12, batch 2600, loss[loss=0.2026, simple_loss=0.2661, pruned_loss=0.06956, over 23681.00 frames. ], tot_loss[loss=0.1982, simple_loss=0.2692, pruned_loss=0.06361, over 4710315.67 frames. ], batch size: 232, lr: 8.65e-03, grad_scale: 16.0 2023-09-29 15:22:54,642 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:22:57,870 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 15:23:02,202 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 15:23:02,252 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:23:02,299 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 15:23:03,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 15:23:03,809 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 15:23:06,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:23:06,879 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 15:23:07,131 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=406886.6666666667, ans=0.1 2023-09-29 15:23:09,137 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 15:23:11,283 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 15:23:12,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:23:12,974 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=406953.3333333333, ans=0.5 2023-09-29 15:23:14,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 15:23:17,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 15:23:19,231 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 15:23:19,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 15:23:22,918 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 15:23:22,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 15:23:30,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:23:30,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:23:30,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:23:30,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 15:23:33,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:23:39,952 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 15:23:45,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:23:45,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:23:47,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 15:23:49,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:23:49,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:23:50,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 15:23:51,103 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=407086.6666666667, ans=0.125 2023-09-29 15:23:53,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 15:23:54,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:23:58,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:24:02,626 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 15:24:02,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:24:02,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:24:07,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:24:08,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:24:08,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 15:24:10,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:24:11,890 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:24:11,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:24:16,570 INFO [train.py:1039] (2/4) Epoch 12, batch 2650, loss[loss=0.1966, simple_loss=0.2604, pruned_loss=0.06634, over 23637.00 frames. ], tot_loss[loss=0.1988, simple_loss=0.2699, pruned_loss=0.0639, over 4718734.78 frames. ], batch size: 149, lr: 8.65e-03, grad_scale: 16.0 2023-09-29 15:24:16,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 15:24:18,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:24:21,255 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 15:24:27,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 15:24:27,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:24:28,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 15:24:28,619 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 15:24:28,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:24:30,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:24:32,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 15:24:34,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:24:37,080 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:24:37,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 15:24:38,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 15:24:38,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:24:41,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 15:24:43,142 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 15:24:44,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:24:47,753 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 15:24:47,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:24:49,293 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 15:24:55,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:24:55,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 15:24:55,365 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:24:57,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:25:01,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 15:25:01,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 15:25:06,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 15:25:08,523 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 15:25:08,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:25:10,122 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:25:11,591 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 15:25:11,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:25:11,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:25:12,367 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.79 vs. limit=15.0 2023-09-29 15:25:14,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:25:16,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:25:16,301 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:25:16,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 15:25:17,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:25:19,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:25:20,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 15:25:21,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:25:23,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:25:24,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 15:25:24,351 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=407486.6666666667, ans=0.125 2023-09-29 15:25:24,484 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=407486.6666666667, ans=0.0 2023-09-29 15:25:25,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:25:27,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:25:27,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:25:27,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 15:25:32,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:25:34,844 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:25:34,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:25:35,212 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=407486.6666666667, ans=0.125 2023-09-29 15:25:37,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:25:38,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 15:25:39,912 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.585e+02 1.894e+02 2.126e+02 2.422e+02 3.822e+02, threshold=4.252e+02, percent-clipped=0.0 2023-09-29 15:25:39,958 INFO [train.py:1039] (2/4) Epoch 12, batch 2700, loss[loss=0.1975, simple_loss=0.282, pruned_loss=0.05647, over 24658.00 frames. ], tot_loss[loss=0.2006, simple_loss=0.2715, pruned_loss=0.06481, over 4713637.63 frames. ], batch size: 68, lr: 8.65e-03, grad_scale: 16.0 2023-09-29 15:25:40,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:25:40,336 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=407553.3333333333, ans=0.125 2023-09-29 15:25:41,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:25:41,892 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=407553.3333333333, ans=0.125 2023-09-29 15:25:42,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 15:25:46,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:25:46,415 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=407553.3333333333, ans=0.1 2023-09-29 15:25:47,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 15:25:49,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:25:49,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:25:49,286 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:25:50,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:25:50,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:25:50,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:25:50,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 15:25:52,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 15:25:52,437 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:25:53,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 15:25:55,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 15:25:55,722 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=407620.0, ans=0.0 2023-09-29 15:25:56,824 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:25:57,240 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 15:26:00,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:26:00,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 15:26:02,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:26:07,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:26:07,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:26:14,730 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:26:14,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:26:14,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:26:14,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:26:18,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:26:19,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:26:19,769 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 15:26:19,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:26:25,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:26:25,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:26:27,608 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=407753.3333333333, ans=0.2 2023-09-29 15:26:27,630 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=407753.3333333333, ans=0.2 2023-09-29 15:26:33,979 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=407753.3333333333, ans=0.125 2023-09-29 15:26:35,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:26:35,753 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:26:40,869 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:26:40,876 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:26:44,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:26:44,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:26:45,502 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.89 vs. limit=15.0 2023-09-29 15:26:46,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:26:48,283 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:26:49,782 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:26:51,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:26:52,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:26:54,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:26:54,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:26:56,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 15:26:57,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:26:59,164 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:26:59,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 15:27:00,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 15:27:00,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:27:02,306 INFO [train.py:1039] (2/4) Epoch 12, batch 2750, loss[loss=0.2059, simple_loss=0.2772, pruned_loss=0.06737, over 24148.00 frames. ], tot_loss[loss=0.2001, simple_loss=0.2709, pruned_loss=0.06467, over 4711623.22 frames. ], batch size: 86, lr: 8.64e-03, grad_scale: 16.0 2023-09-29 15:27:03,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:27:04,198 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=407886.6666666667, ans=0.125 2023-09-29 15:27:06,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:27:08,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:27:08,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 15:27:09,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:27:12,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:27:12,933 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=407886.6666666667, ans=0.1 2023-09-29 15:27:14,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 15:27:14,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:27:14,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:27:14,132 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 15:27:14,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:27:14,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:27:21,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 15:27:23,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:27:23,254 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:27:24,793 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:27:26,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 15:27:27,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:27:27,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:27:27,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:27:29,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:27:32,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 15:27:32,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 15:27:32,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 15:27:34,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:27:34,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 15:27:36,535 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=408020.0, ans=0.125 2023-09-29 15:27:43,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:27:46,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 15:27:46,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:27:49,594 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_abs, batch_count=408020.0, ans=0.5 2023-09-29 15:27:50,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:27:50,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 15:27:51,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:27:59,181 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:28:00,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:28:00,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 15:28:05,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:28:08,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 15:28:11,765 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=408153.3333333333, ans=0.125 2023-09-29 15:28:13,074 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 15:28:14,699 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:28:14,980 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=408153.3333333333, ans=0.0 2023-09-29 15:28:16,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 15:28:18,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:28:18,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:28:20,144 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 15:28:20,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:28:24,684 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.991e+02 2.173e+02 2.887e+02 4.719e+02, threshold=4.346e+02, percent-clipped=2.0 2023-09-29 15:28:24,726 INFO [train.py:1039] (2/4) Epoch 12, batch 2800, loss[loss=0.2035, simple_loss=0.2647, pruned_loss=0.07111, over 23693.00 frames. ], tot_loss[loss=0.1993, simple_loss=0.269, pruned_loss=0.0648, over 4691413.26 frames. ], batch size: 164, lr: 8.64e-03, grad_scale: 32.0 2023-09-29 15:28:24,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 15:28:24,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:28:24,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:28:26,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 15:28:26,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:28:27,741 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:28:29,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:28:30,022 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 15:28:30,023 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 15:28:33,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:28:33,641 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=408220.0, ans=0.0 2023-09-29 15:28:36,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 15:28:36,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:28:39,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:28:41,050 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 15:28:44,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 15:28:44,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 15:28:45,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:28:45,776 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:28:45,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:28:49,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:28:49,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:28:49,580 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 15:28:52,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:29:00,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:29:02,079 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:29:04,161 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=408353.3333333333, ans=0.0 2023-09-29 15:29:05,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:29:05,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:29:06,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:29:07,189 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=408353.3333333333, ans=0.0 2023-09-29 15:29:10,522 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=408353.3333333333, ans=0.125 2023-09-29 15:29:13,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:29:13,188 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 15:29:14,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:29:14,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:29:14,812 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:29:15,087 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=408420.0, ans=0.0 2023-09-29 15:29:19,535 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:29:20,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:29:23,499 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=408420.0, ans=0.125 2023-09-29 15:29:24,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:29:26,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:29:27,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:29:27,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 15:29:28,353 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 15:29:28,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 15:29:30,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:29:30,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 15:29:30,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:29:31,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:29:31,614 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:29:33,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 15:29:35,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:29:35,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:29:36,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:29:38,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 15:29:40,897 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=408486.6666666667, ans=0.125 2023-09-29 15:29:45,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:29:45,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 15:29:45,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:29:45,742 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=408486.6666666667, ans=0.125 2023-09-29 15:29:48,304 INFO [train.py:1039] (2/4) Epoch 12, batch 2850, loss[loss=0.1842, simple_loss=0.2588, pruned_loss=0.0548, over 17602.00 frames. ], tot_loss[loss=0.1974, simple_loss=0.2674, pruned_loss=0.06369, over 4686891.78 frames. ], batch size: 38, lr: 8.64e-03, grad_scale: 32.0 2023-09-29 15:29:48,401 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:29:51,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:29:51,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:29:51,883 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=408553.3333333333, ans=0.0 2023-09-29 15:29:53,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:29:56,773 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:29:56,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:29:59,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 15:30:00,442 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 15:30:05,471 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=408620.0, ans=0.125 2023-09-29 15:30:06,559 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 15:30:06,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:30:08,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 15:30:10,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:30:12,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 15:30:14,059 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 15:30:15,574 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:30:29,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:30:29,769 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=408686.6666666667, ans=0.125 2023-09-29 15:30:31,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:30:31,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:30:32,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 15:30:32,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 15:30:33,005 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:30:36,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 15:30:36,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 15:30:38,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 15:30:38,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:30:38,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:30:38,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:30:40,250 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=408753.3333333333, ans=0.0 2023-09-29 15:30:41,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:30:41,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:30:43,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:30:45,187 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:30:48,125 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:30:48,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:30:50,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:30:53,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:30:53,591 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=408820.0, ans=0.0 2023-09-29 15:30:57,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:30:59,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 15:31:00,600 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 15:31:02,270 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 15:31:02,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:31:02,612 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=408820.0, ans=0.0 2023-09-29 15:31:03,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 15:31:03,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:31:05,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:31:05,202 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:31:05,247 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:31:05,247 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 15:31:07,272 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 15:31:07,278 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:31:07,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:31:10,763 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.902e+02 2.094e+02 2.569e+02 4.916e+02, threshold=4.188e+02, percent-clipped=1.0 2023-09-29 15:31:10,807 INFO [train.py:1039] (2/4) Epoch 12, batch 2900, loss[loss=0.227, simple_loss=0.2664, pruned_loss=0.09378, over 19102.00 frames. ], tot_loss[loss=0.1969, simple_loss=0.2674, pruned_loss=0.06318, over 4688213.93 frames. ], batch size: 388, lr: 8.63e-03, grad_scale: 32.0 2023-09-29 15:31:12,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 15:31:12,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:31:12,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:31:14,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 15:31:17,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:31:19,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 15:31:21,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 15:31:22,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:31:22,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 15:31:26,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:31:26,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:31:31,138 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:31:31,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:31:34,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 15:31:35,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 15:31:35,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 15:31:37,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:31:40,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 15:31:40,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 15:31:45,028 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:31:45,032 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 15:31:45,057 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:31:47,818 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:31:48,339 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.01 vs. limit=6.0 2023-09-29 15:31:49,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 15:31:52,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:31:53,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:31:57,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:32:02,035 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:32:02,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 15:32:04,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 15:32:04,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:32:08,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 15:32:10,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 15:32:11,686 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:32:16,269 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:32:24,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:32:24,099 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 15:32:25,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 15:32:27,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:32:27,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 15:32:28,746 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:32:28,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 15:32:34,388 INFO [train.py:1039] (2/4) Epoch 12, batch 2950, loss[loss=0.1968, simple_loss=0.2601, pruned_loss=0.06675, over 23597.00 frames. ], tot_loss[loss=0.197, simple_loss=0.268, pruned_loss=0.06303, over 4700988.84 frames. ], batch size: 256, lr: 8.63e-03, grad_scale: 32.0 2023-09-29 15:32:34,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:32:37,481 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 15:32:37,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:32:37,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:32:39,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:32:40,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:32:42,133 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 15:32:42,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 15:32:43,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 15:32:43,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:32:47,010 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=409220.0, ans=0.0 2023-09-29 15:32:50,904 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:32:52,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:32:55,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:32:55,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:32:58,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:32:58,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:32:58,679 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=409286.6666666667, ans=0.125 2023-09-29 15:33:02,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:33:03,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:33:03,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:33:05,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 15:33:09,408 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 15:33:09,438 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 15:33:10,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:33:12,467 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 15:33:15,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 15:33:15,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:33:16,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:33:16,841 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 15:33:16,861 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 15:33:19,098 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.44 vs. limit=12.0 2023-09-29 15:33:19,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 15:33:20,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:33:20,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:33:23,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:33:25,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:33:26,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:33:27,375 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 15:33:27,458 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:33:27,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 15:33:27,965 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.75 vs. limit=15.0 2023-09-29 15:33:34,960 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:33:37,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:33:38,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 15:33:38,591 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:33:38,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 15:33:41,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:33:43,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:33:45,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:33:46,568 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:33:46,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 15:33:46,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:33:48,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:33:48,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 15:33:50,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 15:33:50,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:33:51,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:33:53,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:33:53,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 15:33:54,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:33:56,516 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.935e+02 2.190e+02 2.674e+02 3.950e+02, threshold=4.379e+02, percent-clipped=0.0 2023-09-29 15:33:56,559 INFO [train.py:1039] (2/4) Epoch 12, batch 3000, loss[loss=0.1746, simple_loss=0.2431, pruned_loss=0.05304, over 24483.00 frames. ], tot_loss[loss=0.1973, simple_loss=0.2683, pruned_loss=0.06313, over 4708294.43 frames. ], batch size: 58, lr: 8.63e-03, grad_scale: 32.0 2023-09-29 15:33:56,559 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-29 15:34:11,471 INFO [train.py:1071] (2/4) Epoch 12, validation: loss=0.2606, simple_loss=0.2686, pruned_loss=0.1263, over 1125622.00 frames. 2023-09-29 15:34:11,472 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21046MB 2023-09-29 15:34:14,559 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:34:14,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 15:34:15,013 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=409553.3333333333, ans=0.2 2023-09-29 15:34:19,192 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 15:34:19,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 15:34:20,878 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:34:20,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:34:22,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 15:34:22,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:34:29,829 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 15:34:39,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:34:47,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 15:34:47,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:34:50,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 15:34:52,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:34:52,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:34:53,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:34:53,829 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 15:34:56,822 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 15:34:58,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:34:58,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 15:35:02,080 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:35:02,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:35:04,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:35:04,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:35:07,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 15:35:08,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:35:08,755 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:35:10,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:35:13,942 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 15:35:15,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:35:15,534 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=409820.0, ans=0.015 2023-09-29 15:35:16,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:35:16,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:35:20,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:35:22,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:35:23,673 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 15:35:23,755 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 15:35:23,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:35:25,184 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 15:35:26,660 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 15:35:28,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 15:35:31,262 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 15:35:32,695 INFO [train.py:1039] (2/4) Epoch 12, batch 3050, loss[loss=0.1852, simple_loss=0.2713, pruned_loss=0.04951, over 24471.00 frames. ], tot_loss[loss=0.1975, simple_loss=0.269, pruned_loss=0.06304, over 4718810.77 frames. ], batch size: 69, lr: 8.62e-03, grad_scale: 32.0 2023-09-29 15:35:32,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 15:35:34,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 15:35:34,400 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 15:35:34,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 15:35:35,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:35:36,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:35:37,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 15:35:37,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:35:37,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:35:39,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 15:35:41,443 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:35:42,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:35:43,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:35:43,222 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=409886.6666666667, ans=0.125 2023-09-29 15:35:48,066 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:35:51,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 15:35:58,289 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.07 vs. limit=12.0 2023-09-29 15:35:59,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 15:35:59,175 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 15:35:59,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:36:01,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:36:01,854 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=20.07 vs. limit=22.5 2023-09-29 15:36:04,363 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:36:05,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:36:05,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:36:07,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:36:08,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 15:36:08,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:36:10,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:36:10,934 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:36:12,382 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:36:13,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:36:16,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:36:17,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 15:36:17,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:36:17,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 15:36:20,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:36:20,932 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:36:22,430 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:36:22,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:36:28,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:36:28,827 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:36:37,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:36:37,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:36:37,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:36:40,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:36:40,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 15:36:42,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:36:42,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 15:36:44,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:36:44,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:36:47,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 15:36:50,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:36:54,185 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.685e+02 1.862e+02 2.008e+02 2.262e+02 2.937e+02, threshold=4.017e+02, percent-clipped=0.0 2023-09-29 15:36:54,229 INFO [train.py:1039] (2/4) Epoch 12, batch 3100, loss[loss=0.1887, simple_loss=0.2654, pruned_loss=0.05593, over 24345.00 frames. ], tot_loss[loss=0.1981, simple_loss=0.269, pruned_loss=0.06358, over 4711409.62 frames. ], batch size: 61, lr: 8.62e-03, grad_scale: 32.0 2023-09-29 15:36:54,523 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:36:54,813 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 15:36:56,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:36:59,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 15:37:01,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 15:37:03,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 15:37:05,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 15:37:05,471 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=410220.0, ans=0.125 2023-09-29 15:37:07,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:37:10,390 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:37:10,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:37:12,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 15:37:15,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:37:15,931 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.30 vs. limit=22.5 2023-09-29 15:37:17,104 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=410286.6666666667, ans=0.1 2023-09-29 15:37:23,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 15:37:26,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 15:37:27,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:37:27,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:37:27,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:37:29,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 15:37:30,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:37:30,783 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 15:37:30,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:37:32,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:37:32,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 15:37:33,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:37:39,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:37:40,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 15:37:41,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 15:37:42,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:37:43,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:37:45,172 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:37:45,190 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:37:46,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:37:46,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 15:37:46,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:37:50,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:37:50,622 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:37:50,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:37:50,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 15:37:56,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:37:56,699 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=410420.0, ans=0.125 2023-09-29 15:37:57,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 15:37:58,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:37:59,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 15:37:59,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:37:59,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:38:01,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 15:38:06,194 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=410486.6666666667, ans=0.0 2023-09-29 15:38:12,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 15:38:14,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:38:15,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:38:16,981 INFO [train.py:1039] (2/4) Epoch 12, batch 3150, loss[loss=0.1721, simple_loss=0.2575, pruned_loss=0.04334, over 24644.00 frames. ], tot_loss[loss=0.1967, simple_loss=0.2674, pruned_loss=0.06297, over 4703932.88 frames. ], batch size: 68, lr: 8.62e-03, grad_scale: 32.0 2023-09-29 15:38:17,107 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:38:17,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:38:17,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 15:38:18,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:38:20,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 15:38:22,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 15:38:23,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:38:27,333 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 15:38:29,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 15:38:31,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:38:32,536 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 15:38:34,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 15:38:34,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 15:38:35,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 15:38:35,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 15:38:35,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:38:35,704 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:38:37,248 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:38:40,398 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 15:38:40,700 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=410620.0, ans=0.125 2023-09-29 15:38:41,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:38:41,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:38:44,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:38:45,953 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 15:38:49,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 15:38:50,513 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:38:53,624 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 15:38:53,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:38:53,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 15:38:57,516 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=410686.6666666667, ans=0.0 2023-09-29 15:38:58,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 15:38:58,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:39:00,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 15:39:00,225 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 15:39:00,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:39:00,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 15:39:02,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 15:39:02,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 15:39:04,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 15:39:06,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 15:39:06,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:39:07,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:39:07,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:39:07,852 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 15:39:07,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:39:09,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 15:39:10,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:39:11,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 15:39:12,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 15:39:12,824 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:39:14,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:39:14,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 15:39:16,469 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 15:39:17,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:39:19,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:39:21,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:39:21,145 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:39:27,198 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:39:27,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:39:27,549 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=410820.0, ans=0.125 2023-09-29 15:39:31,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 15:39:37,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:39:37,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 15:39:41,205 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.622e+02 2.037e+02 2.402e+02 2.745e+02 4.943e+02, threshold=4.804e+02, percent-clipped=1.0 2023-09-29 15:39:41,247 INFO [train.py:1039] (2/4) Epoch 12, batch 3200, loss[loss=0.1973, simple_loss=0.2619, pruned_loss=0.0663, over 23818.00 frames. ], tot_loss[loss=0.1956, simple_loss=0.2668, pruned_loss=0.06226, over 4698253.48 frames. ], batch size: 164, lr: 8.61e-03, grad_scale: 32.0 2023-09-29 15:39:42,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:39:43,008 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:39:43,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 15:39:46,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:39:48,331 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=410886.6666666667, ans=0.0 2023-09-29 15:39:49,667 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 15:39:52,848 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:40:00,575 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=410953.3333333333, ans=0.125 2023-09-29 15:40:00,600 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=410953.3333333333, ans=0.2 2023-09-29 15:40:03,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:40:14,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 15:40:15,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:40:18,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 15:40:19,174 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=411020.0, ans=0.125 2023-09-29 15:40:20,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 15:40:22,640 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=411020.0, ans=0.125 2023-09-29 15:40:23,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:40:23,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 15:40:25,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:40:29,933 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 15:40:31,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 15:40:33,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 15:40:35,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 15:40:38,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 15:40:43,164 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:40:45,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:40:45,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:40:45,389 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 15:40:45,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 15:40:48,315 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten.whitening_limit, batch_count=411153.3333333333, ans=22.5 2023-09-29 15:40:49,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:40:50,627 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 15:40:52,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 15:40:54,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 15:40:54,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 15:40:55,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:41:00,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 15:41:00,130 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 15:41:00,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:41:00,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:41:01,669 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 15:41:03,201 INFO [train.py:1039] (2/4) Epoch 12, batch 3250, loss[loss=0.2043, simple_loss=0.2661, pruned_loss=0.07127, over 23803.00 frames. ], tot_loss[loss=0.1958, simple_loss=0.267, pruned_loss=0.06224, over 4700577.81 frames. ], batch size: 212, lr: 8.61e-03, grad_scale: 32.0 2023-09-29 15:41:06,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 15:41:08,108 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=411220.0, ans=0.1 2023-09-29 15:41:10,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:41:10,478 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=411220.0, ans=0.2 2023-09-29 15:41:19,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:41:19,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 15:41:20,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:41:20,076 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:41:20,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:41:23,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:41:23,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 15:41:25,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:41:26,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:41:26,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:41:26,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:41:26,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:41:28,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:41:30,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:41:32,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:41:33,797 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=411286.6666666667, ans=0.125 2023-09-29 15:41:34,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:41:35,019 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:41:37,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:41:37,903 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:41:37,919 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:41:43,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 15:41:44,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:41:45,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:41:46,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:41:47,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 15:41:52,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 15:42:00,841 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:42:00,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:42:00,886 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 15:42:00,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:42:00,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 15:42:00,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:42:04,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 15:42:04,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 15:42:06,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:42:07,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:42:07,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:42:09,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 15:42:09,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:42:14,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:42:14,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:42:17,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 15:42:17,631 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:42:20,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:42:20,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 15:42:21,016 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=411486.6666666667, ans=0.2 2023-09-29 15:42:22,364 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=411486.6666666667, ans=0.1 2023-09-29 15:42:25,042 INFO [train.py:1039] (2/4) Epoch 12, batch 3300, loss[loss=0.1844, simple_loss=0.2506, pruned_loss=0.05909, over 23483.00 frames. ], tot_loss[loss=0.1956, simple_loss=0.267, pruned_loss=0.06208, over 4706354.76 frames. ], batch size: 119, lr: 8.61e-03, grad_scale: 16.0 2023-09-29 15:42:25,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:42:25,176 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 15:42:26,587 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.957e+02 2.272e+02 2.906e+02 4.656e+02, threshold=4.545e+02, percent-clipped=0.0 2023-09-29 15:42:26,793 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 15:42:28,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 15:42:28,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:42:28,540 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 15:42:33,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:42:34,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:42:36,289 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:42:36,634 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=411553.3333333333, ans=0.125 2023-09-29 15:42:38,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 15:42:38,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 15:42:40,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:42:42,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:42:45,788 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=411620.0, ans=0.0 2023-09-29 15:42:46,836 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 15:42:46,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:42:46,978 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:42:47,302 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=411620.0, ans=0.1 2023-09-29 15:42:48,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:42:48,564 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 15:42:50,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:42:50,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 15:42:51,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 15:42:51,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:42:53,196 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 15:42:55,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:42:55,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 15:42:58,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:42:58,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 15:42:58,908 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=411686.6666666667, ans=0.0 2023-09-29 15:43:01,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 15:43:01,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:43:02,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:43:04,551 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 15:43:06,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 15:43:08,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:43:11,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 15:43:13,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:43:15,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 15:43:16,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:43:20,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:43:20,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:43:20,184 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:43:20,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 15:43:24,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:43:24,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:43:26,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:43:27,580 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 15:43:28,105 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.53 vs. limit=12.0 2023-09-29 15:43:29,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 15:43:30,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 15:43:30,699 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:43:30,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:43:34,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:43:34,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:43:35,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 15:43:36,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:43:36,101 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 15:43:37,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:43:39,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 15:43:42,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 15:43:42,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:43:42,878 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=411820.0, ans=0.035 2023-09-29 15:43:44,288 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:43:47,607 INFO [train.py:1039] (2/4) Epoch 12, batch 3350, loss[loss=0.2121, simple_loss=0.2756, pruned_loss=0.07432, over 22757.00 frames. ], tot_loss[loss=0.1966, simple_loss=0.2678, pruned_loss=0.0627, over 4700669.22 frames. ], batch size: 322, lr: 8.60e-03, grad_scale: 16.0 2023-09-29 15:43:47,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 15:43:47,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:43:49,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:43:50,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:43:50,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:43:54,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:43:56,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:43:57,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:44:00,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:44:02,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:44:03,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:44:05,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:44:05,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 15:44:05,385 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 15:44:06,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:44:08,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 15:44:08,668 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 15:44:12,175 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:44:12,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:44:12,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:44:12,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 15:44:13,288 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=4.37 vs. limit=12.0 2023-09-29 15:44:13,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:44:13,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:44:17,099 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:44:18,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:44:18,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:44:20,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:44:23,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:44:28,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:44:28,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:44:32,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:44:33,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:44:35,986 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:44:36,000 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:44:37,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:44:40,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 15:44:40,701 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 15:44:42,215 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 15:44:42,302 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:44:44,555 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 15:44:45,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:44:46,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:44:53,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:44:54,681 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.92 vs. limit=22.5 2023-09-29 15:44:55,005 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 15:44:55,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 15:44:55,217 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:44:56,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:45:02,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:45:05,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 15:45:05,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 15:45:05,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:45:08,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:45:08,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 15:45:09,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:45:09,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 15:45:11,200 INFO [train.py:1039] (2/4) Epoch 12, batch 3400, loss[loss=0.2137, simple_loss=0.292, pruned_loss=0.06775, over 24464.00 frames. ], tot_loss[loss=0.1978, simple_loss=0.2691, pruned_loss=0.06319, over 4705568.97 frames. ], batch size: 69, lr: 8.60e-03, grad_scale: 16.0 2023-09-29 15:45:11,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:45:11,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:45:11,435 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 15:45:13,409 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.428e+02 1.868e+02 2.094e+02 2.439e+02 4.049e+02, threshold=4.189e+02, percent-clipped=0.0 2023-09-29 15:45:13,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:45:13,781 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=412220.0, ans=0.1 2023-09-29 15:45:15,033 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 15:45:18,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 15:45:18,188 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 15:45:18,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:45:21,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:45:21,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 15:45:22,957 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:45:25,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 15:45:33,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:45:36,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 15:45:42,702 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:45:44,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:45:44,452 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:45:46,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 15:45:50,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:45:50,298 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=412353.3333333333, ans=0.125 2023-09-29 15:45:54,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 15:46:00,818 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:46:00,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:46:03,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 15:46:04,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:46:04,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:46:06,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:46:06,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:46:09,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:46:15,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 15:46:16,005 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:46:18,006 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=412486.6666666667, ans=0.125 2023-09-29 15:46:20,870 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:46:22,523 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 15:46:29,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 15:46:33,842 INFO [train.py:1039] (2/4) Epoch 12, batch 3450, loss[loss=0.2309, simple_loss=0.2759, pruned_loss=0.09297, over 19941.00 frames. ], tot_loss[loss=0.1982, simple_loss=0.2695, pruned_loss=0.06343, over 4706904.84 frames. ], batch size: 388, lr: 8.59e-03, grad_scale: 16.0 2023-09-29 15:46:33,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 15:46:34,834 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.82 vs. limit=15.0 2023-09-29 15:46:37,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 15:46:37,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:46:39,254 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:46:41,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 15:46:42,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:46:44,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 15:46:51,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:46:52,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:46:52,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 15:46:52,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:46:54,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:47:00,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 15:47:07,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 15:47:07,794 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 15:47:07,853 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:47:10,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:47:14,677 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=412686.6666666667, ans=0.015 2023-09-29 15:47:16,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 15:47:16,343 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=412686.6666666667, ans=0.0 2023-09-29 15:47:16,410 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=412686.6666666667, ans=0.0 2023-09-29 15:47:17,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 15:47:21,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:47:21,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:47:22,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 15:47:24,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:47:27,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 15:47:27,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:47:27,792 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=412753.3333333333, ans=0.2 2023-09-29 15:47:28,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:47:31,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:47:34,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 15:47:38,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:47:43,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:47:46,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:47:49,002 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:47:53,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:47:53,791 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:47:55,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:47:57,178 INFO [train.py:1039] (2/4) Epoch 12, batch 3500, loss[loss=0.1949, simple_loss=0.2627, pruned_loss=0.06358, over 23192.00 frames. ], tot_loss[loss=0.1973, simple_loss=0.2681, pruned_loss=0.06321, over 4705189.31 frames. ], batch size: 105, lr: 8.59e-03, grad_scale: 16.0 2023-09-29 15:47:57,257 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:47:58,601 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.929e+02 2.065e+02 2.305e+02 4.202e+02, threshold=4.129e+02, percent-clipped=1.0 2023-09-29 15:48:01,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:48:04,984 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:48:05,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 15:48:07,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 15:48:11,786 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 15:48:13,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:48:13,501 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 15:48:16,837 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:48:18,376 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:48:20,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:48:20,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:48:20,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 15:48:20,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:48:22,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:48:22,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 15:48:27,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:48:27,232 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 15:48:27,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:48:27,613 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=412953.3333333333, ans=0.125 2023-09-29 15:48:32,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:48:34,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 15:48:34,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:48:37,246 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:48:37,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:48:38,946 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:48:41,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:48:41,106 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:48:42,568 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 15:48:44,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 15:48:45,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 15:48:45,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:48:47,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:48:48,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:48:48,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 15:48:52,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 15:48:53,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:48:57,689 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:48:59,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 15:48:59,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 15:48:59,176 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:49:02,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:49:04,345 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:49:05,908 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:49:06,282 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=413153.3333333333, ans=0.1 2023-09-29 15:49:07,533 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 15:49:07,851 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=413153.3333333333, ans=0.1 2023-09-29 15:49:08,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:49:10,590 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:49:10,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 15:49:14,123 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 15:49:17,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:49:18,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:49:18,556 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:49:18,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:49:20,011 INFO [train.py:1039] (2/4) Epoch 12, batch 3550, loss[loss=0.1643, simple_loss=0.2373, pruned_loss=0.04559, over 21983.00 frames. ], tot_loss[loss=0.1963, simple_loss=0.2667, pruned_loss=0.06295, over 4697195.03 frames. ], batch size: 48, lr: 8.59e-03, grad_scale: 16.0 2023-09-29 15:49:21,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:49:33,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:49:33,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 15:49:39,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:49:39,314 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=413286.6666666667, ans=0.2 2023-09-29 15:49:40,522 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:49:42,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:49:43,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:49:43,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 15:49:46,744 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:49:46,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:49:48,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:49:48,467 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 15:49:50,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 15:49:55,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:49:55,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:49:56,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:49:56,898 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:49:58,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:49:58,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 15:49:58,385 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:50:01,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:50:03,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 15:50:10,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:50:10,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:50:11,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:50:13,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 15:50:14,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:50:15,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 15:50:17,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:50:18,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:50:18,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:50:21,690 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 15:50:23,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:50:28,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:50:28,900 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 15:50:30,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:50:35,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:50:35,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 15:50:41,125 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=413486.6666666667, ans=0.125 2023-09-29 15:50:43,734 INFO [train.py:1039] (2/4) Epoch 12, batch 3600, loss[loss=0.1953, simple_loss=0.2561, pruned_loss=0.0672, over 23428.00 frames. ], tot_loss[loss=0.1952, simple_loss=0.2659, pruned_loss=0.0622, over 4702959.83 frames. ], batch size: 285, lr: 8.58e-03, grad_scale: 32.0 2023-09-29 15:50:43,830 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 15:50:43,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:50:43,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:50:45,968 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.608e+02 1.995e+02 2.200e+02 2.637e+02 4.261e+02, threshold=4.399e+02, percent-clipped=1.0 2023-09-29 15:50:47,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:50:47,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:50:49,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:50:53,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:50:55,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:50:57,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:50:57,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:50:58,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:50:58,639 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 15:51:02,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 15:51:02,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:51:05,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:51:09,357 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:51:11,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 15:51:11,996 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:51:12,035 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 15:51:12,131 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:51:15,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:51:17,180 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:51:18,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:51:22,553 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:51:22,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:51:24,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 15:51:31,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:51:33,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 15:51:33,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 15:51:39,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:51:41,801 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=413753.3333333333, ans=0.0 2023-09-29 15:51:45,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:51:48,303 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:51:53,200 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=413820.0, ans=0.0 2023-09-29 15:51:56,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 15:51:56,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 15:51:56,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 15:51:58,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 15:51:59,881 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 15:52:02,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:52:03,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:52:04,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 15:52:05,929 INFO [train.py:1039] (2/4) Epoch 12, batch 3650, loss[loss=0.2063, simple_loss=0.2765, pruned_loss=0.06803, over 23446.00 frames. ], tot_loss[loss=0.1951, simple_loss=0.2662, pruned_loss=0.06198, over 4703955.81 frames. ], batch size: 93, lr: 8.58e-03, grad_scale: 32.0 2023-09-29 15:52:05,996 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:52:06,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:52:06,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:52:07,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 15:52:07,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 15:52:10,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:52:13,021 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 15:52:13,273 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=413886.6666666667, ans=0.0 2023-09-29 15:52:17,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 15:52:19,320 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:52:22,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 15:52:24,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 15:52:29,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:52:29,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:52:29,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 15:52:32,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 15:52:34,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:52:34,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 15:52:34,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 15:52:36,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:52:36,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 15:52:37,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 15:52:39,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:52:39,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:52:42,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:52:43,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 15:52:45,499 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 15:52:46,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:52:47,236 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=414020.0, ans=0.125 2023-09-29 15:52:49,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 15:52:50,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:52:50,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 15:52:57,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:52:59,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:52:59,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 15:53:00,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:53:00,788 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=414086.6666666667, ans=0.0 2023-09-29 15:53:02,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:53:04,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:53:07,385 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:53:09,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:53:09,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:53:12,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 15:53:14,130 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:53:14,213 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:53:20,570 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 15:53:23,620 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:53:23,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:53:27,085 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 15:53:27,153 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:53:28,588 INFO [train.py:1039] (2/4) Epoch 12, batch 3700, loss[loss=0.1923, simple_loss=0.2727, pruned_loss=0.05592, over 24641.00 frames. ], tot_loss[loss=0.196, simple_loss=0.2673, pruned_loss=0.06236, over 4714263.77 frames. ], batch size: 68, lr: 8.58e-03, grad_scale: 32.0 2023-09-29 15:53:28,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:53:28,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:53:30,805 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.903e+02 2.176e+02 2.360e+02 3.995e+02, threshold=4.353e+02, percent-clipped=0.0 2023-09-29 15:53:31,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 15:53:31,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:53:32,752 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 15:53:36,303 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:53:36,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:53:39,577 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:53:39,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 15:53:39,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:53:39,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 15:53:41,301 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 15:53:45,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 15:53:48,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:53:49,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:53:49,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:53:51,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:53:51,335 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 15:53:52,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:53:54,487 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 15:54:04,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:54:05,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 15:54:07,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 15:54:07,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 15:54:07,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:54:10,395 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.33 vs. limit=15.0 2023-09-29 15:54:11,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:54:11,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 15:54:13,842 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.65 vs. limit=22.5 2023-09-29 15:54:14,652 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:54:14,925 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=414353.3333333333, ans=0.125 2023-09-29 15:54:16,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:54:17,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:54:17,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 15:54:20,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 15:54:24,900 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:54:26,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 15:54:26,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:54:27,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 15:54:28,808 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.19 vs. limit=15.0 2023-09-29 15:54:31,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:54:31,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 15:54:35,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:54:36,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 15:54:38,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:54:38,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 15:54:39,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:54:39,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:54:43,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:54:44,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 15:54:46,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 15:54:46,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:54:46,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:54:48,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:54:48,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:54:50,968 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=414486.6666666667, ans=0.2 2023-09-29 15:54:53,605 INFO [train.py:1039] (2/4) Epoch 12, batch 3750, loss[loss=0.2001, simple_loss=0.2798, pruned_loss=0.06022, over 24646.00 frames. ], tot_loss[loss=0.1967, simple_loss=0.2683, pruned_loss=0.06255, over 4718146.35 frames. ], batch size: 68, lr: 8.57e-03, grad_scale: 32.0 2023-09-29 15:54:53,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:54:55,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:54:56,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:54:58,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 15:54:58,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 15:55:01,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 15:55:03,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 15:55:03,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:55:04,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:55:04,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:55:06,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:55:11,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:55:14,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:55:16,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 15:55:21,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:55:22,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:55:23,309 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=414620.0, ans=0.0 2023-09-29 15:55:24,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 15:55:24,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:55:26,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:55:26,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:55:28,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 15:55:31,734 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=414686.6666666667, ans=0.125 2023-09-29 15:55:34,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 15:55:34,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:55:34,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:55:37,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:55:42,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:55:44,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 15:55:49,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 15:55:52,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:55:53,000 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=414753.3333333333, ans=0.125 2023-09-29 15:55:56,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:55:56,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:56:01,171 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:56:03,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 15:56:04,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 15:56:06,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 15:56:06,420 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=414820.0, ans=0.0 2023-09-29 15:56:07,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:56:09,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 15:56:15,927 INFO [train.py:1039] (2/4) Epoch 12, batch 3800, loss[loss=0.1919, simple_loss=0.2663, pruned_loss=0.0587, over 24313.00 frames. ], tot_loss[loss=0.1976, simple_loss=0.2687, pruned_loss=0.06325, over 4721801.84 frames. ], batch size: 61, lr: 8.57e-03, grad_scale: 8.0 2023-09-29 15:56:16,486 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=414886.6666666667, ans=0.0 2023-09-29 15:56:19,200 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 15:56:21,132 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 2.017e+02 2.225e+02 2.467e+02 3.965e+02, threshold=4.450e+02, percent-clipped=0.0 2023-09-29 15:56:24,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:56:24,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 15:56:25,838 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 15:56:26,221 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=414886.6666666667, ans=0.125 2023-09-29 15:56:27,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:56:29,562 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:56:31,174 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 15:56:33,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 15:56:33,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:56:34,857 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 15:56:36,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:56:36,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:56:37,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:56:39,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 15:56:39,897 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=414953.3333333333, ans=0.0 2023-09-29 15:56:43,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 15:56:44,026 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:56:46,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:56:49,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:56:49,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 15:56:49,764 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=415020.0, ans=0.1 2023-09-29 15:56:51,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 15:56:51,801 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=415020.0, ans=0.0 2023-09-29 15:56:52,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:56:54,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:56:55,338 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.67 vs. limit=10.0 2023-09-29 15:56:57,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:57:02,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 15:57:02,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 15:57:02,344 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=415020.0, ans=0.0 2023-09-29 15:57:03,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:57:11,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:57:15,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:57:19,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 15:57:21,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 15:57:21,344 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:57:21,530 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=415153.3333333333, ans=0.2 2023-09-29 15:57:24,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:57:24,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:57:25,456 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=415153.3333333333, ans=0.125 2023-09-29 15:57:26,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 15:57:31,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 15:57:31,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 15:57:31,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:57:32,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:57:33,225 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.26 vs. limit=10.0 2023-09-29 15:57:38,727 INFO [train.py:1039] (2/4) Epoch 12, batch 3850, loss[loss=0.1803, simple_loss=0.2589, pruned_loss=0.05091, over 24658.00 frames. ], tot_loss[loss=0.197, simple_loss=0.2673, pruned_loss=0.06334, over 4705109.43 frames. ], batch size: 65, lr: 8.57e-03, grad_scale: 4.0 2023-09-29 15:57:38,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:57:40,427 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 15:57:45,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:57:47,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 15:57:47,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 15:57:47,746 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:57:51,035 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=415220.0, ans=0.125 2023-09-29 15:57:53,016 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 15:57:55,934 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:57:57,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 15:57:57,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 15:58:05,836 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:58:07,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:58:10,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:58:10,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:58:13,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:58:13,860 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:58:15,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:58:15,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:58:15,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:58:15,760 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=415353.3333333333, ans=0.125 2023-09-29 15:58:15,819 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=415353.3333333333, ans=0.05 2023-09-29 15:58:17,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:58:19,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:58:19,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:58:20,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 15:58:22,076 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 15:58:22,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:58:22,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:58:25,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:58:27,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:58:27,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 15:58:30,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 15:58:30,886 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=415420.0, ans=0.125 2023-09-29 15:58:32,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:58:34,335 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 15:58:35,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 15:58:42,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:58:43,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:58:46,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:58:47,077 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=415486.6666666667, ans=0.2 2023-09-29 15:58:48,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 15:58:50,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 15:58:53,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:58:55,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:58:57,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 15:58:57,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:58:58,491 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:59:00,022 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:59:00,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:59:00,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 15:59:01,982 INFO [train.py:1039] (2/4) Epoch 12, batch 3900, loss[loss=0.1752, simple_loss=0.2505, pruned_loss=0.05001, over 24561.00 frames. ], tot_loss[loss=0.1954, simple_loss=0.2659, pruned_loss=0.06245, over 4709728.60 frames. ], batch size: 60, lr: 8.56e-03, grad_scale: 8.0 2023-09-29 15:59:02,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:59:03,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 15:59:03,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:59:03,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:59:04,497 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.66 vs. limit=15.0 2023-09-29 15:59:05,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:59:05,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:59:07,012 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:59:07,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:59:07,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:59:07,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:59:07,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 15:59:08,540 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.480e+02 1.940e+02 2.154e+02 2.415e+02 3.457e+02, threshold=4.308e+02, percent-clipped=0.0 2023-09-29 15:59:08,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:59:12,438 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:59:15,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 15:59:15,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:59:16,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:59:18,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 15:59:18,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:59:21,676 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 15:59:21,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 15:59:21,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:59:25,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 15:59:25,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:59:27,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 15:59:30,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 15:59:33,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:59:35,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:59:35,237 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 15:59:35,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:59:38,241 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.80 vs. limit=15.0 2023-09-29 15:59:39,153 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=415686.6666666667, ans=0.1 2023-09-29 15:59:40,893 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 15:59:42,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:59:44,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:59:45,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:59:45,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:59:47,197 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:59:47,464 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=415686.6666666667, ans=0.125 2023-09-29 15:59:49,038 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=415686.6666666667, ans=0.0 2023-09-29 15:59:54,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:59:54,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:00:01,775 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=415753.3333333333, ans=0.125 2023-09-29 16:00:02,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 16:00:05,021 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:00:13,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:00:18,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:00:18,606 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 16:00:20,117 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 16:00:20,138 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:00:22,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 16:00:24,270 INFO [train.py:1039] (2/4) Epoch 12, batch 3950, loss[loss=0.1896, simple_loss=0.274, pruned_loss=0.05259, over 24282.00 frames. ], tot_loss[loss=0.1945, simple_loss=0.265, pruned_loss=0.06202, over 4694995.66 frames. ], batch size: 74, lr: 8.56e-03, grad_scale: 8.0 2023-09-29 16:00:24,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:00:24,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 16:00:31,041 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=415886.6666666667, ans=0.0 2023-09-29 16:00:32,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:00:33,847 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 16:00:33,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:00:35,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:00:37,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:00:44,304 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 16:00:45,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:00:45,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 16:00:45,856 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 16:00:45,895 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:00:48,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:00:48,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:00:48,365 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:00:49,335 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.31 vs. limit=6.0 2023-09-29 16:00:53,789 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 16:00:55,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:00:56,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:00:56,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:00:56,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:00:58,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 16:01:08,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:01:08,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:01:10,501 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=416020.0, ans=0.125 2023-09-29 16:01:10,624 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=416020.0, ans=0.125 2023-09-29 16:01:16,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 16:01:22,043 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 16:01:22,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 16:01:23,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:01:23,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:01:27,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=416086.6666666667, ans=0.125 2023-09-29 16:01:31,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 16:01:33,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 16:01:33,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:01:33,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:01:33,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 16:01:40,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:01:42,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:01:46,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 16:01:47,924 INFO [train.py:1039] (2/4) Epoch 12, batch 4000, loss[loss=0.1849, simple_loss=0.262, pruned_loss=0.05393, over 24319.00 frames. ], tot_loss[loss=0.1949, simple_loss=0.2659, pruned_loss=0.06194, over 4698811.63 frames. ], batch size: 61, lr: 8.56e-03, grad_scale: 16.0 2023-09-29 16:01:50,398 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=416220.0, ans=0.1 2023-09-29 16:01:52,029 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=416220.0, ans=0.125 2023-09-29 16:01:55,122 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 2.007e+02 2.286e+02 2.878e+02 4.961e+02, threshold=4.572e+02, percent-clipped=2.0 2023-09-29 16:01:55,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:02:01,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:02:08,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:02:08,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:02:08,386 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:02:08,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 16:02:09,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 16:02:10,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 16:02:10,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:02:10,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 16:02:13,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:02:13,364 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=416286.6666666667, ans=0.0 2023-09-29 16:02:14,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:02:14,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:02:14,905 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:02:16,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:02:16,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 16:02:18,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:02:21,475 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 16:02:21,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:02:23,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:02:25,417 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 16:02:27,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 16:02:27,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:02:29,239 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=416353.3333333333, ans=0.125 2023-09-29 16:02:32,098 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=416353.3333333333, ans=0.05 2023-09-29 16:02:32,185 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=416353.3333333333, ans=0.125 2023-09-29 16:02:35,887 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=416353.3333333333, ans=0.2 2023-09-29 16:02:37,093 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 16:02:37,182 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:02:40,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:02:41,880 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 16:02:43,413 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:02:43,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 16:02:43,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:02:44,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:02:45,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:02:46,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:02:46,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:02:46,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:02:47,005 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=416420.0, ans=0.125 2023-09-29 16:02:49,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 16:02:49,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:02:51,408 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 16:02:56,651 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=416486.6666666667, ans=0.015 2023-09-29 16:02:56,740 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=416486.6666666667, ans=0.1 2023-09-29 16:02:58,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:02:58,336 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=416486.6666666667, ans=0.2 2023-09-29 16:03:01,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 16:03:04,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 16:03:04,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:03:04,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:03:08,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:03:08,815 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.08 vs. limit=22.5 2023-09-29 16:03:10,685 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=416553.3333333333, ans=0.1 2023-09-29 16:03:11,744 INFO [train.py:1039] (2/4) Epoch 12, batch 4050, loss[loss=0.1978, simple_loss=0.2686, pruned_loss=0.0635, over 23715.00 frames. ], tot_loss[loss=0.1965, simple_loss=0.2675, pruned_loss=0.06271, over 4696615.33 frames. ], batch size: 149, lr: 8.55e-03, grad_scale: 16.0 2023-09-29 16:03:11,961 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:03:14,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 16:03:14,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 16:03:16,541 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 16:03:16,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:03:18,015 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:03:19,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 16:03:21,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:03:24,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=416553.3333333333, ans=0.0 2023-09-29 16:03:25,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:03:27,246 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:03:29,320 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 16:03:30,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:03:31,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:03:36,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:03:38,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 16:03:41,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 16:03:42,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 16:03:44,869 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 16:03:46,658 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=416686.6666666667, ans=0.07 2023-09-29 16:03:47,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:03:49,934 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=416686.6666666667, ans=0.125 2023-09-29 16:03:52,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 16:03:53,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:03:56,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:03:59,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:04:01,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:04:01,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:04:04,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:04:07,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 16:04:07,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 16:04:09,101 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:04:11,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 16:04:13,318 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=416753.3333333333, ans=0.1 2023-09-29 16:04:17,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:04:23,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 16:04:24,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:04:24,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 16:04:26,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 16:04:26,291 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 16:04:26,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:04:26,605 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:04:29,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:04:30,667 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:04:30,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:04:34,241 INFO [train.py:1039] (2/4) Epoch 12, batch 4100, loss[loss=0.2207, simple_loss=0.2823, pruned_loss=0.07955, over 23689.00 frames. ], tot_loss[loss=0.1975, simple_loss=0.2684, pruned_loss=0.06332, over 4704011.19 frames. ], batch size: 232, lr: 8.55e-03, grad_scale: 8.0 2023-09-29 16:04:37,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 16:04:39,346 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:04:40,528 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 16:04:40,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 16:04:42,525 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 2.020e+02 2.338e+02 2.754e+02 3.996e+02, threshold=4.676e+02, percent-clipped=0.0 2023-09-29 16:04:42,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 16:04:42,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:04:44,136 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:04:44,219 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:04:45,645 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:04:47,227 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 16:04:49,579 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:04:50,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:04:51,020 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:04:52,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:04:55,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 16:04:56,112 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=416953.3333333333, ans=0.125 2023-09-29 16:04:57,340 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:04:57,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:04:57,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 16:04:58,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:04:58,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:04:59,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:04:59,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:04:59,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 16:05:00,742 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.min_positive, batch_count=416953.3333333333, ans=0.025 2023-09-29 16:05:02,872 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=416953.3333333333, ans=0.125 2023-09-29 16:05:04,131 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:05:05,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 16:05:07,130 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:05:08,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:05:08,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 16:05:10,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:05:10,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:05:11,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:05:13,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 16:05:15,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:05:16,433 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 16:05:18,031 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 16:05:20,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:05:20,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 16:05:23,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:05:29,898 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:05:32,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:05:34,983 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:05:42,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:05:42,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:05:47,585 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=417153.3333333333, ans=0.2 2023-09-29 16:05:47,912 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.91 vs. limit=15.0 2023-09-29 16:05:48,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:05:50,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:05:52,095 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 16:05:53,602 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 16:05:55,455 INFO [train.py:1039] (2/4) Epoch 12, batch 4150, loss[loss=0.2007, simple_loss=0.261, pruned_loss=0.07021, over 23476.00 frames. ], tot_loss[loss=0.1966, simple_loss=0.2677, pruned_loss=0.06278, over 4713319.83 frames. ], batch size: 134, lr: 8.55e-03, grad_scale: 8.0 2023-09-29 16:05:55,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:05:55,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:05:58,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 16:05:59,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:05:59,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 16:06:00,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 16:06:01,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 16:06:02,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:06:06,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:06:06,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:06:08,585 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=417220.0, ans=0.125 2023-09-29 16:06:11,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:06:12,934 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:06:14,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 16:06:16,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 16:06:17,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:06:17,707 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 16:06:17,991 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=417286.6666666667, ans=0.0 2023-09-29 16:06:22,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:06:26,158 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=417286.6666666667, ans=0.1 2023-09-29 16:06:27,341 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:06:29,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 16:06:29,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 16:06:29,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:06:31,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 16:06:31,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:06:31,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:06:35,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:06:36,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:06:41,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 16:06:46,341 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 16:06:46,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:06:47,950 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 16:06:48,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:06:50,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 16:06:52,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:06:55,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:06:55,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:06:57,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 16:06:57,203 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:06:57,206 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 16:07:00,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 16:07:03,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 16:07:03,266 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:07:03,273 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 16:07:05,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 16:07:06,678 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 16:07:06,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:07:06,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 16:07:08,924 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:07:10,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:07:10,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 16:07:10,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 16:07:12,468 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=417486.6666666667, ans=0.125 2023-09-29 16:07:16,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:07:18,211 INFO [train.py:1039] (2/4) Epoch 12, batch 4200, loss[loss=0.1861, simple_loss=0.27, pruned_loss=0.05114, over 24448.00 frames. ], tot_loss[loss=0.1962, simple_loss=0.2669, pruned_loss=0.06271, over 4709493.71 frames. ], batch size: 69, lr: 8.54e-03, grad_scale: 8.0 2023-09-29 16:07:18,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 16:07:20,520 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 16:07:22,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:07:25,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:07:26,353 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.483e+02 1.937e+02 2.271e+02 2.682e+02 4.339e+02, threshold=4.541e+02, percent-clipped=0.0 2023-09-29 16:07:26,506 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:07:26,509 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:07:29,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 16:07:32,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 16:07:32,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:07:35,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:07:37,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:07:39,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 16:07:41,342 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:07:42,786 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:07:42,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 16:07:42,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:07:45,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:07:45,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:07:45,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 16:07:49,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:07:49,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 16:07:49,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:07:51,183 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=417686.6666666667, ans=0.0 2023-09-29 16:07:51,326 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=417686.6666666667, ans=0.2 2023-09-29 16:07:53,135 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=417686.6666666667, ans=0.125 2023-09-29 16:07:54,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 16:07:56,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:07:59,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:08:00,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:08:02,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:08:02,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 16:08:02,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:08:05,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:08:08,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 16:08:10,268 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:08:17,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 16:08:18,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 16:08:20,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:08:27,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 16:08:27,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:08:30,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 16:08:34,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 16:08:35,173 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=417820.0, ans=0.125 2023-09-29 16:08:39,279 INFO [train.py:1039] (2/4) Epoch 12, batch 4250, loss[loss=0.2145, simple_loss=0.2908, pruned_loss=0.06912, over 24629.00 frames. ], tot_loss[loss=0.1949, simple_loss=0.2657, pruned_loss=0.06208, over 4708491.25 frames. ], batch size: 68, lr: 8.54e-03, grad_scale: 8.0 2023-09-29 16:08:39,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:08:39,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 16:08:41,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:08:48,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:08:49,474 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 16:08:49,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:08:51,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:08:57,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:09:01,225 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=417953.3333333333, ans=0.0 2023-09-29 16:09:02,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:09:02,574 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:09:04,254 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:09:04,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:09:05,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:09:05,935 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:09:07,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:09:09,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:09:10,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:09:12,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 16:09:16,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 16:09:18,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:09:19,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:09:19,073 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:09:21,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:09:21,061 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:09:21,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:09:23,696 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=418020.0, ans=0.125 2023-09-29 16:09:26,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 16:09:26,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:09:31,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:09:33,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:09:33,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 16:09:34,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:09:34,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 16:09:36,401 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 16:09:37,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:09:39,802 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=418086.6666666667, ans=0.0 2023-09-29 16:09:40,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:09:40,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:09:43,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 16:09:44,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 16:09:45,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 16:09:46,025 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=418153.3333333333, ans=0.2 2023-09-29 16:09:50,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:09:50,697 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=418153.3333333333, ans=0.125 2023-09-29 16:09:51,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:09:54,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:09:56,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:09:57,685 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:09:59,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:09:59,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:09:59,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 16:10:01,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:10:03,570 INFO [train.py:1039] (2/4) Epoch 12, batch 4300, loss[loss=0.1752, simple_loss=0.2478, pruned_loss=0.05128, over 24330.00 frames. ], tot_loss[loss=0.1948, simple_loss=0.2656, pruned_loss=0.06195, over 4694583.11 frames. ], batch size: 56, lr: 8.54e-03, grad_scale: 8.0 2023-09-29 16:10:08,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:10:08,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:10:11,195 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.525e+02 1.977e+02 2.264e+02 2.605e+02 3.860e+02, threshold=4.528e+02, percent-clipped=0.0 2023-09-29 16:10:11,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:10:17,767 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=418286.6666666667, ans=0.1 2023-09-29 16:10:18,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:10:18,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 16:10:20,548 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:10:22,175 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:10:22,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:10:22,237 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 16:10:25,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 16:10:29,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 16:10:32,949 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 16:10:32,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 16:10:34,339 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 16:10:34,653 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=418353.3333333333, ans=0.0 2023-09-29 16:10:36,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 16:10:38,965 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:10:42,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:10:42,503 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:10:42,650 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:10:44,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:10:45,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:10:45,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 16:10:45,888 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 16:10:48,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:10:50,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:10:50,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 16:10:50,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:10:50,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:10:50,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 16:10:50,689 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 16:10:52,130 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 16:10:53,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:10:53,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 16:10:53,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 16:10:55,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:10:57,051 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 16:10:59,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:11:02,158 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.70 vs. limit=15.0 2023-09-29 16:11:02,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:11:02,761 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:11:04,329 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 16:11:04,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=418420.0, ans=0.0 2023-09-29 16:11:06,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 16:11:06,466 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:11:06,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:11:06,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:11:08,074 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:11:09,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:11:13,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:11:13,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:11:14,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:11:20,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 16:11:22,268 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 16:11:25,275 INFO [train.py:1039] (2/4) Epoch 12, batch 4350, loss[loss=0.2024, simple_loss=0.2707, pruned_loss=0.0671, over 23656.00 frames. ], tot_loss[loss=0.1955, simple_loss=0.2664, pruned_loss=0.06232, over 4695765.67 frames. ], batch size: 149, lr: 8.53e-03, grad_scale: 8.0 2023-09-29 16:11:25,555 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:11:28,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:11:30,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:11:30,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:11:34,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:11:39,316 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:11:40,917 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=418620.0, ans=0.1 2023-09-29 16:11:43,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:11:44,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:11:46,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:11:49,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:11:50,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:11:55,720 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=418620.0, ans=0.1 2023-09-29 16:11:56,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 16:11:57,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:11:58,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:12:00,232 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=418686.6666666667, ans=0.0 2023-09-29 16:12:00,305 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=418686.6666666667, ans=0.0 2023-09-29 16:12:04,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:12:07,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 16:12:11,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:12:12,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 16:12:18,173 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 16:12:19,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:12:19,822 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 16:12:20,013 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=418753.3333333333, ans=0.1 2023-09-29 16:12:21,234 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 16:12:21,348 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 16:12:21,356 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:12:22,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:12:22,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:12:24,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:12:25,703 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:12:25,772 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:12:30,204 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 16:12:30,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:12:30,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:12:30,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:12:30,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 16:12:31,916 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 16:12:31,924 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 16:12:31,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 16:12:35,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:12:35,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:12:35,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:12:36,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:12:38,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 16:12:42,037 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 16:12:42,049 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:12:42,178 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:12:47,023 INFO [train.py:1039] (2/4) Epoch 12, batch 4400, loss[loss=0.2018, simple_loss=0.2787, pruned_loss=0.06247, over 23941.00 frames. ], tot_loss[loss=0.1969, simple_loss=0.2679, pruned_loss=0.06295, over 4703430.08 frames. ], batch size: 86, lr: 8.53e-03, grad_scale: 16.0 2023-09-29 16:12:47,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:12:47,132 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:12:50,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:12:50,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 16:12:52,247 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 16:12:52,308 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 16:12:52,360 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 16:12:54,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 16:12:54,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:12:55,958 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.634e+02 1.963e+02 2.169e+02 2.661e+02 4.171e+02, threshold=4.339e+02, percent-clipped=0.0 2023-09-29 16:12:56,253 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 16:12:59,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:13:00,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:13:00,904 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 16:13:05,357 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:13:05,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 16:13:05,444 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 16:13:05,716 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=418953.3333333333, ans=0.0 2023-09-29 16:13:09,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 16:13:10,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 16:13:10,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 16:13:10,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:13:11,629 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:13:13,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:13:14,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:13:16,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 16:13:16,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 16:13:17,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:13:18,499 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=419020.0, ans=0.0 2023-09-29 16:13:19,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:13:19,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:13:21,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:13:23,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:13:23,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 16:13:24,603 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 16:13:27,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:13:35,338 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:13:36,943 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 16:13:41,552 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:13:43,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:13:47,744 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:13:47,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 16:13:47,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:13:47,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:13:47,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:13:49,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 16:13:54,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 16:13:57,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 16:13:57,887 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=419153.3333333333, ans=0.125 2023-09-29 16:13:59,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 16:13:59,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:13:59,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 16:14:01,301 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:14:04,863 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:14:06,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 16:14:09,923 INFO [train.py:1039] (2/4) Epoch 12, batch 4450, loss[loss=0.1867, simple_loss=0.2539, pruned_loss=0.0598, over 23491.00 frames. ], tot_loss[loss=0.1976, simple_loss=0.2682, pruned_loss=0.06346, over 4697422.61 frames. ], batch size: 134, lr: 8.53e-03, grad_scale: 16.0 2023-09-29 16:14:10,418 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=419220.0, ans=0.125 2023-09-29 16:14:12,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:14:14,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:14:14,549 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 16:14:18,237 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.51 vs. limit=15.0 2023-09-29 16:14:23,503 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:14:24,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:14:27,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:14:28,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:14:32,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:14:33,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:14:35,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 16:14:35,650 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:14:36,281 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.70 vs. limit=10.0 2023-09-29 16:14:37,108 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:14:37,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:14:37,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:14:38,865 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 16:14:40,612 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=419353.3333333333, ans=0.1 2023-09-29 16:14:45,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:14:45,607 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:14:47,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:14:47,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:14:49,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:14:53,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 16:14:55,368 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 16:14:56,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 16:14:56,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:14:57,135 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:14:58,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:14:58,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 16:15:02,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 16:15:05,516 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=419420.0, ans=0.125 2023-09-29 16:15:06,753 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:15:08,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 16:15:08,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:15:08,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:15:10,228 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:15:10,242 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:15:10,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:15:13,704 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 16:15:15,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 16:15:17,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 16:15:19,775 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.40 vs. limit=6.0 2023-09-29 16:15:20,911 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:15:22,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:15:23,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:15:23,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 16:15:25,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 16:15:27,281 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=419486.6666666667, ans=0.0 2023-09-29 16:15:27,445 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=419486.6666666667, ans=0.0 2023-09-29 16:15:28,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 16:15:30,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:15:31,785 INFO [train.py:1039] (2/4) Epoch 12, batch 4500, loss[loss=0.2113, simple_loss=0.2735, pruned_loss=0.07451, over 23772.00 frames. ], tot_loss[loss=0.1981, simple_loss=0.2693, pruned_loss=0.06346, over 4696548.27 frames. ], batch size: 212, lr: 8.52e-03, grad_scale: 16.0 2023-09-29 16:15:32,207 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=419553.3333333333, ans=0.0 2023-09-29 16:15:35,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:15:37,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 16:15:37,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 16:15:38,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:15:40,292 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.479e+02 1.947e+02 2.224e+02 2.499e+02 3.956e+02, threshold=4.448e+02, percent-clipped=0.0 2023-09-29 16:15:40,995 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:15:44,118 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:15:44,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:15:45,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 16:15:47,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:15:47,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:15:47,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:16:00,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:16:00,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:16:03,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:16:04,001 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:16:05,582 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:16:13,458 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 16:16:18,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:16:23,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 16:16:27,047 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:16:27,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 16:16:27,205 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:16:27,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:16:30,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:16:30,968 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:16:33,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:16:34,031 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 16:16:34,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 16:16:34,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:16:39,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:16:39,063 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:16:39,377 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=419820.0, ans=0.2 2023-09-29 16:16:43,324 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:16:45,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:16:45,693 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.45 vs. limit=15.0 2023-09-29 16:16:46,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:16:48,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 16:16:50,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 16:16:50,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 16:16:54,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 16:16:56,291 INFO [train.py:1039] (2/4) Epoch 12, batch 4550, loss[loss=0.2333, simple_loss=0.2944, pruned_loss=0.08606, over 23283.00 frames. ], tot_loss[loss=0.197, simple_loss=0.2686, pruned_loss=0.06273, over 4710140.10 frames. ], batch size: 105, lr: 8.52e-03, grad_scale: 16.0 2023-09-29 16:16:59,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 16:16:59,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:17:02,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:17:04,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:17:07,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:17:08,089 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=419886.6666666667, ans=0.025 2023-09-29 16:17:10,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:17:13,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:17:15,501 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:17:15,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:17:15,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:17:18,559 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:17:18,624 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:17:22,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:17:24,352 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 16:17:26,303 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 16:17:26,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:17:28,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 16:17:33,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 16:17:34,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:17:36,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 16:17:37,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 16:17:40,315 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=420020.0, ans=0.125 2023-09-29 16:17:42,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:17:44,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:17:44,351 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 16:17:45,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 16:17:46,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:17:49,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:17:49,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:17:52,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:17:53,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 16:17:55,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 16:17:55,250 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:17:57,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 16:17:57,353 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 16:17:57,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:17:59,765 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:17:59,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:18:01,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:18:01,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:18:04,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 16:18:04,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 16:18:06,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:18:06,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 16:18:06,579 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=420153.3333333333, ans=0.05 2023-09-29 16:18:07,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 16:18:07,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:18:07,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 16:18:10,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 16:18:11,011 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:18:12,784 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=420153.3333333333, ans=0.125 2023-09-29 16:18:14,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:18:14,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:18:14,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 16:18:17,815 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:18:18,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 16:18:19,364 INFO [train.py:1039] (2/4) Epoch 12, batch 4600, loss[loss=0.2186, simple_loss=0.2751, pruned_loss=0.08107, over 23793.00 frames. ], tot_loss[loss=0.1958, simple_loss=0.2668, pruned_loss=0.06242, over 4699013.76 frames. ], batch size: 212, lr: 8.52e-03, grad_scale: 16.0 2023-09-29 16:18:22,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:18:23,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:18:24,269 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=420220.0, ans=0.05 2023-09-29 16:18:24,679 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.86 vs. limit=15.0 2023-09-29 16:18:25,558 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:18:25,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:18:26,186 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.98 vs. limit=15.0 2023-09-29 16:18:26,889 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.954e+02 2.198e+02 2.471e+02 4.636e+02, threshold=4.396e+02, percent-clipped=1.0 2023-09-29 16:18:27,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:18:27,236 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 16:18:29,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:18:36,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:18:36,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:18:36,866 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=420286.6666666667, ans=0.05 2023-09-29 16:18:38,168 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=420286.6666666667, ans=0.125 2023-09-29 16:18:39,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:18:47,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 16:18:47,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:18:50,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:18:52,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:18:52,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:18:57,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 16:18:57,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 16:18:59,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:19:04,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:19:04,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:19:04,665 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=420353.3333333333, ans=0.2 2023-09-29 16:19:06,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:19:11,660 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 16:19:13,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 16:19:18,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:19:19,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:19:22,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:19:22,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 16:19:22,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:19:23,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 16:19:24,026 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:19:24,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:19:26,268 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:19:26,381 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:19:27,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:19:29,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 16:19:30,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 16:19:30,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 16:19:30,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:19:32,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:19:32,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:19:34,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:19:34,394 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=420486.6666666667, ans=0.5 2023-09-29 16:19:43,631 INFO [train.py:1039] (2/4) Epoch 12, batch 4650, loss[loss=0.1694, simple_loss=0.2495, pruned_loss=0.04466, over 24448.00 frames. ], tot_loss[loss=0.1961, simple_loss=0.2674, pruned_loss=0.06239, over 4707597.21 frames. ], batch size: 63, lr: 8.51e-03, grad_scale: 8.0 2023-09-29 16:19:45,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:19:48,430 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=420553.3333333333, ans=0.125 2023-09-29 16:19:49,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:19:49,813 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=420553.3333333333, ans=0.125 2023-09-29 16:19:51,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:19:51,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:19:51,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:19:52,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:19:54,103 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:19:57,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 16:20:01,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:20:03,369 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 16:20:03,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:20:03,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 16:20:05,008 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:20:05,097 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 16:20:05,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 16:20:05,134 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:20:07,092 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:20:10,199 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 16:20:12,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:20:12,401 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 16:20:16,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:20:17,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 16:20:20,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:20:22,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:20:22,841 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 16:20:24,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:20:27,990 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.46 vs. limit=22.5 2023-09-29 16:20:28,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:20:30,598 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:20:35,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:20:39,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:20:39,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:20:39,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:20:44,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 16:20:44,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 16:20:44,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 16:20:44,766 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 16:20:46,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:20:55,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 16:20:55,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:20:55,216 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 16:20:55,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:20:55,858 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=19.92 vs. limit=22.5 2023-09-29 16:20:58,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:20:58,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:21:00,233 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:21:03,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:21:03,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:21:03,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:21:04,939 INFO [train.py:1039] (2/4) Epoch 12, batch 4700, loss[loss=0.1764, simple_loss=0.2505, pruned_loss=0.05116, over 21511.00 frames. ], tot_loss[loss=0.1959, simple_loss=0.2677, pruned_loss=0.06209, over 4717639.11 frames. ], batch size: 47, lr: 8.51e-03, grad_scale: 8.0 2023-09-29 16:21:05,214 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=420886.6666666667, ans=0.0 2023-09-29 16:21:05,924 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.64 vs. limit=15.0 2023-09-29 16:21:09,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:21:09,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:21:09,898 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=420886.6666666667, ans=0.1 2023-09-29 16:21:10,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 16:21:11,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 16:21:11,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 16:21:12,757 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 16:21:14,005 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.531e+02 1.872e+02 2.064e+02 2.331e+02 3.087e+02, threshold=4.129e+02, percent-clipped=0.0 2023-09-29 16:21:20,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:21:21,081 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:21:23,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:21:23,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:21:24,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 16:21:29,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 16:21:29,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 16:21:33,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:21:33,326 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:21:33,654 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=420953.3333333333, ans=10.0 2023-09-29 16:21:34,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:21:36,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=421020.0, ans=10.0 2023-09-29 16:21:37,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:21:41,323 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=421020.0, ans=0.0 2023-09-29 16:21:42,689 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=421020.0, ans=0.0 2023-09-29 16:21:45,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 16:21:45,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 16:21:48,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:21:55,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 16:21:57,065 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:22:00,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:22:04,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 16:22:04,884 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:22:09,427 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:22:09,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 16:22:11,624 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.17 vs. limit=10.0 2023-09-29 16:22:12,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:22:12,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:22:14,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:22:15,554 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:22:15,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 16:22:17,095 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 16:22:18,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:22:20,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:22:20,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:22:20,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 16:22:21,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:22:25,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 16:22:27,036 INFO [train.py:1039] (2/4) Epoch 12, batch 4750, loss[loss=0.2123, simple_loss=0.2802, pruned_loss=0.07224, over 23406.00 frames. ], tot_loss[loss=0.1963, simple_loss=0.268, pruned_loss=0.06231, over 4711370.48 frames. ], batch size: 93, lr: 8.51e-03, grad_scale: 8.0 2023-09-29 16:22:28,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:22:30,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:22:34,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:22:34,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:22:35,097 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=421220.0, ans=0.125 2023-09-29 16:22:36,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 16:22:37,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:22:41,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 16:22:42,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:22:44,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:22:44,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:22:49,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 16:22:53,928 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:22:55,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 16:22:55,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:23:01,674 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=5.01 vs. limit=15.0 2023-09-29 16:23:02,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:23:02,255 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:23:02,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:23:04,419 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 16:23:04,424 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 16:23:10,063 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=421353.3333333333, ans=0.04949747468305833 2023-09-29 16:23:11,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 16:23:14,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:23:14,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:23:16,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 16:23:16,596 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 16:23:16,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:23:19,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 16:23:22,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:23:24,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 16:23:24,783 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=421420.0, ans=0.0 2023-09-29 16:23:25,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 16:23:26,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:23:27,457 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:23:27,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:23:29,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 16:23:29,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 16:23:33,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 16:23:36,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:23:40,005 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:23:40,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 16:23:40,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:23:41,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:23:43,600 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:23:45,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:23:45,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 16:23:46,549 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.97 vs. limit=15.0 2023-09-29 16:23:50,772 INFO [train.py:1039] (2/4) Epoch 12, batch 4800, loss[loss=0.1813, simple_loss=0.2627, pruned_loss=0.04995, over 24463.00 frames. ], tot_loss[loss=0.1983, simple_loss=0.2694, pruned_loss=0.06355, over 4709084.16 frames. ], batch size: 66, lr: 8.50e-03, grad_scale: 16.0 2023-09-29 16:23:50,905 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:23:50,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 16:23:52,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 16:23:52,491 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 16:23:55,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 16:23:55,578 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:23:57,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 16:23:59,997 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 2.054e+02 2.346e+02 2.832e+02 5.942e+02, threshold=4.692e+02, percent-clipped=5.0 2023-09-29 16:24:03,099 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:24:04,548 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:24:04,982 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=421620.0, ans=0.125 2023-09-29 16:24:07,870 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:24:08,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:24:08,207 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=421620.0, ans=0.2 2023-09-29 16:24:09,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:24:10,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 16:24:10,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:24:11,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:24:11,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:24:17,126 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:24:18,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:24:20,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:24:20,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:24:21,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 16:24:21,605 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:24:23,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:24:25,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:24:26,999 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=421686.6666666667, ans=0.0 2023-09-29 16:24:28,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:24:29,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:24:30,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:24:31,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 16:24:34,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:24:34,821 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=421686.6666666667, ans=0.0 2023-09-29 16:24:36,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 16:24:36,066 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 16:24:36,185 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:24:36,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:24:37,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:24:37,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:24:37,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 16:24:39,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:24:39,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:24:41,156 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=421753.3333333333, ans=0.0 2023-09-29 16:24:43,177 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:24:46,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:24:48,321 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:24:53,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 16:24:53,313 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:24:55,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:24:55,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 16:24:56,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:25:01,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:25:01,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:25:01,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:25:01,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:25:02,125 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.80 vs. limit=22.5 2023-09-29 16:25:02,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:25:02,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:25:04,654 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=421820.0, ans=0.0 2023-09-29 16:25:07,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:25:08,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:25:08,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:25:10,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 16:25:11,794 INFO [train.py:1039] (2/4) Epoch 12, batch 4850, loss[loss=0.2097, simple_loss=0.2965, pruned_loss=0.0615, over 24643.00 frames. ], tot_loss[loss=0.1987, simple_loss=0.2701, pruned_loss=0.06367, over 4715383.51 frames. ], batch size: 68, lr: 8.50e-03, grad_scale: 16.0 2023-09-29 16:25:12,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 16:25:12,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:25:12,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:25:13,662 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:25:13,664 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:25:16,691 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:25:16,967 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=421886.6666666667, ans=0.0 2023-09-29 16:25:24,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 16:25:24,636 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=421886.6666666667, ans=0.0 2023-09-29 16:25:27,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:25:33,060 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:25:33,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 16:25:34,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:25:37,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:25:39,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 16:25:40,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:25:42,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 16:25:45,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:25:48,261 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:25:48,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 16:25:48,395 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 16:25:48,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 16:25:50,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:25:50,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:25:55,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:25:55,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 16:25:55,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 16:25:57,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 16:26:06,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:26:07,670 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 16:26:09,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:26:09,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:26:12,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 16:26:12,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 16:26:12,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:26:13,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 16:26:13,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:26:15,407 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:26:16,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 16:26:24,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:26:30,125 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:26:32,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:26:32,895 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=19.30 vs. limit=15.0 2023-09-29 16:26:34,910 INFO [train.py:1039] (2/4) Epoch 12, batch 4900, loss[loss=0.1773, simple_loss=0.2514, pruned_loss=0.05164, over 23463.00 frames. ], tot_loss[loss=0.1977, simple_loss=0.2689, pruned_loss=0.0633, over 4699037.61 frames. ], batch size: 119, lr: 8.50e-03, grad_scale: 16.0 2023-09-29 16:26:38,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 16:26:38,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:26:43,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:26:43,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:26:44,648 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.724e+02 2.050e+02 2.285e+02 2.620e+02 3.714e+02, threshold=4.569e+02, percent-clipped=0.0 2023-09-29 16:26:44,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:26:48,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 16:26:52,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 16:26:57,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 16:26:58,198 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.31 vs. limit=15.0 2023-09-29 16:26:58,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 16:26:58,947 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:26:58,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:27:00,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:27:00,378 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:27:00,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:27:00,497 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 16:27:05,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 16:27:06,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 16:27:08,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:27:08,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:27:08,926 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.50 vs. limit=15.0 2023-09-29 16:27:12,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:27:12,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:27:12,301 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=422353.3333333333, ans=0.2 2023-09-29 16:27:13,652 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:27:13,666 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 16:27:15,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 16:27:16,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:27:18,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 16:27:18,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 16:27:21,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 16:27:22,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 16:27:23,047 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=422420.0, ans=0.125 2023-09-29 16:27:24,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:27:24,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:27:25,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:27:25,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 16:27:25,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:27:25,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 16:27:26,792 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten.whitening_limit, batch_count=422420.0, ans=15.0 2023-09-29 16:27:28,963 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:27:31,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 16:27:33,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:27:36,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 16:27:38,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:27:39,039 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 16:27:39,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 16:27:39,279 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=422486.6666666667, ans=0.0 2023-09-29 16:27:46,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:27:48,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:27:49,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 16:27:49,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 16:27:49,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:27:49,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:27:50,309 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=422486.6666666667, ans=0.125 2023-09-29 16:27:54,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:27:54,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:27:54,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:27:54,610 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 16:27:57,521 INFO [train.py:1039] (2/4) Epoch 12, batch 4950, loss[loss=0.2007, simple_loss=0.2875, pruned_loss=0.05693, over 24438.00 frames. ], tot_loss[loss=0.1952, simple_loss=0.2659, pruned_loss=0.06228, over 4699912.75 frames. ], batch size: 69, lr: 8.49e-03, grad_scale: 8.0 2023-09-29 16:27:57,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 16:27:59,255 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:28:00,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 16:28:03,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 16:28:03,811 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 16:28:05,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 16:28:06,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 16:28:06,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:28:06,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:28:06,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 16:28:06,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:28:07,699 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=422553.3333333333, ans=0.125 2023-09-29 16:28:08,958 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:28:11,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:28:12,661 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:28:12,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:28:15,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:28:15,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:28:18,437 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=422620.0, ans=0.0 2023-09-29 16:28:19,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 16:28:25,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:28:27,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:28:30,765 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:28:30,853 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:28:32,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:28:33,959 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 16:28:35,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 16:28:38,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:28:41,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:28:41,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:28:43,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:28:43,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:28:43,641 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 16:28:45,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:28:46,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 16:28:50,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:28:52,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:28:53,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:28:54,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 16:28:54,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:28:56,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 16:28:59,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:29:00,139 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.29 vs. limit=12.0 2023-09-29 16:29:00,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:29:00,783 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:29:00,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:29:01,112 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=422753.3333333333, ans=0.025 2023-09-29 16:29:02,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:29:02,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:29:05,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:29:05,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 16:29:06,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:29:08,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 16:29:11,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:29:11,859 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=422820.0, ans=0.07 2023-09-29 16:29:16,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 16:29:18,324 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 16:29:20,389 INFO [train.py:1039] (2/4) Epoch 12, batch 5000, loss[loss=0.1883, simple_loss=0.2584, pruned_loss=0.05905, over 23484.00 frames. ], tot_loss[loss=0.1948, simple_loss=0.2657, pruned_loss=0.06197, over 4715011.81 frames. ], batch size: 134, lr: 8.49e-03, grad_scale: 8.0 2023-09-29 16:29:26,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:29:26,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 16:29:27,960 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 16:29:28,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 16:29:31,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:29:32,531 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.648e+02 1.922e+02 2.238e+02 2.801e+02 3.922e+02, threshold=4.477e+02, percent-clipped=0.0 2023-09-29 16:29:32,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 16:29:32,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:29:32,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 16:29:34,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 16:29:35,860 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:29:35,961 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:29:36,161 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=422953.3333333333, ans=0.125 2023-09-29 16:29:37,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 16:29:37,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:29:37,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:29:39,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 16:29:40,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 16:29:40,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:29:40,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 16:29:40,610 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 16:29:42,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:29:42,220 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 16:29:42,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 16:29:42,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 16:29:43,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 16:29:45,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:29:45,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:29:46,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 16:29:46,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 16:29:49,986 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:29:51,530 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:29:54,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 16:29:57,068 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 16:29:57,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:29:59,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:30:03,849 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 16:30:06,842 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:30:08,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:30:08,402 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:30:11,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 16:30:11,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:30:12,960 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:30:13,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:30:13,957 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.33 vs. limit=15.0 2023-09-29 16:30:14,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 16:30:14,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:30:17,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:30:17,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:30:25,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 16:30:25,720 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=423153.3333333333, ans=0.125 2023-09-29 16:30:31,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:30:40,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:30:41,610 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:30:41,633 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:30:41,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:30:41,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 16:30:43,039 INFO [train.py:1039] (2/4) Epoch 12, batch 5050, loss[loss=0.2164, simple_loss=0.2753, pruned_loss=0.07872, over 22802.00 frames. ], tot_loss[loss=0.195, simple_loss=0.2664, pruned_loss=0.06186, over 4705006.79 frames. ], batch size: 322, lr: 8.49e-03, grad_scale: 8.0 2023-09-29 16:30:43,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 16:30:43,188 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:30:47,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:30:47,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 16:30:49,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:30:51,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:30:52,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:30:54,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 16:30:54,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:30:55,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:30:57,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 16:30:58,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:30:58,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 16:31:09,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 16:31:09,707 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 16:31:11,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:31:11,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 16:31:12,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:31:14,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:31:15,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:31:17,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:31:17,165 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 16:31:18,609 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 16:31:20,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:31:21,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:31:22,006 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=423353.3333333333, ans=0.0 2023-09-29 16:31:22,619 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.93 vs. limit=22.5 2023-09-29 16:31:23,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:31:24,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 16:31:26,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:31:29,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 16:31:32,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 16:31:32,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:31:34,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:31:35,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:31:35,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:31:39,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:31:39,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:31:39,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:31:41,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:31:41,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 16:31:41,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:31:43,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:31:46,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:31:46,200 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 16:31:46,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 16:31:46,531 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=423486.6666666667, ans=0.125 2023-09-29 16:31:47,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:31:49,257 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:31:49,295 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 16:31:52,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:31:52,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 16:31:52,295 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:31:54,203 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=423486.6666666667, ans=0.125 2023-09-29 16:31:56,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:31:56,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:31:58,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 16:31:58,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 16:32:01,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:32:01,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:32:01,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:32:03,310 INFO [train.py:1039] (2/4) Epoch 12, batch 5100, loss[loss=0.199, simple_loss=0.2864, pruned_loss=0.05581, over 24574.00 frames. ], tot_loss[loss=0.1953, simple_loss=0.2674, pruned_loss=0.06162, over 4725740.92 frames. ], batch size: 71, lr: 8.48e-03, grad_scale: 8.0 2023-09-29 16:32:04,930 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 16:32:06,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:32:06,876 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=423553.3333333333, ans=0.125 2023-09-29 16:32:10,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 16:32:10,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 16:32:12,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:32:15,484 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.497e+02 1.923e+02 2.120e+02 2.504e+02 4.528e+02, threshold=4.241e+02, percent-clipped=1.0 2023-09-29 16:32:15,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:32:18,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:32:20,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 16:32:20,178 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 16:32:25,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:32:25,138 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:32:25,414 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=423620.0, ans=0.125 2023-09-29 16:32:28,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:32:30,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 16:32:30,747 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.09 vs. limit=15.0 2023-09-29 16:32:31,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:32:33,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:32:33,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 16:32:36,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:32:37,618 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:32:37,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 16:32:39,302 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 16:32:41,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:32:42,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 16:32:42,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 16:32:47,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:32:56,835 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:32:58,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 16:32:58,517 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 16:32:58,540 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 16:33:00,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 16:33:00,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:33:03,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 16:33:07,643 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 16:33:10,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 16:33:12,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:33:16,433 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 16:33:18,625 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 16:33:18,696 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 16:33:24,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:33:24,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:33:24,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:33:25,084 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=423886.6666666667, ans=0.0 2023-09-29 16:33:26,690 INFO [train.py:1039] (2/4) Epoch 12, batch 5150, loss[loss=0.2053, simple_loss=0.2746, pruned_loss=0.06797, over 23145.00 frames. ], tot_loss[loss=0.1964, simple_loss=0.2685, pruned_loss=0.06218, over 4710250.81 frames. ], batch size: 105, lr: 8.48e-03, grad_scale: 8.0 2023-09-29 16:33:26,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:33:26,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 16:33:28,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:33:29,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 16:33:29,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 16:33:31,328 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 16:33:31,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:33:31,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 16:33:32,862 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:33:32,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 16:33:34,593 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:33:36,020 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:33:41,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 16:33:41,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 16:33:42,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:33:44,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 16:33:45,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 16:33:45,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:33:45,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:33:45,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:33:45,880 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:33:47,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 16:33:49,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:33:49,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:33:52,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 16:33:55,300 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn1.whiten.whitening_limit, batch_count=423953.3333333333, ans=22.5 2023-09-29 16:33:55,798 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 16:33:55,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:34:03,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 16:34:05,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 16:34:08,377 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:34:15,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:34:17,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:34:22,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:34:22,081 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:34:23,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 16:34:28,861 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:34:28,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:34:30,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:34:33,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:34:35,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:34:35,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 16:34:43,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:34:43,550 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 16:34:45,197 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:34:45,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:34:46,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 16:34:46,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 16:34:46,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:34:48,044 INFO [train.py:1039] (2/4) Epoch 12, batch 5200, loss[loss=0.1778, simple_loss=0.2573, pruned_loss=0.04909, over 24454.00 frames. ], tot_loss[loss=0.1983, simple_loss=0.2698, pruned_loss=0.06337, over 4708817.07 frames. ], batch size: 63, lr: 8.48e-03, grad_scale: 16.0 2023-09-29 16:34:48,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:34:51,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:34:54,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:34:55,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:34:58,797 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 1.937e+02 2.192e+02 2.501e+02 3.290e+02, threshold=4.383e+02, percent-clipped=0.0 2023-09-29 16:34:59,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 16:35:00,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:35:01,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:35:04,460 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=424286.6666666667, ans=0.0 2023-09-29 16:35:05,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:35:05,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:35:05,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:35:08,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 16:35:11,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 16:35:12,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:35:15,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 16:35:15,989 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.min_positive, batch_count=424286.6666666667, ans=0.025 2023-09-29 16:35:17,563 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=424286.6666666667, ans=0.2 2023-09-29 16:35:18,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 16:35:18,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:35:20,360 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 16:35:20,421 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 16:35:23,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 16:35:23,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:35:23,668 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 16:35:23,679 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:35:25,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:35:25,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:35:26,111 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.95 vs. limit=15.0 2023-09-29 16:35:26,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 16:35:26,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:35:29,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:35:32,851 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 16:35:32,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 16:35:34,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 16:35:38,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 16:35:40,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 16:35:40,707 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=424420.0, ans=0.1 2023-09-29 16:35:45,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:35:45,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:35:48,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 16:35:48,664 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:35:48,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 16:35:48,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:35:50,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 16:35:54,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:35:56,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:35:57,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:35:58,187 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=424486.6666666667, ans=0.1 2023-09-29 16:35:59,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:35:59,453 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:36:05,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:36:06,876 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 16:36:06,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:36:07,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:36:08,356 INFO [train.py:1039] (2/4) Epoch 12, batch 5250, loss[loss=0.2138, simple_loss=0.2848, pruned_loss=0.07142, over 24317.00 frames. ], tot_loss[loss=0.1973, simple_loss=0.2686, pruned_loss=0.063, over 4714679.44 frames. ], batch size: 61, lr: 8.47e-03, grad_scale: 16.0 2023-09-29 16:36:08,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:36:09,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 16:36:11,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:36:15,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:36:17,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:36:17,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:36:18,990 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:36:25,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:36:26,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:36:26,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:36:28,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 16:36:30,114 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=424620.0, ans=0.125 2023-09-29 16:36:31,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 16:36:31,363 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:36:32,930 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:36:49,265 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:36:55,182 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=424753.3333333333, ans=0.1 2023-09-29 16:37:19,763 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=424820.0, ans=0.0 2023-09-29 16:37:21,069 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=424820.0, ans=0.125 2023-09-29 16:37:22,580 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=424886.6666666667, ans=0.125 2023-09-29 16:37:23,494 INFO [train.py:1039] (2/4) Epoch 12, batch 5300, loss[loss=0.1851, simple_loss=0.2412, pruned_loss=0.06455, over 23587.00 frames. ], tot_loss[loss=0.1965, simple_loss=0.2672, pruned_loss=0.0629, over 4716084.99 frames. ], batch size: 256, lr: 8.47e-03, grad_scale: 16.0 2023-09-29 16:37:28,705 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.16 vs. limit=10.0 2023-09-29 16:37:33,065 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.629e+02 1.874e+02 2.092e+02 2.441e+02 3.524e+02, threshold=4.184e+02, percent-clipped=0.0 2023-09-29 16:37:38,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:37:38,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 16:37:38,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 16:37:38,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:37:38,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:37:38,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:37:38,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:37:38,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:37:38,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:37:39,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:37:39,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 16:37:40,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:37:40,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 16:37:40,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 16:37:40,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 16:37:40,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 16:37:40,592 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 16:37:40,722 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 16:37:40,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:37:41,411 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:37:41,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:37:41,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:37:41,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:37:42,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:37:42,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:37:42,403 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:37:42,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:37:42,596 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:37:42,603 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:37:42,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:37:42,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:37:44,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 16:37:44,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:37:44,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:37:44,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 16:37:44,806 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 16:37:44,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:37:45,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:37:45,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 16:37:45,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 16:37:45,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 16:37:46,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:37:46,293 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:37:46,446 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 16:37:46,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 16:37:46,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:37:46,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:37:46,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 16:37:46,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 16:37:47,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 16:37:47,378 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 16:37:57,152 INFO [train.py:1039] (2/4) Epoch 13, batch 0, loss[loss=0.2019, simple_loss=0.2843, pruned_loss=0.05969, over 24658.00 frames. ], tot_loss[loss=0.2019, simple_loss=0.2843, pruned_loss=0.05969, over 24658.00 frames. ], batch size: 68, lr: 8.14e-03, grad_scale: 32.0 2023-09-29 16:37:57,152 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-29 16:38:10,938 INFO [train.py:1071] (2/4) Epoch 13, validation: loss=0.2695, simple_loss=0.2756, pruned_loss=0.1317, over 1125622.00 frames. 2023-09-29 16:38:10,939 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21046MB 2023-09-29 16:38:12,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 16:38:13,029 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=424966.6666666667, ans=0.0 2023-09-29 16:38:13,041 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=424966.6666666667, ans=0.0 2023-09-29 16:38:14,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:38:15,918 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:38:20,530 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:38:20,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:38:22,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:38:22,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 16:38:24,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 16:38:27,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:38:28,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:38:33,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:38:33,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:38:35,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 16:38:35,335 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:38:36,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 16:38:38,444 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:38:45,949 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:38:45,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:38:48,862 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 16:38:52,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:38:52,222 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:38:53,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:38:54,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=425100.0, ans=0.125 2023-09-29 16:38:59,849 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:39:04,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:39:10,509 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=425166.6666666667, ans=0.0 2023-09-29 16:39:11,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 16:39:14,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 16:39:14,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:39:14,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:39:15,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:39:16,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:39:18,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 16:39:19,700 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=425233.3333333333, ans=0.125 2023-09-29 16:39:22,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:39:24,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:39:27,306 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:39:30,515 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 16:39:32,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:39:34,214 INFO [train.py:1039] (2/4) Epoch 13, batch 50, loss[loss=0.1732, simple_loss=0.2556, pruned_loss=0.0454, over 24576.00 frames. ], tot_loss[loss=0.196, simple_loss=0.2692, pruned_loss=0.06137, over 1070845.65 frames. ], batch size: 60, lr: 8.14e-03, grad_scale: 32.0 2023-09-29 16:39:35,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:39:36,262 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=425300.0, ans=0.2 2023-09-29 16:39:38,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:39:38,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 16:39:39,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 16:39:39,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:39:40,224 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.24 vs. limit=15.0 2023-09-29 16:39:44,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:39:46,357 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:39:48,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:39:51,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 16:39:51,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:39:58,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 16:40:00,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 16:40:02,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 16:40:04,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:40:05,786 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.43 vs. limit=15.0 2023-09-29 16:40:06,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:40:06,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:40:06,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:40:06,984 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=425433.3333333333, ans=0.0 2023-09-29 16:40:08,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 16:40:10,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 16:40:10,185 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:40:17,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:40:18,666 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:40:18,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 16:40:20,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 16:40:23,498 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:40:24,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 16:40:24,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 16:40:25,223 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=425500.0, ans=0.125 2023-09-29 16:40:26,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:40:27,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 16:40:35,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:40:35,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:40:38,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:40:40,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:40:40,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 16:40:40,731 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=425566.6666666667, ans=0.0 2023-09-29 16:40:43,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 16:40:43,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 16:40:45,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:40:46,814 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.884e+02 2.162e+02 2.621e+02 5.674e+02, threshold=4.324e+02, percent-clipped=3.0 2023-09-29 16:40:46,938 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 16:40:48,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:40:48,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:40:48,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 16:40:50,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 16:40:50,325 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 16:40:53,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:40:53,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:40:54,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 16:40:54,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 16:40:54,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:40:55,943 INFO [train.py:1039] (2/4) Epoch 13, batch 100, loss[loss=0.2024, simple_loss=0.2821, pruned_loss=0.06133, over 24146.00 frames. ], tot_loss[loss=0.1945, simple_loss=0.2683, pruned_loss=0.06032, over 1898933.77 frames. ], batch size: 80, lr: 8.13e-03, grad_scale: 32.0 2023-09-29 16:40:56,084 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:40:57,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 16:40:57,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:41:01,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:41:04,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:41:07,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:41:08,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 16:41:08,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:41:10,881 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=425700.0, ans=0.0 2023-09-29 16:41:13,750 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:41:13,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:41:13,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:41:13,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:41:13,865 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:41:14,213 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=425700.0, ans=0.0 2023-09-29 16:41:15,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 16:41:15,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:41:15,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:41:17,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:41:17,083 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:41:20,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 16:41:21,257 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=425700.0, ans=0.125 2023-09-29 16:41:22,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:41:24,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:41:25,953 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 16:41:28,316 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=425766.6666666667, ans=0.2 2023-09-29 16:41:29,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 16:41:32,689 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 16:41:32,714 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 16:41:34,825 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:41:34,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:41:39,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 16:41:42,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:41:42,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:41:45,663 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=425833.3333333333, ans=0.0 2023-09-29 16:41:48,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:41:49,748 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 16:41:51,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 16:41:53,653 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=425833.3333333333, ans=0.2 2023-09-29 16:41:56,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 16:41:56,608 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=425833.3333333333, ans=0.1 2023-09-29 16:41:57,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:42:01,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:42:04,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:42:06,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:42:08,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:42:10,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:42:12,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:42:14,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:42:14,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:42:14,934 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:42:16,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 16:42:16,364 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 16:42:17,684 INFO [train.py:1039] (2/4) Epoch 13, batch 150, loss[loss=0.1835, simple_loss=0.2651, pruned_loss=0.05093, over 24463.00 frames. ], tot_loss[loss=0.1951, simple_loss=0.269, pruned_loss=0.06055, over 2536684.37 frames. ], batch size: 66, lr: 8.13e-03, grad_scale: 32.0 2023-09-29 16:42:17,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:42:19,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:42:19,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:42:19,350 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:42:19,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 16:42:19,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 16:42:19,475 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 16:42:19,485 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:42:19,680 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=425966.6666666667, ans=0.125 2023-09-29 16:42:21,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:42:21,199 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=425966.6666666667, ans=0.1 2023-09-29 16:42:22,559 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:42:22,874 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=425966.6666666667, ans=0.125 2023-09-29 16:42:24,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:42:24,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:42:27,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:42:29,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:42:29,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:42:29,765 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=425966.6666666667, ans=0.1 2023-09-29 16:42:31,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:42:34,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:42:34,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:42:37,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 16:42:38,000 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:42:43,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 16:42:43,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 16:42:43,200 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 16:42:46,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:42:46,717 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 16:42:48,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:42:48,352 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:42:49,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:42:49,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:42:51,311 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:42:52,840 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-29 16:42:54,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:42:59,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:43:02,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:43:04,496 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 16:43:09,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:43:09,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:43:09,685 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:43:11,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:43:12,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:43:14,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 16:43:16,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:43:16,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 16:43:22,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:43:24,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:43:24,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:43:24,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:43:27,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:43:27,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 16:43:30,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:43:32,029 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.958e+02 2.151e+02 2.617e+02 4.145e+02, threshold=4.302e+02, percent-clipped=0.0 2023-09-29 16:43:32,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:43:33,706 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:43:35,274 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 16:43:35,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 16:43:35,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:43:37,431 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 16:43:40,253 INFO [train.py:1039] (2/4) Epoch 13, batch 200, loss[loss=0.2474, simple_loss=0.3003, pruned_loss=0.09723, over 22654.00 frames. ], tot_loss[loss=0.197, simple_loss=0.2702, pruned_loss=0.06194, over 3023133.60 frames. ], batch size: 322, lr: 8.13e-03, grad_scale: 32.0 2023-09-29 16:43:42,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:43:45,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:43:45,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:43:49,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 16:43:51,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:43:51,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:43:53,702 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 16:43:55,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 16:43:55,456 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=426366.6666666667, ans=0.2 2023-09-29 16:43:56,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:43:58,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:43:59,202 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=426366.6666666667, ans=0.1 2023-09-29 16:44:03,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:44:03,515 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:44:05,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:44:05,410 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=426366.6666666667, ans=0.025 2023-09-29 16:44:18,537 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=426433.3333333333, ans=0.5 2023-09-29 16:44:23,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:44:24,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:44:24,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:44:26,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:44:26,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 16:44:26,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 16:44:28,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:44:28,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:44:28,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:44:28,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:44:32,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 16:44:32,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 16:44:32,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:44:39,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:44:44,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:44:51,264 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:44:51,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:44:58,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:45:00,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 16:45:01,856 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:45:01,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 16:45:01,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:45:03,243 INFO [train.py:1039] (2/4) Epoch 13, batch 250, loss[loss=0.2048, simple_loss=0.2799, pruned_loss=0.06484, over 24449.00 frames. ], tot_loss[loss=0.1959, simple_loss=0.2691, pruned_loss=0.06134, over 3400853.29 frames. ], batch size: 63, lr: 8.12e-03, grad_scale: 32.0 2023-09-29 16:45:03,344 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:45:03,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 16:45:05,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:45:05,680 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 16:45:08,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:45:10,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:45:10,207 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:45:15,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:45:16,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:45:17,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:45:18,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:45:23,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:45:27,029 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=426700.0, ans=0.5 2023-09-29 16:45:29,302 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=426700.0, ans=0.1 2023-09-29 16:45:35,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:45:37,344 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:45:37,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:45:42,131 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:45:45,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 16:45:46,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 16:45:48,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:45:48,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:45:48,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 16:45:48,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:45:50,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:45:50,881 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.08 vs. limit=15.0 2023-09-29 16:45:51,918 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=426766.6666666667, ans=0.125 2023-09-29 16:45:53,251 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:45:56,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 16:45:56,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:45:57,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:45:57,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 16:45:57,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:45:59,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:46:01,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:46:01,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 16:46:04,916 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:46:05,367 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=426833.3333333333, ans=0.0 2023-09-29 16:46:06,528 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:46:06,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:46:07,028 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=426833.3333333333, ans=0.125 2023-09-29 16:46:09,891 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:46:12,271 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_positive, batch_count=426900.0, ans=0.05 2023-09-29 16:46:13,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:46:15,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:46:16,908 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=426900.0, ans=0.0 2023-09-29 16:46:20,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:46:21,730 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.855e+02 2.073e+02 2.477e+02 4.320e+02, threshold=4.145e+02, percent-clipped=1.0 2023-09-29 16:46:23,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:46:26,636 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 16:46:28,105 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:46:29,458 INFO [train.py:1039] (2/4) Epoch 13, batch 300, loss[loss=0.1888, simple_loss=0.2681, pruned_loss=0.05474, over 24438.00 frames. ], tot_loss[loss=0.1949, simple_loss=0.2672, pruned_loss=0.06133, over 3683277.05 frames. ], batch size: 69, lr: 8.12e-03, grad_scale: 32.0 2023-09-29 16:46:29,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:46:31,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 16:46:32,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 16:46:32,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:46:32,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 16:46:39,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:46:40,041 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:46:43,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:46:43,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 16:46:45,263 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:46:45,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 16:46:45,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 16:46:45,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:46:50,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:46:55,190 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:46:56,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 16:47:01,100 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 16:47:01,174 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:47:02,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:47:05,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:47:05,731 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 16:47:05,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:47:09,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:47:12,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:47:12,732 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:47:15,216 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=14.06 vs. limit=22.5 2023-09-29 16:47:17,264 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 16:47:18,785 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 16:47:18,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:47:22,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:47:23,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 16:47:25,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:47:28,791 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:47:33,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:47:33,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 16:47:36,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:47:37,003 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 16:47:40,724 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:47:42,282 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 16:47:42,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 16:47:42,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 16:47:43,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:47:45,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 16:47:46,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:47:46,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:47:48,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:47:48,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:47:50,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:47:50,860 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=427300.0, ans=0.125 2023-09-29 16:47:51,852 INFO [train.py:1039] (2/4) Epoch 13, batch 350, loss[loss=0.1979, simple_loss=0.2557, pruned_loss=0.07005, over 23916.00 frames. ], tot_loss[loss=0.1926, simple_loss=0.2648, pruned_loss=0.06015, over 3922830.18 frames. ], batch size: 195, lr: 8.12e-03, grad_scale: 32.0 2023-09-29 16:47:53,986 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=427300.0, ans=0.125 2023-09-29 16:47:55,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:47:55,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 16:47:58,675 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:48:00,376 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=427300.0, ans=0.0 2023-09-29 16:48:05,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:48:06,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:48:07,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:48:10,209 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 16:48:12,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:48:12,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 16:48:15,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:48:15,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 16:48:15,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:48:18,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 16:48:20,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:48:22,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:48:23,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:48:25,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:48:26,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:48:26,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:48:26,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:48:27,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:48:27,322 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=427433.3333333333, ans=0.2 2023-09-29 16:48:28,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:48:28,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:48:36,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:48:36,169 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 16:48:37,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:48:37,717 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:48:44,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 16:48:45,888 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:48:49,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:48:49,273 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:48:50,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:48:50,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 16:48:54,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:48:54,737 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 16:48:57,562 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 16:48:57,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:48:59,377 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=427566.6666666667, ans=0.2 2023-09-29 16:49:00,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:49:00,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 16:49:02,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:49:04,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:49:05,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:49:06,576 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.27 vs. limit=22.5 2023-09-29 16:49:07,243 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 1.930e+02 2.101e+02 2.393e+02 3.670e+02, threshold=4.202e+02, percent-clipped=0.0 2023-09-29 16:49:07,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:49:07,432 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:49:10,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:49:10,548 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=427566.6666666667, ans=0.0 2023-09-29 16:49:13,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:49:15,483 INFO [train.py:1039] (2/4) Epoch 13, batch 400, loss[loss=0.1843, simple_loss=0.2575, pruned_loss=0.05549, over 23580.00 frames. ], tot_loss[loss=0.1926, simple_loss=0.2648, pruned_loss=0.06024, over 4088598.65 frames. ], batch size: 120, lr: 8.11e-03, grad_scale: 32.0 2023-09-29 16:49:15,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 16:49:17,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 16:49:17,873 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:49:17,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:49:19,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:49:20,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:49:24,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:49:26,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:49:28,981 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 16:49:30,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 16:49:30,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:49:32,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 16:49:32,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:49:38,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:49:38,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:49:38,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 16:49:39,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:49:39,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:49:39,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:49:40,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:49:43,540 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 16:49:43,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 16:49:45,530 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=427700.0, ans=0.125 2023-09-29 16:49:48,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:49:50,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:49:50,622 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=427766.6666666667, ans=0.125 2023-09-29 16:49:52,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 16:49:52,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 16:49:55,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:50:00,664 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:50:07,035 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 16:50:08,711 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 16:50:11,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 16:50:13,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:50:15,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:50:15,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 16:50:18,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:50:21,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 16:50:23,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:50:25,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:50:27,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 16:50:28,907 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 16:50:30,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 16:50:31,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:50:31,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:50:33,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 16:50:36,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 16:50:37,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:50:37,279 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=427966.6666666667, ans=0.1 2023-09-29 16:50:38,460 INFO [train.py:1039] (2/4) Epoch 13, batch 450, loss[loss=0.1868, simple_loss=0.2542, pruned_loss=0.05972, over 23687.00 frames. ], tot_loss[loss=0.1934, simple_loss=0.2657, pruned_loss=0.0605, over 4233689.10 frames. ], batch size: 149, lr: 8.11e-03, grad_scale: 32.0 2023-09-29 16:50:38,542 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 16:50:40,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 16:50:40,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:50:40,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:50:41,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 16:50:43,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 16:50:43,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:50:43,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 16:50:46,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:50:58,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:50:58,439 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:51:00,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 16:51:01,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 16:51:03,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 16:51:08,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:51:09,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:51:13,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:51:13,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:51:18,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 16:51:18,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 16:51:21,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 16:51:21,333 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:51:22,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:51:23,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 16:51:23,735 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=428100.0, ans=0.125 2023-09-29 16:51:25,115 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 16:51:25,138 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 16:51:26,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:51:26,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:51:28,260 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 16:51:31,762 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 16:51:31,818 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 16:51:31,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 16:51:33,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 16:51:36,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:51:37,232 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:51:39,954 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 16:51:40,013 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 16:51:40,313 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=428166.6666666667, ans=0.0 2023-09-29 16:51:41,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 16:51:45,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:51:46,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 16:51:48,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 16:51:49,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:51:52,576 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.942e+02 2.289e+02 2.754e+02 3.873e+02, threshold=4.578e+02, percent-clipped=0.0 2023-09-29 16:51:55,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:51:58,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:52:00,615 INFO [train.py:1039] (2/4) Epoch 13, batch 500, loss[loss=0.1821, simple_loss=0.2589, pruned_loss=0.05271, over 24568.00 frames. ], tot_loss[loss=0.1943, simple_loss=0.2667, pruned_loss=0.0609, over 4339245.24 frames. ], batch size: 60, lr: 8.11e-03, grad_scale: 16.0 2023-09-29 16:52:00,721 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:52:00,770 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 16:52:03,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:52:04,049 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=428300.0, ans=0.125 2023-09-29 16:52:05,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:52:05,420 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:52:05,436 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 16:52:07,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 16:52:07,667 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:52:10,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 16:52:15,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 16:52:17,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:52:20,468 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:52:20,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:52:20,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:52:33,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:52:33,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 16:52:33,420 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=428433.3333333333, ans=0.0 2023-09-29 16:52:34,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:52:34,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:52:34,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 16:52:35,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:52:38,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:52:39,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:52:39,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:52:39,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:52:40,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 16:52:43,569 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 16:52:49,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:52:50,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:52:51,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:52:51,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:52:53,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 16:52:54,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 16:52:58,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:53:00,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:53:03,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:53:03,552 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=428500.0, ans=0.0 2023-09-29 16:53:06,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:53:12,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:53:13,399 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=428566.6666666667, ans=0.125 2023-09-29 16:53:14,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 16:53:14,674 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:53:14,693 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:53:17,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 16:53:19,846 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 16:53:20,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:53:22,746 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.69 vs. limit=15.0 2023-09-29 16:53:23,489 INFO [train.py:1039] (2/4) Epoch 13, batch 550, loss[loss=0.204, simple_loss=0.2628, pruned_loss=0.07265, over 23715.00 frames. ], tot_loss[loss=0.1961, simple_loss=0.2678, pruned_loss=0.06215, over 4406301.84 frames. ], batch size: 232, lr: 8.11e-03, grad_scale: 16.0 2023-09-29 16:53:25,253 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=428633.3333333333, ans=0.2 2023-09-29 16:53:26,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 16:53:28,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 16:53:30,010 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:53:30,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 16:53:31,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:53:31,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:53:31,700 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:53:33,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:53:33,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:53:33,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:53:36,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:53:37,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 16:53:39,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:53:43,949 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:53:43,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:53:47,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:53:47,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:53:52,382 WARNING [train.py:1197] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 16:53:54,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 16:53:54,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:54:03,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:54:03,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 16:54:03,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 16:54:07,926 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:54:07,945 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 16:54:08,063 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:54:09,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 16:54:12,457 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 16:54:12,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 16:54:12,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 16:54:14,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:54:15,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 16:54:15,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 16:54:17,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:54:19,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:54:19,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:54:19,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:54:20,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:54:22,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:54:25,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:54:25,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:54:25,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 16:54:27,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:54:29,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:54:31,445 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:54:32,819 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:54:34,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 16:54:34,390 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 16:54:36,648 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=428900.0, ans=0.0 2023-09-29 16:54:39,421 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.595e+02 1.968e+02 2.209e+02 2.597e+02 3.344e+02, threshold=4.418e+02, percent-clipped=0.0 2023-09-29 16:54:39,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 16:54:44,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 16:54:44,497 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:54:44,780 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=428966.6666666667, ans=0.1 2023-09-29 16:54:45,774 INFO [train.py:1039] (2/4) Epoch 13, batch 600, loss[loss=0.2529, simple_loss=0.3057, pruned_loss=0.1, over 19705.00 frames. ], tot_loss[loss=0.1966, simple_loss=0.268, pruned_loss=0.06254, over 4469892.69 frames. ], batch size: 388, lr: 8.10e-03, grad_scale: 16.0 2023-09-29 16:54:45,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 16:54:45,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:54:52,057 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=22.60 vs. limit=22.5 2023-09-29 16:54:52,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:54:52,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 16:54:54,315 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 16:54:55,842 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 16:54:59,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:55:02,866 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:55:05,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 16:55:05,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:55:14,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 16:55:18,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:55:18,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:55:18,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:55:25,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:55:25,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:55:27,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:55:28,943 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=429100.0, ans=0.025 2023-09-29 16:55:33,291 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:55:33,566 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=429166.6666666667, ans=0.125 2023-09-29 16:55:39,210 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:55:39,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:55:39,229 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:55:43,913 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=429166.6666666667, ans=0.1 2023-09-29 16:55:48,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 16:55:49,298 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=429166.6666666667, ans=0.0 2023-09-29 16:55:54,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 16:55:54,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:55:58,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 16:55:58,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:56:00,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=429233.3333333333, ans=0.1 2023-09-29 16:56:01,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 16:56:03,280 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:56:03,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 16:56:08,410 INFO [train.py:1039] (2/4) Epoch 13, batch 650, loss[loss=0.1966, simple_loss=0.2718, pruned_loss=0.06069, over 24490.00 frames. ], tot_loss[loss=0.196, simple_loss=0.2677, pruned_loss=0.06215, over 4524879.38 frames. ], batch size: 66, lr: 8.10e-03, grad_scale: 16.0 2023-09-29 16:56:08,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 16:56:11,369 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:56:13,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:56:15,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:56:17,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:56:20,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 16:56:20,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:56:28,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:56:28,207 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:56:29,997 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:56:35,057 WARNING [train.py:1197] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 16:56:35,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:56:36,540 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:56:40,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:56:40,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 16:56:45,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:56:45,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:56:45,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 16:56:46,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:56:48,396 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:56:50,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:56:50,027 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 16:56:50,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:56:50,084 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:56:50,764 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.35 vs. limit=15.0 2023-09-29 16:56:52,509 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=429433.3333333333, ans=0.0 2023-09-29 16:56:55,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:56:56,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:56:56,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:56:58,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 16:56:58,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 16:56:59,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:56:59,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:57:01,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 16:57:01,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:57:02,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 16:57:03,146 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 16:57:04,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 16:57:04,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:57:04,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:57:04,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:57:06,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:57:07,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:57:14,586 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:57:14,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:57:18,147 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:57:18,486 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=429566.6666666667, ans=0.125 2023-09-29 16:57:18,589 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=429566.6666666667, ans=0.1 2023-09-29 16:57:18,939 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.98 vs. limit=22.5 2023-09-29 16:57:19,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:57:21,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 16:57:21,355 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:57:24,837 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.615e+02 2.054e+02 2.276e+02 2.735e+02 4.255e+02, threshold=4.551e+02, percent-clipped=0.0 2023-09-29 16:57:26,873 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=429566.6666666667, ans=0.0 2023-09-29 16:57:28,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 16:57:28,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:57:28,299 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:57:29,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:57:31,309 INFO [train.py:1039] (2/4) Epoch 13, batch 700, loss[loss=0.1706, simple_loss=0.2504, pruned_loss=0.0454, over 24343.00 frames. ], tot_loss[loss=0.1945, simple_loss=0.2665, pruned_loss=0.0613, over 4571022.24 frames. ], batch size: 61, lr: 8.10e-03, grad_scale: 16.0 2023-09-29 16:57:34,468 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 16:57:35,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 16:57:39,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 16:57:40,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:57:42,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:57:45,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 16:57:49,231 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:57:49,493 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=429700.0, ans=0.1 2023-09-29 16:57:50,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:57:52,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:57:56,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 16:57:56,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:58:01,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:58:04,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 16:58:04,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:58:05,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 16:58:07,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 16:58:09,442 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:58:12,115 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 16:58:12,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:58:13,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 16:58:14,392 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.95 vs. limit=6.0 2023-09-29 16:58:18,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:58:18,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 16:58:24,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:58:24,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:58:24,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 16:58:28,280 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=429833.3333333333, ans=0.125 2023-09-29 16:58:29,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:58:29,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:58:32,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:58:38,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:58:39,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=429900.0, ans=0.5 2023-09-29 16:58:40,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 16:58:43,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 16:58:43,434 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 16:58:43,641 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=429900.0, ans=0.125 2023-09-29 16:58:45,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:58:50,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:58:50,148 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:58:53,030 INFO [train.py:1039] (2/4) Epoch 13, batch 750, loss[loss=0.2145, simple_loss=0.2539, pruned_loss=0.08758, over 19290.00 frames. ], tot_loss[loss=0.1947, simple_loss=0.2662, pruned_loss=0.06158, over 4605506.11 frames. ], batch size: 388, lr: 8.09e-03, grad_scale: 16.0 2023-09-29 16:58:53,096 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:58:53,115 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 16:58:57,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 16:58:57,667 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 16:58:57,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 16:58:57,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 16:58:57,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 16:58:59,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:59:00,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 16:59:02,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:59:02,626 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=429966.6666666667, ans=0.125 2023-09-29 16:59:04,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 16:59:04,500 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=429966.6666666667, ans=0.1 2023-09-29 16:59:07,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:59:08,742 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:59:08,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 16:59:08,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:59:11,894 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:59:13,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:59:14,483 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.30 vs. limit=15.0 2023-09-29 16:59:15,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:59:16,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:59:16,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:59:16,690 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 16:59:18,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:59:19,255 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.15 vs. limit=15.0 2023-09-29 16:59:19,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:59:21,226 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:59:24,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 16:59:26,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 16:59:26,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:59:27,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 16:59:27,997 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 16:59:29,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 16:59:29,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:59:29,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 16:59:33,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:59:43,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 16:59:43,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:59:43,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 16:59:44,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:59:45,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:59:46,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 16:59:46,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 16:59:48,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 16:59:49,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:59:51,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:59:52,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 16:59:54,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:59:58,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:59:59,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:59:59,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:00:03,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 17:00:08,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 17:00:08,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:00:09,479 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.868e+02 2.099e+02 2.383e+02 3.939e+02, threshold=4.199e+02, percent-clipped=0.0 2023-09-29 17:00:09,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:00:10,406 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=5.82 vs. limit=15.0 2023-09-29 17:00:11,968 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:00:13,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:00:15,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:00:15,151 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 17:00:16,486 INFO [train.py:1039] (2/4) Epoch 13, batch 800, loss[loss=0.2048, simple_loss=0.2856, pruned_loss=0.06202, over 24383.00 frames. ], tot_loss[loss=0.1951, simple_loss=0.2669, pruned_loss=0.0617, over 4623980.86 frames. ], batch size: 77, lr: 8.09e-03, grad_scale: 32.0 2023-09-29 17:00:22,972 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=430300.0, ans=0.0 2023-09-29 17:00:24,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:00:24,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:00:25,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:00:25,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:00:25,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:00:26,167 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=430300.0, ans=0.0 2023-09-29 17:00:27,394 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:00:30,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:00:32,774 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=430366.6666666667, ans=0.125 2023-09-29 17:00:35,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:00:36,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:00:39,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 17:00:41,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:00:41,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:00:43,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:00:43,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:00:43,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 17:00:43,226 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:00:45,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 17:00:45,771 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=430366.6666666667, ans=0.0 2023-09-29 17:00:47,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:00:50,737 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:00:50,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:00:52,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:00:52,731 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=430433.3333333333, ans=0.07 2023-09-29 17:00:53,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:00:53,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:00:54,311 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=430433.3333333333, ans=0.125 2023-09-29 17:00:59,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:00:59,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:00:59,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 17:01:01,528 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 17:01:01,567 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 17:01:01,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 17:01:01,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:01:03,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:01:03,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:01:08,531 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 17:01:09,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 17:01:10,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:01:13,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:01:16,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:01:21,929 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:01:22,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 17:01:23,529 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:01:25,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 17:01:27,201 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=430566.6666666667, ans=0.125 2023-09-29 17:01:30,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:01:33,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:01:33,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 17:01:33,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:01:33,545 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=430566.6666666667, ans=0.125 2023-09-29 17:01:34,806 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:01:36,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 17:01:36,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:01:37,106 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=430633.3333333333, ans=0.125 2023-09-29 17:01:38,240 INFO [train.py:1039] (2/4) Epoch 13, batch 850, loss[loss=0.1838, simple_loss=0.2546, pruned_loss=0.05649, over 23472.00 frames. ], tot_loss[loss=0.1956, simple_loss=0.2671, pruned_loss=0.06201, over 4644762.33 frames. ], batch size: 120, lr: 8.09e-03, grad_scale: 16.0 2023-09-29 17:01:38,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:01:39,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:01:41,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 17:01:42,997 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:01:45,198 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 17:01:45,281 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 17:01:45,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 17:01:47,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:01:47,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:01:49,654 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=430633.3333333333, ans=0.125 2023-09-29 17:01:51,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:01:52,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:01:52,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 17:01:56,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:01:56,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:01:56,412 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=430700.0, ans=0.125 2023-09-29 17:01:57,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 17:02:02,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 17:02:03,690 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:02:05,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 17:02:08,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 17:02:10,766 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 17:02:11,564 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.16 vs. limit=22.5 2023-09-29 17:02:13,565 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 17:02:13,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:02:13,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:02:13,607 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 17:02:17,973 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:02:18,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:02:20,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 17:02:21,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:02:21,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:02:22,027 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=430766.6666666667, ans=0.2 2023-09-29 17:02:24,084 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:02:24,125 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 17:02:27,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:02:27,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 17:02:27,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 17:02:31,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:02:31,090 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:02:32,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:02:32,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:02:34,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:02:37,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:02:38,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 17:02:40,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:02:40,420 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=430833.3333333333, ans=0.1 2023-09-29 17:02:41,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:02:41,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 17:02:51,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 17:02:51,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:02:53,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 17:02:53,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:02:53,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:02:54,072 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=430900.0, ans=0.125 2023-09-29 17:02:57,225 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.595e+02 2.022e+02 2.303e+02 2.741e+02 5.777e+02, threshold=4.606e+02, percent-clipped=1.0 2023-09-29 17:02:57,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 17:03:01,318 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=430966.6666666667, ans=0.125 2023-09-29 17:03:02,572 INFO [train.py:1039] (2/4) Epoch 13, batch 900, loss[loss=0.2091, simple_loss=0.271, pruned_loss=0.07361, over 23789.00 frames. ], tot_loss[loss=0.1961, simple_loss=0.2678, pruned_loss=0.06219, over 4671179.49 frames. ], batch size: 179, lr: 8.08e-03, grad_scale: 16.0 2023-09-29 17:03:05,659 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:03:07,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:03:07,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 17:03:10,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:03:10,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 17:03:11,956 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 17:03:13,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:03:13,416 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:03:14,830 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 17:03:14,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:03:22,581 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.86 vs. limit=15.0 2023-09-29 17:03:26,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:03:26,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:03:26,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 17:03:28,095 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=431033.3333333333, ans=0.1 2023-09-29 17:03:29,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:03:30,169 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=431033.3333333333, ans=0.0 2023-09-29 17:03:34,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 17:03:37,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:03:42,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 17:03:42,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 17:03:43,979 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 17:03:45,342 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 17:03:48,898 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=431166.6666666667, ans=0.07 2023-09-29 17:03:53,214 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 17:03:53,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:03:53,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 17:03:59,999 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:04:01,365 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:04:03,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 17:04:03,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:04:08,458 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 17:04:10,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:04:11,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:04:12,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:04:13,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:04:15,640 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=431233.3333333333, ans=0.0 2023-09-29 17:04:18,138 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 17:04:18,211 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 17:04:19,776 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 17:04:19,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 17:04:21,436 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:04:22,922 INFO [train.py:1039] (2/4) Epoch 13, batch 950, loss[loss=0.168, simple_loss=0.2405, pruned_loss=0.04773, over 24411.00 frames. ], tot_loss[loss=0.1962, simple_loss=0.268, pruned_loss=0.06215, over 4683860.79 frames. ], batch size: 58, lr: 8.08e-03, grad_scale: 16.0 2023-09-29 17:04:24,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 17:04:29,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:04:31,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:04:33,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:04:33,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 17:04:33,459 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=431300.0, ans=0.1 2023-09-29 17:04:38,163 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 17:04:40,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:04:41,969 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:04:43,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:04:43,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:04:43,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 17:04:45,032 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 17:04:45,437 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=431366.6666666667, ans=0.125 2023-09-29 17:04:47,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:04:47,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 17:04:48,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:04:53,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:04:53,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:04:53,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:04:54,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 17:04:56,076 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=7.28 vs. limit=15.0 2023-09-29 17:04:56,643 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 17:04:58,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:05:01,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:05:07,542 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:05:07,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:05:11,353 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 17:05:13,123 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=431500.0, ans=0.0 2023-09-29 17:05:14,356 WARNING [train.py:1197] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 17:05:14,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 17:05:15,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:05:15,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:05:15,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:05:19,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 17:05:19,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:05:24,856 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:05:24,974 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:05:26,406 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 17:05:26,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:05:26,431 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:05:26,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 17:05:31,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 17:05:34,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:05:34,502 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=431566.6666666667, ans=0.1 2023-09-29 17:05:39,296 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=431566.6666666667, ans=0.1 2023-09-29 17:05:40,323 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.652e+02 2.121e+02 2.375e+02 2.805e+02 4.363e+02, threshold=4.749e+02, percent-clipped=0.0 2023-09-29 17:05:40,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:05:42,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 17:05:42,585 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 17:05:42,963 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=431566.6666666667, ans=0.1 2023-09-29 17:05:46,096 INFO [train.py:1039] (2/4) Epoch 13, batch 1000, loss[loss=0.2185, simple_loss=0.2886, pruned_loss=0.07425, over 24012.00 frames. ], tot_loss[loss=0.1956, simple_loss=0.2672, pruned_loss=0.06203, over 4694827.74 frames. ], batch size: 86, lr: 8.08e-03, grad_scale: 16.0 2023-09-29 17:05:46,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:05:48,066 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 17:05:48,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:05:54,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:05:56,192 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 17:05:56,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 17:06:01,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:06:01,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:06:03,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:06:06,142 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 17:06:10,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 17:06:11,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 17:06:12,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:06:14,942 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 17:06:15,784 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 17:06:16,796 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.39 vs. limit=12.0 2023-09-29 17:06:17,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 17:06:17,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:06:19,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:06:22,521 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 17:06:27,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:06:28,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:06:28,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:06:30,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:06:30,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 17:06:31,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:06:32,002 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:06:33,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:06:33,487 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 17:06:33,735 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=431833.3333333333, ans=0.1 2023-09-29 17:06:38,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 17:06:39,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 17:06:41,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 17:06:43,252 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=431833.3333333333, ans=0.0 2023-09-29 17:06:44,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:06:45,142 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.76 vs. limit=12.0 2023-09-29 17:06:51,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:06:51,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:06:51,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:06:54,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:06:56,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 17:06:57,931 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:06:58,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 17:07:00,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 17:07:00,224 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:07:00,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:07:01,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:07:04,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 17:07:05,116 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=431900.0, ans=0.0 2023-09-29 17:07:07,640 INFO [train.py:1039] (2/4) Epoch 13, batch 1050, loss[loss=0.1928, simple_loss=0.2618, pruned_loss=0.06193, over 23642.00 frames. ], tot_loss[loss=0.1945, simple_loss=0.2662, pruned_loss=0.06137, over 4704385.78 frames. ], batch size: 149, lr: 8.07e-03, grad_scale: 16.0 2023-09-29 17:07:07,772 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:07:11,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:07:11,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:07:14,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 17:07:16,636 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:07:18,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:07:19,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:07:20,095 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=431966.6666666667, ans=0.125 2023-09-29 17:07:21,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:07:24,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:07:25,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:07:26,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 17:07:26,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:07:27,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 17:07:28,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:07:28,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 17:07:28,926 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.72 vs. limit=6.0 2023-09-29 17:07:33,212 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:07:33,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 17:07:33,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 17:07:39,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:07:39,908 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=432100.0, ans=0.0 2023-09-29 17:07:40,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:07:41,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:07:42,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 17:07:44,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 17:07:44,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:07:46,406 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=432100.0, ans=0.0 2023-09-29 17:07:49,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 17:07:50,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 17:07:51,695 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.43 vs. limit=15.0 2023-09-29 17:07:52,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:07:55,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 17:07:57,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 17:07:57,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:07:57,824 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:08:01,686 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=8.23 vs. limit=12.0 2023-09-29 17:08:02,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:08:05,526 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 17:08:09,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 17:08:09,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 17:08:09,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:08:09,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:08:10,724 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 17:08:14,126 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=432233.3333333333, ans=0.125 2023-09-29 17:08:15,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:08:15,502 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=432233.3333333333, ans=0.125 2023-09-29 17:08:15,597 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=432233.3333333333, ans=0.0 2023-09-29 17:08:18,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:08:18,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:08:18,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:08:18,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:08:22,346 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=10.07 vs. limit=15.0 2023-09-29 17:08:24,665 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.979e+02 2.212e+02 2.486e+02 3.871e+02, threshold=4.425e+02, percent-clipped=0.0 2023-09-29 17:08:24,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:08:24,824 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 17:08:26,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:08:26,409 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 17:08:26,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 17:08:27,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:08:29,425 INFO [train.py:1039] (2/4) Epoch 13, batch 1100, loss[loss=0.2158, simple_loss=0.2797, pruned_loss=0.07595, over 23744.00 frames. ], tot_loss[loss=0.1943, simple_loss=0.266, pruned_loss=0.06132, over 4701551.08 frames. ], batch size: 179, lr: 8.07e-03, grad_scale: 16.0 2023-09-29 17:08:29,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:08:36,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:08:41,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 17:08:43,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:08:43,579 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:08:43,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 17:08:45,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:08:48,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 17:08:49,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:08:53,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:08:53,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 17:08:54,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 17:08:54,913 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:08:54,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:08:58,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:08:59,609 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:09:05,039 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:09:08,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 17:09:10,244 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 17:09:10,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:09:13,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:09:15,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 17:09:15,502 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:09:17,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 17:09:17,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:09:18,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:09:18,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:09:18,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:09:18,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 17:09:24,830 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:09:26,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 17:09:28,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 17:09:30,276 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=432500.0, ans=0.125 2023-09-29 17:09:31,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 17:09:36,699 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 17:09:36,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 17:09:38,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:09:39,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:09:40,165 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=432566.6666666667, ans=0.07 2023-09-29 17:09:41,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:09:41,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 17:09:42,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:09:42,836 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:09:43,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 17:09:43,692 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:09:45,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 17:09:48,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:09:48,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:09:50,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:09:53,080 INFO [train.py:1039] (2/4) Epoch 13, batch 1150, loss[loss=0.1919, simple_loss=0.2631, pruned_loss=0.06036, over 23638.00 frames. ], tot_loss[loss=0.194, simple_loss=0.2662, pruned_loss=0.0609, over 4714392.68 frames. ], batch size: 149, lr: 8.07e-03, grad_scale: 16.0 2023-09-29 17:09:54,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:09:57,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:10:01,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:10:01,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:10:01,365 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 17:10:02,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:10:04,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 17:10:07,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:10:08,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 17:10:11,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=432700.0, ans=0.0 2023-09-29 17:10:12,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 17:10:15,916 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:10:16,274 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=432700.0, ans=0.5 2023-09-29 17:10:20,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:10:21,787 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:10:21,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 17:10:21,888 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:10:22,833 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_ff2.min_abs, batch_count=432700.0, ans=0.1 2023-09-29 17:10:23,298 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.37 vs. limit=15.0 2023-09-29 17:10:23,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:10:24,109 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=432766.6666666667, ans=0.125 2023-09-29 17:10:24,424 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.75 vs. limit=15.0 2023-09-29 17:10:27,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 17:10:28,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:10:30,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:10:38,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:10:45,482 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:10:46,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 17:10:46,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:10:46,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:10:52,192 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=7.53 vs. limit=15.0 2023-09-29 17:10:53,209 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 17:10:54,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:11:03,417 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 17:11:06,559 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:11:08,083 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:11:09,441 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.599e+02 1.852e+02 2.092e+02 2.448e+02 3.672e+02, threshold=4.183e+02, percent-clipped=0.0 2023-09-29 17:11:09,565 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 17:11:09,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:11:13,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:11:15,565 INFO [train.py:1039] (2/4) Epoch 13, batch 1200, loss[loss=0.1848, simple_loss=0.2619, pruned_loss=0.05386, over 24520.00 frames. ], tot_loss[loss=0.1938, simple_loss=0.2661, pruned_loss=0.06077, over 4721717.72 frames. ], batch size: 63, lr: 8.07e-03, grad_scale: 32.0 2023-09-29 17:11:17,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:11:17,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:11:18,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:11:18,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:11:19,110 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=432966.6666666667, ans=0.04949747468305833 2023-09-29 17:11:20,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:11:21,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:11:22,164 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=432966.6666666667, ans=0.125 2023-09-29 17:11:23,452 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 17:11:24,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:11:25,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:11:29,245 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 17:11:31,621 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 17:11:36,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:11:39,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:11:41,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:11:43,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:11:43,139 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 17:11:44,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:11:49,251 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.55 vs. limit=12.0 2023-09-29 17:11:51,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 17:11:51,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:11:53,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 17:11:53,718 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=433100.0, ans=0.0 2023-09-29 17:11:54,904 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:11:56,587 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=433100.0, ans=0.0 2023-09-29 17:11:59,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 17:11:59,725 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=433100.0, ans=0.125 2023-09-29 17:12:01,443 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.95 vs. limit=15.0 2023-09-29 17:12:03,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 17:12:04,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:12:05,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:12:07,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:12:09,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 17:12:11,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:12:11,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:12:13,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:12:13,572 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 17:12:13,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 17:12:15,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:12:15,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 17:12:18,783 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:12:18,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:12:22,566 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 17:12:24,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:12:27,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 17:12:31,732 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.24 vs. limit=15.0 2023-09-29 17:12:32,425 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 17:12:34,023 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:12:36,883 INFO [train.py:1039] (2/4) Epoch 13, batch 1250, loss[loss=0.2002, simple_loss=0.2839, pruned_loss=0.05824, over 23991.00 frames. ], tot_loss[loss=0.1936, simple_loss=0.2663, pruned_loss=0.06048, over 4720813.54 frames. ], batch size: 80, lr: 8.06e-03, grad_scale: 32.0 2023-09-29 17:12:37,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:12:38,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:12:40,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:12:41,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 17:12:46,553 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.59 vs. limit=22.5 2023-09-29 17:12:47,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:12:47,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:12:49,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 17:12:49,576 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=433300.0, ans=0.125 2023-09-29 17:12:50,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:12:52,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 17:12:58,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 17:12:59,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:13:01,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:13:01,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:13:02,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 17:13:04,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 17:13:04,757 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 17:13:04,765 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:13:06,320 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:13:06,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:13:09,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:13:10,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 17:13:16,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 17:13:16,267 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=433433.3333333333, ans=0.0 2023-09-29 17:13:17,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:13:21,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:13:21,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 17:13:21,560 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=433433.3333333333, ans=0.0 2023-09-29 17:13:22,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:13:22,793 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 17:13:24,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:13:24,184 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:13:29,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:13:35,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:13:35,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:13:37,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 17:13:37,516 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 17:13:37,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 17:13:39,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:13:39,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 17:13:39,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:13:39,993 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.83 vs. limit=15.0 2023-09-29 17:13:42,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 17:13:44,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:13:45,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 17:13:45,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 17:13:47,146 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:13:47,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 17:13:48,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:13:50,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 17:13:52,361 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:13:54,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:13:55,653 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.895e+02 2.072e+02 2.279e+02 3.563e+02, threshold=4.144e+02, percent-clipped=0.0 2023-09-29 17:13:55,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 17:13:57,520 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 17:14:00,452 INFO [train.py:1039] (2/4) Epoch 13, batch 1300, loss[loss=0.1797, simple_loss=0.2613, pruned_loss=0.04902, over 24394.00 frames. ], tot_loss[loss=0.1949, simple_loss=0.267, pruned_loss=0.06144, over 4708909.03 frames. ], batch size: 69, lr: 8.06e-03, grad_scale: 32.0 2023-09-29 17:14:02,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:14:02,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 17:14:04,349 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=433633.3333333333, ans=0.1 2023-09-29 17:14:07,143 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:14:10,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 17:14:11,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:14:13,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:14:13,712 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:14:15,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 17:14:18,563 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=433700.0, ans=0.125 2023-09-29 17:14:19,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:14:21,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 17:14:23,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 17:14:26,854 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=433700.0, ans=0.0 2023-09-29 17:14:27,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 17:14:31,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:14:33,388 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:14:33,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:14:33,736 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=433766.6666666667, ans=0.0 2023-09-29 17:14:36,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:14:36,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 17:14:38,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 17:14:38,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 17:14:46,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:14:46,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 17:14:46,788 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 17:14:48,260 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 17:14:48,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:14:51,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:14:51,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 17:14:52,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:14:53,012 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 17:14:55,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:14:59,599 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:14:59,603 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:15:02,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 17:15:04,779 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 17:15:06,281 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 17:15:10,834 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:15:14,441 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 17:15:15,956 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:15:22,683 INFO [train.py:1039] (2/4) Epoch 13, batch 1350, loss[loss=0.1971, simple_loss=0.2588, pruned_loss=0.06769, over 23723.00 frames. ], tot_loss[loss=0.1939, simple_loss=0.2655, pruned_loss=0.06117, over 4703899.55 frames. ], batch size: 164, lr: 8.06e-03, grad_scale: 16.0 2023-09-29 17:15:22,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 17:15:25,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:15:26,192 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=433966.6666666667, ans=0.125 2023-09-29 17:15:28,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:15:33,921 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:15:33,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:15:35,781 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=433966.6666666667, ans=0.1 2023-09-29 17:15:36,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:15:36,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:15:40,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:15:42,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 17:15:43,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 17:15:43,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:15:46,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 17:15:47,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:15:49,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:15:49,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 17:15:50,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 17:15:53,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 17:15:53,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:15:55,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 17:16:07,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:16:18,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:16:18,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:16:19,005 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 17:16:22,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:16:23,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 17:16:23,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 17:16:23,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:16:24,001 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer_ff2.min_abs, batch_count=434166.6666666667, ans=0.1 2023-09-29 17:16:26,930 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=434233.3333333333, ans=0.1 2023-09-29 17:16:28,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:16:30,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 17:16:31,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:16:37,088 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.49 vs. limit=15.0 2023-09-29 17:16:37,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 17:16:40,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 17:16:41,595 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 1.935e+02 2.126e+02 2.533e+02 4.347e+02, threshold=4.251e+02, percent-clipped=1.0 2023-09-29 17:16:43,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 17:16:44,927 INFO [train.py:1039] (2/4) Epoch 13, batch 1400, loss[loss=0.1691, simple_loss=0.2377, pruned_loss=0.0502, over 24289.00 frames. ], tot_loss[loss=0.1933, simple_loss=0.2648, pruned_loss=0.06097, over 4699413.77 frames. ], batch size: 56, lr: 8.05e-03, grad_scale: 16.0 2023-09-29 17:16:47,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:16:50,777 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:16:52,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:16:56,820 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 17:16:58,355 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 17:17:10,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:17:11,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:17:13,456 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=434366.6666666667, ans=0.125 2023-09-29 17:17:14,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:17:14,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 17:17:18,898 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:17:20,437 WARNING [train.py:1197] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 17:17:26,517 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.46 vs. limit=15.0 2023-09-29 17:17:30,189 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:17:30,283 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:17:33,529 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=434500.0, ans=0.125 2023-09-29 17:17:34,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 17:17:34,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:17:36,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:17:37,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:17:39,323 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:17:39,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:17:39,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:17:41,630 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:17:43,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 17:17:43,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:17:47,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:17:48,021 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=434500.0, ans=0.0 2023-09-29 17:17:51,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:18:01,767 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 17:18:02,085 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=434566.6666666667, ans=0.0 2023-09-29 17:18:03,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 17:18:03,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:18:06,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 17:18:07,634 INFO [train.py:1039] (2/4) Epoch 13, batch 1450, loss[loss=0.203, simple_loss=0.2683, pruned_loss=0.06885, over 23775.00 frames. ], tot_loss[loss=0.1925, simple_loss=0.2644, pruned_loss=0.06036, over 4705371.29 frames. ], batch size: 164, lr: 8.05e-03, grad_scale: 16.0 2023-09-29 17:18:07,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:18:12,111 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:18:13,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:18:17,286 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:18:17,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:18:17,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 17:18:22,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:18:22,722 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 17:18:25,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:18:25,700 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 17:18:27,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 17:18:27,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 17:18:28,075 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=434700.0, ans=0.125 2023-09-29 17:18:29,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:18:29,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:18:29,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 17:18:31,573 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:18:33,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 17:18:33,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 17:18:33,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:18:34,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:18:36,561 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:18:38,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:18:42,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:18:44,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:18:45,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:18:45,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:18:48,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:18:48,771 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:18:48,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:18:50,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:18:53,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 17:18:55,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:19:00,113 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 17:19:02,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:19:03,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:19:03,866 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:19:05,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 17:19:09,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:19:11,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 17:19:12,875 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=434900.0, ans=0.035 2023-09-29 17:19:14,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 17:19:15,635 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:19:17,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:19:18,891 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:19:20,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 17:19:22,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 17:19:23,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 17:19:25,155 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:19:25,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 17:19:27,163 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.823e+02 1.982e+02 2.343e+02 3.097e+02, threshold=3.963e+02, percent-clipped=0.0 2023-09-29 17:19:30,425 INFO [train.py:1039] (2/4) Epoch 13, batch 1500, loss[loss=0.2085, simple_loss=0.2754, pruned_loss=0.07083, over 23258.00 frames. ], tot_loss[loss=0.193, simple_loss=0.2651, pruned_loss=0.06044, over 4714774.15 frames. ], batch size: 105, lr: 8.05e-03, grad_scale: 16.0 2023-09-29 17:19:32,413 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=434966.6666666667, ans=0.125 2023-09-29 17:19:38,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 17:19:39,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:19:39,039 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:19:40,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:19:42,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:19:42,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:19:44,228 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 17:19:45,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:19:45,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 17:19:45,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:19:47,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:19:47,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:19:49,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:19:51,210 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=435033.3333333333, ans=0.05 2023-09-29 17:19:54,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:19:54,046 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 17:19:54,779 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.55 vs. limit=15.0 2023-09-29 17:19:55,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 17:19:55,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:19:57,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:20:01,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 17:20:06,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 17:20:08,625 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:20:08,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 17:20:11,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 17:20:13,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:20:14,841 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:20:16,940 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:20:18,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 17:20:18,570 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:20:18,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:20:20,062 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 17:20:20,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:20:26,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:20:26,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 17:20:29,741 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=435166.6666666667, ans=0.0 2023-09-29 17:20:32,485 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 17:20:34,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 17:20:39,156 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 17:20:39,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:20:40,607 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 17:20:40,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:20:42,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:20:42,792 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 17:20:44,259 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 17:20:47,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 17:20:48,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:20:52,191 INFO [train.py:1039] (2/4) Epoch 13, batch 1550, loss[loss=0.1924, simple_loss=0.2592, pruned_loss=0.06282, over 23879.00 frames. ], tot_loss[loss=0.1935, simple_loss=0.2653, pruned_loss=0.06085, over 4722572.37 frames. ], batch size: 195, lr: 8.04e-03, grad_scale: 16.0 2023-09-29 17:20:54,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:20:54,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:20:54,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:20:55,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:20:56,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:20:57,786 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 17:20:57,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 17:20:59,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:21:00,827 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 17:21:00,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 17:21:02,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:21:04,073 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:21:05,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:21:05,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:21:07,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:21:07,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:21:10,239 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 17:21:10,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:21:10,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:21:10,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 17:21:10,706 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=435366.6666666667, ans=0.125 2023-09-29 17:21:15,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 17:21:15,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 17:21:17,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:21:17,681 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 17:21:19,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 17:21:19,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 17:21:19,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:21:19,549 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=435366.6666666667, ans=0.125 2023-09-29 17:21:20,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:21:25,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:21:27,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 17:21:27,576 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 17:21:35,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:21:40,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:21:40,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 17:21:40,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:21:40,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 17:21:46,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 17:21:49,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:21:51,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:21:53,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:21:53,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:21:55,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 17:21:55,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 17:21:57,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:21:57,273 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=435566.6666666667, ans=0.0 2023-09-29 17:21:58,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:21:58,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 17:21:58,675 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 17:22:01,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:22:08,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 17:22:09,252 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.52 vs. limit=10.0 2023-09-29 17:22:11,512 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 2.000e+02 2.251e+02 2.787e+02 4.721e+02, threshold=4.502e+02, percent-clipped=2.0 2023-09-29 17:22:11,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:22:13,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:22:14,589 INFO [train.py:1039] (2/4) Epoch 13, batch 1600, loss[loss=0.1961, simple_loss=0.2724, pruned_loss=0.05992, over 23388.00 frames. ], tot_loss[loss=0.1938, simple_loss=0.2657, pruned_loss=0.06094, over 4724556.71 frames. ], batch size: 93, lr: 8.04e-03, grad_scale: 32.0 2023-09-29 17:22:14,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 17:22:16,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 17:22:16,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:22:16,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:22:16,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:22:18,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:22:22,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:22:22,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 17:22:24,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 17:22:26,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 17:22:28,131 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:22:28,337 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=435633.3333333333, ans=0.2 2023-09-29 17:22:30,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 17:22:31,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:22:34,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:22:39,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:22:44,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 17:22:44,918 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=435700.0, ans=0.04949747468305833 2023-09-29 17:22:46,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:22:46,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 17:22:46,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:22:47,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 17:22:51,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 17:22:58,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:23:00,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 17:23:00,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:23:02,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:23:02,257 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:23:06,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 17:23:09,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 17:23:11,463 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:23:13,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:23:14,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:23:14,884 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:23:16,441 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:23:17,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:23:19,430 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:23:25,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:23:27,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:23:30,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 17:23:30,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:23:30,326 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 17:23:30,591 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=435900.0, ans=0.125 2023-09-29 17:23:35,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:23:37,182 INFO [train.py:1039] (2/4) Epoch 13, batch 1650, loss[loss=0.1673, simple_loss=0.2466, pruned_loss=0.04404, over 24568.00 frames. ], tot_loss[loss=0.1953, simple_loss=0.2669, pruned_loss=0.06185, over 4731521.25 frames. ], batch size: 60, lr: 8.04e-03, grad_scale: 32.0 2023-09-29 17:23:37,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:23:37,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:23:37,440 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 17:23:38,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 17:23:38,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 17:23:38,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 17:23:43,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:23:43,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:23:45,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:23:45,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 17:23:48,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:23:50,189 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 17:23:50,320 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=435966.6666666667, ans=0.1 2023-09-29 17:23:50,338 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=435966.6666666667, ans=0.2 2023-09-29 17:23:53,220 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:23:54,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:23:54,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:23:54,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:23:54,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 17:23:54,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 17:23:59,959 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 17:24:01,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 17:24:05,608 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=436033.3333333333, ans=0.125 2023-09-29 17:24:11,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 17:24:13,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:24:14,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 17:24:18,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:24:20,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:24:20,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:24:21,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:24:23,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:24:23,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:24:26,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:24:27,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:24:28,004 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=436166.6666666667, ans=0.125 2023-09-29 17:24:29,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:24:29,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:24:30,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:24:30,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 17:24:36,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:24:36,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 17:24:39,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:24:39,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 17:24:41,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 17:24:41,627 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 17:24:43,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:24:43,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:24:43,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:24:44,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:24:44,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 17:24:50,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:24:50,163 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=436233.3333333333, ans=0.1 2023-09-29 17:24:51,581 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:24:51,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:24:51,972 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=436233.3333333333, ans=0.125 2023-09-29 17:24:55,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 17:24:56,551 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.665e+02 2.021e+02 2.239e+02 2.775e+02 4.189e+02, threshold=4.478e+02, percent-clipped=0.0 2023-09-29 17:24:59,919 INFO [train.py:1039] (2/4) Epoch 13, batch 1700, loss[loss=0.1869, simple_loss=0.2427, pruned_loss=0.06559, over 23642.00 frames. ], tot_loss[loss=0.1955, simple_loss=0.267, pruned_loss=0.06199, over 4712284.11 frames. ], batch size: 256, lr: 8.03e-03, grad_scale: 32.0 2023-09-29 17:25:00,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:25:00,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:25:00,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 17:25:00,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:25:00,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:25:00,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:25:03,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:25:04,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:25:04,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 17:25:07,696 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 17:25:16,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:25:18,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:25:24,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 17:25:24,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:25:26,864 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:25:26,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:25:29,919 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 17:25:32,980 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:25:33,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:25:33,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:25:34,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 17:25:36,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 17:25:37,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 17:25:39,379 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:25:40,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 17:25:44,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:25:53,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:25:53,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:25:54,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:25:56,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 17:25:56,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 17:25:57,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:25:59,441 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:25:59,442 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 17:26:01,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:26:01,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:26:01,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:26:01,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:26:04,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:26:04,700 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:26:04,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:26:04,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:26:04,992 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:26:09,796 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:26:11,232 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 17:26:11,933 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.90 vs. limit=12.0 2023-09-29 17:26:14,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:26:16,279 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:26:17,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 17:26:22,898 INFO [train.py:1039] (2/4) Epoch 13, batch 1750, loss[loss=0.1901, simple_loss=0.2559, pruned_loss=0.06213, over 23380.00 frames. ], tot_loss[loss=0.1937, simple_loss=0.2654, pruned_loss=0.06097, over 4715109.38 frames. ], batch size: 285, lr: 8.03e-03, grad_scale: 32.0 2023-09-29 17:26:24,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:26:26,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:26:28,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 17:26:28,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 17:26:28,431 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:26:32,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:26:32,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:26:39,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 17:26:40,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:26:43,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 17:26:44,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:26:45,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:26:48,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 17:26:50,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 17:26:52,163 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:26:52,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 17:26:57,756 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=436766.6666666667, ans=0.125 2023-09-29 17:27:00,765 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:27:00,962 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=436766.6666666667, ans=0.125 2023-09-29 17:27:01,115 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=436766.6666666667, ans=0.125 2023-09-29 17:27:04,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:27:04,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:27:07,567 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:27:07,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:27:11,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:27:12,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:27:14,279 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:27:14,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:27:15,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 17:27:16,209 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=436833.3333333333, ans=0.5 2023-09-29 17:27:17,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:27:20,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 17:27:21,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:27:22,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:27:22,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:27:28,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 17:27:28,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 17:27:29,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:27:32,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:27:37,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:27:37,910 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=436900.0, ans=0.0 2023-09-29 17:27:40,259 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=13.23 vs. limit=22.5 2023-09-29 17:27:40,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:27:42,032 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.899e+02 2.041e+02 2.421e+02 3.023e+02, threshold=4.083e+02, percent-clipped=0.0 2023-09-29 17:27:43,647 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:27:45,437 INFO [train.py:1039] (2/4) Epoch 13, batch 1800, loss[loss=0.2079, simple_loss=0.2689, pruned_loss=0.07344, over 23659.00 frames. ], tot_loss[loss=0.193, simple_loss=0.2647, pruned_loss=0.06061, over 4712873.17 frames. ], batch size: 164, lr: 8.03e-03, grad_scale: 32.0 2023-09-29 17:27:45,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 17:27:45,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:27:48,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 17:27:48,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:27:48,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:27:48,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:27:48,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:27:51,871 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:27:51,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:27:55,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 17:27:56,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:27:59,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 17:28:01,386 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:28:04,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:28:08,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:28:08,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:28:08,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:28:12,380 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:28:12,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 17:28:13,940 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:28:14,194 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=437033.3333333333, ans=0.2 2023-09-29 17:28:14,272 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=437033.3333333333, ans=0.1 2023-09-29 17:28:18,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:28:22,043 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 17:28:23,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 17:28:23,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 17:28:25,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:28:26,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:28:26,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:28:28,065 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:28:32,926 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 17:28:33,846 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.95 vs. limit=15.0 2023-09-29 17:28:34,502 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:28:38,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:28:38,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 17:28:39,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 17:28:40,532 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=437166.6666666667, ans=0.1 2023-09-29 17:28:41,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:28:43,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:28:45,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 17:28:48,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 17:28:54,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:28:56,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 17:28:56,112 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:28:56,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:28:56,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:28:58,324 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 17:29:01,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:29:01,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:29:03,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 17:29:03,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:29:04,809 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:29:04,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 17:29:04,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:29:07,685 INFO [train.py:1039] (2/4) Epoch 13, batch 1850, loss[loss=0.2053, simple_loss=0.2835, pruned_loss=0.06353, over 23914.00 frames. ], tot_loss[loss=0.1934, simple_loss=0.2652, pruned_loss=0.06075, over 4718653.63 frames. ], batch size: 86, lr: 8.03e-03, grad_scale: 16.0 2023-09-29 17:29:07,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:29:07,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:29:09,503 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:29:10,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:29:11,836 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=437300.0, ans=0.0 2023-09-29 17:29:14,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:29:14,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:29:22,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:29:22,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 17:29:26,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 17:29:29,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 17:29:31,665 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=437366.6666666667, ans=0.0 2023-09-29 17:29:32,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:29:34,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 17:29:34,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 17:29:40,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=437433.3333333333, ans=0.125 2023-09-29 17:29:41,236 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=5.71 vs. limit=12.0 2023-09-29 17:29:43,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:29:47,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 17:29:47,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:29:49,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:29:51,931 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=437433.3333333333, ans=10.0 2023-09-29 17:29:52,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 17:29:52,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:29:54,238 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 17:29:54,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:29:58,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:30:01,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:30:05,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 17:30:05,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:30:05,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 17:30:05,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:30:08,543 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:30:10,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:30:13,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 17:30:13,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:30:13,676 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=437566.6666666667, ans=0.125 2023-09-29 17:30:15,049 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=437566.6666666667, ans=0.125 2023-09-29 17:30:16,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:30:17,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:30:17,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 17:30:17,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 17:30:18,095 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=437566.6666666667, ans=0.125 2023-09-29 17:30:20,058 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 17:30:21,353 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 17:30:24,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 17:30:24,937 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:30:24,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:30:24,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:30:25,099 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 17:30:25,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:30:27,187 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:30:28,476 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.931e+02 2.222e+02 2.775e+02 3.962e+02, threshold=4.445e+02, percent-clipped=0.0 2023-09-29 17:30:28,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 17:30:28,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 17:30:30,309 INFO [train.py:1039] (2/4) Epoch 13, batch 1900, loss[loss=0.1817, simple_loss=0.2547, pruned_loss=0.05435, over 17210.00 frames. ], tot_loss[loss=0.1948, simple_loss=0.2664, pruned_loss=0.06164, over 4707892.03 frames. ], batch size: 37, lr: 8.02e-03, grad_scale: 16.0 2023-09-29 17:30:30,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:30:30,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 17:30:33,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:30:33,568 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 17:30:33,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:30:35,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:30:40,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:30:43,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:30:43,751 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 17:30:45,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 17:30:45,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:30:46,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:30:46,816 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 17:30:48,257 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 17:30:51,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 17:30:52,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:30:55,575 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=437700.0, ans=0.1 2023-09-29 17:30:56,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 17:31:00,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 17:31:00,821 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=437700.0, ans=0.0 2023-09-29 17:31:11,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 17:31:13,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 17:31:14,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:31:14,888 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 17:31:14,894 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 17:31:14,961 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 17:31:16,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 17:31:16,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:31:21,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 17:31:21,297 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=437833.3333333333, ans=0.125 2023-09-29 17:31:25,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:31:27,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:31:27,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 17:31:28,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:31:34,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 17:31:34,634 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=437900.0, ans=0.0 2023-09-29 17:31:36,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:31:41,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:31:41,189 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:31:41,212 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:31:42,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:31:46,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 17:31:46,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 17:31:46,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:31:49,260 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:31:49,263 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:31:51,384 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.58 vs. limit=15.0 2023-09-29 17:31:52,142 INFO [train.py:1039] (2/4) Epoch 13, batch 1950, loss[loss=0.177, simple_loss=0.2544, pruned_loss=0.04976, over 24297.00 frames. ], tot_loss[loss=0.1947, simple_loss=0.2666, pruned_loss=0.06147, over 4714707.83 frames. ], batch size: 61, lr: 8.02e-03, grad_scale: 8.0 2023-09-29 17:31:52,238 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:31:52,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:31:52,314 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:31:53,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:31:55,763 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:32:00,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:32:00,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:32:00,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 17:32:01,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 17:32:04,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 17:32:04,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:32:06,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:32:08,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:32:10,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:32:10,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:32:12,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:32:16,237 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:32:16,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 17:32:17,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:32:17,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:32:20,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:32:22,737 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=438033.3333333333, ans=0.125 2023-09-29 17:32:24,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:32:24,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:32:24,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 17:32:24,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 17:32:24,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 17:32:24,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:32:25,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:32:29,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:32:29,437 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=438100.0, ans=0.0 2023-09-29 17:32:30,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:32:35,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 17:32:38,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:32:39,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 17:32:40,345 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 17:32:40,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:32:45,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:32:47,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:32:48,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 17:32:56,051 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:32:57,459 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:32:59,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:33:00,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:33:03,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:33:03,825 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:33:05,223 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 17:33:05,240 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 17:33:06,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:33:08,231 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 17:33:09,978 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=438233.3333333333, ans=0.0 2023-09-29 17:33:11,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:33:14,894 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 2.051e+02 2.174e+02 2.503e+02 4.017e+02, threshold=4.347e+02, percent-clipped=0.0 2023-09-29 17:33:14,936 INFO [train.py:1039] (2/4) Epoch 13, batch 2000, loss[loss=0.1894, simple_loss=0.2505, pruned_loss=0.06417, over 22603.00 frames. ], tot_loss[loss=0.1951, simple_loss=0.2675, pruned_loss=0.06135, over 4725762.41 frames. ], batch size: 322, lr: 8.02e-03, grad_scale: 16.0 2023-09-29 17:33:15,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 17:33:17,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:33:17,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:33:18,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:33:21,774 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:33:24,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 17:33:25,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 17:33:28,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:33:28,824 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=438300.0, ans=0.125 2023-09-29 17:33:30,379 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=438366.6666666667, ans=0.125 2023-09-29 17:33:31,515 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 17:33:33,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 17:33:33,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:33:36,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:33:37,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 17:33:40,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:33:43,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:33:43,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:33:44,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 17:33:45,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 17:33:46,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 17:33:46,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:33:50,735 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:33:52,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 17:33:52,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:33:52,423 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=438433.3333333333, ans=0.1 2023-09-29 17:33:54,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:33:54,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:33:56,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 17:34:00,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 17:34:00,028 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:34:00,041 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:34:06,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:34:07,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:34:07,619 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:34:07,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:34:09,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:34:10,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:34:10,895 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:34:10,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:34:12,458 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:34:15,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:34:15,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 17:34:19,012 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=438566.6666666667, ans=0.0 2023-09-29 17:34:20,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=438566.6666666667, ans=0.0 2023-09-29 17:34:21,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 17:34:23,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:34:27,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:34:27,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:34:29,914 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.15 vs. limit=22.5 2023-09-29 17:34:33,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:34:35,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:34:35,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:34:36,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 17:34:36,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 17:34:37,033 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=438633.3333333333, ans=0.1 2023-09-29 17:34:37,992 INFO [train.py:1039] (2/4) Epoch 13, batch 2050, loss[loss=0.2001, simple_loss=0.2452, pruned_loss=0.07754, over 19505.00 frames. ], tot_loss[loss=0.1952, simple_loss=0.2673, pruned_loss=0.06158, over 4715567.78 frames. ], batch size: 388, lr: 8.01e-03, grad_scale: 16.0 2023-09-29 17:34:38,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:34:39,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:34:42,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:34:44,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:34:49,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:34:52,187 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:34:52,281 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:34:52,479 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=438700.0, ans=0.125 2023-09-29 17:34:53,827 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:34:54,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 17:34:54,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:34:55,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:34:55,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 17:35:09,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:35:09,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:35:11,521 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 17:35:12,165 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.12 vs. limit=6.0 2023-09-29 17:35:13,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:35:14,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 17:35:14,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:35:19,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:35:20,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:35:22,465 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 17:35:22,527 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:35:24,065 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:35:25,590 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:35:25,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:35:30,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:35:32,213 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:35:35,721 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 17:35:35,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:35:39,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:35:44,441 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:35:46,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 17:35:50,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:35:52,225 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:35:53,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:35:55,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 17:35:59,775 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.857e+02 2.010e+02 2.339e+02 3.458e+02, threshold=4.021e+02, percent-clipped=0.0 2023-09-29 17:35:59,819 INFO [train.py:1039] (2/4) Epoch 13, batch 2100, loss[loss=0.1864, simple_loss=0.2732, pruned_loss=0.04977, over 24432.00 frames. ], tot_loss[loss=0.1944, simple_loss=0.2658, pruned_loss=0.06151, over 4709421.22 frames. ], batch size: 66, lr: 8.01e-03, grad_scale: 16.0 2023-09-29 17:35:59,960 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 17:35:59,961 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:36:01,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:36:01,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 17:36:03,001 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:36:03,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 17:36:03,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 17:36:05,306 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:36:08,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:36:09,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:36:11,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:36:11,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:36:11,673 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 17:36:13,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:36:13,767 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 17:36:13,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 17:36:15,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:36:17,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:36:17,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 17:36:17,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 17:36:23,380 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 17:36:23,382 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 17:36:27,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:36:27,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:36:28,177 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=439033.3333333333, ans=0.125 2023-09-29 17:36:30,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:36:31,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 17:36:32,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:36:32,476 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 17:36:34,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 17:36:35,506 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:36:35,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 17:36:37,696 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 17:36:37,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 17:36:39,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 17:36:42,284 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:36:44,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 17:36:44,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 17:36:47,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:36:49,190 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:36:49,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 17:36:49,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:36:49,475 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=439166.6666666667, ans=0.2 2023-09-29 17:36:51,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:36:52,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:36:52,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 17:36:53,999 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 17:36:54,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 17:36:56,304 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.66 vs. limit=6.0 2023-09-29 17:36:58,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:37:01,788 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:37:02,625 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.12 vs. limit=15.0 2023-09-29 17:37:03,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 17:37:07,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:37:09,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:37:10,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:37:10,397 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:37:10,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 17:37:10,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 17:37:13,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:37:13,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:37:14,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:37:14,995 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:37:16,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 17:37:18,287 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 17:37:18,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:37:21,812 INFO [train.py:1039] (2/4) Epoch 13, batch 2150, loss[loss=0.1998, simple_loss=0.2841, pruned_loss=0.05774, over 24560.00 frames. ], tot_loss[loss=0.1929, simple_loss=0.2647, pruned_loss=0.06061, over 4720056.03 frames. ], batch size: 71, lr: 8.01e-03, grad_scale: 16.0 2023-09-29 17:37:22,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:37:22,028 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:37:23,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:37:23,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:37:30,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 17:37:31,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:37:33,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:37:34,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:37:34,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:37:36,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:37:36,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=439366.6666666667, ans=0.0 2023-09-29 17:37:39,497 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:37:40,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:37:40,974 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:37:42,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:37:42,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 17:37:47,568 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=439366.6666666667, ans=0.1 2023-09-29 17:37:48,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:37:50,248 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:37:52,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:37:52,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:37:52,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:37:53,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 17:37:53,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:37:53,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:37:55,306 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:37:56,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 17:37:58,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:37:59,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:37:59,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:38:01,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 17:38:01,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:38:02,399 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.30 vs. limit=15.0 2023-09-29 17:38:03,894 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.78 vs. limit=10.0 2023-09-29 17:38:05,222 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:38:06,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:38:06,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:38:06,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 17:38:06,937 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 17:38:08,487 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=439433.3333333333, ans=0.125 2023-09-29 17:38:11,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:38:12,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:38:13,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:38:14,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 17:38:14,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:38:16,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:38:16,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 17:38:18,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 17:38:19,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:38:19,987 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 17:38:22,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:38:22,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:38:23,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 17:38:23,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:38:23,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 17:38:23,599 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 17:38:23,600 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 17:38:23,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 17:38:25,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:38:26,797 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:38:26,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:38:28,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:38:29,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 17:38:31,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:38:31,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:38:39,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:38:41,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 17:38:44,263 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.622e+02 1.873e+02 2.053e+02 2.392e+02 4.399e+02, threshold=4.106e+02, percent-clipped=1.0 2023-09-29 17:38:44,305 INFO [train.py:1039] (2/4) Epoch 13, batch 2200, loss[loss=0.1928, simple_loss=0.2689, pruned_loss=0.05832, over 24449.00 frames. ], tot_loss[loss=0.1938, simple_loss=0.2653, pruned_loss=0.06114, over 4715728.07 frames. ], batch size: 69, lr: 8.00e-03, grad_scale: 16.0 2023-09-29 17:38:44,518 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:38:49,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:38:50,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:38:53,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:38:53,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:38:57,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:38:57,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:38:57,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 17:39:03,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 17:39:03,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 17:39:06,890 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=439700.0, ans=0.2 2023-09-29 17:39:08,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 17:39:11,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:39:11,325 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=439700.0, ans=0.125 2023-09-29 17:39:12,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:39:13,892 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:39:14,435 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=5.09 vs. limit=15.0 2023-09-29 17:39:18,974 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:39:19,027 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 17:39:20,656 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=439766.6666666667, ans=0.0 2023-09-29 17:39:23,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 17:39:26,548 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:39:28,589 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 17:39:31,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:39:33,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:39:35,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:39:37,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:39:38,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 17:39:40,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:39:41,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 17:39:44,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:39:44,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 17:39:44,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:39:48,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:39:48,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:39:48,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:39:48,216 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:39:48,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 17:39:49,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:39:51,433 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 17:39:54,579 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 17:39:54,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:39:58,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 17:39:58,434 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 17:40:01,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 17:40:01,900 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 17:40:02,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 17:40:03,480 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 17:40:03,874 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=439900.0, ans=0.0 2023-09-29 17:40:05,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:40:05,099 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 17:40:07,066 INFO [train.py:1039] (2/4) Epoch 13, batch 2250, loss[loss=0.1884, simple_loss=0.2679, pruned_loss=0.05442, over 24652.00 frames. ], tot_loss[loss=0.1945, simple_loss=0.2659, pruned_loss=0.06153, over 4717566.03 frames. ], batch size: 65, lr: 8.00e-03, grad_scale: 16.0 2023-09-29 17:40:08,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:40:08,775 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 17:40:11,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:40:13,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:40:19,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:40:21,073 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:40:25,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:40:27,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:40:27,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:40:28,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 17:40:30,839 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:40:30,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:40:33,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 17:40:34,068 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:40:34,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:40:37,627 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:40:43,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:40:45,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 17:40:45,239 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 17:40:45,444 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=440100.0, ans=0.125 2023-09-29 17:40:46,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 17:40:48,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:40:49,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:40:55,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:40:57,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:40:58,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:40:58,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:41:00,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:41:02,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:41:06,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:41:07,786 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 17:41:15,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 17:41:16,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 17:41:18,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:41:23,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 17:41:26,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 17:41:26,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 17:41:26,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:41:26,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:41:27,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 17:41:29,317 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.929e+02 2.161e+02 2.428e+02 3.244e+02, threshold=4.321e+02, percent-clipped=0.0 2023-09-29 17:41:29,360 INFO [train.py:1039] (2/4) Epoch 13, batch 2300, loss[loss=0.1861, simple_loss=0.2665, pruned_loss=0.05289, over 24398.00 frames. ], tot_loss[loss=0.1935, simple_loss=0.2655, pruned_loss=0.0608, over 4728033.00 frames. ], batch size: 77, lr: 8.00e-03, grad_scale: 16.0 2023-09-29 17:41:32,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:41:33,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:41:38,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:41:38,910 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:41:42,394 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 17:41:43,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:41:50,125 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=440366.6666666667, ans=0.1 2023-09-29 17:41:52,810 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:41:52,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 17:41:54,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:41:54,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:41:54,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 17:41:55,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:41:58,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:41:59,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:42:02,199 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:42:06,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:42:08,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:42:13,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:42:13,295 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:42:15,437 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=17.74 vs. limit=22.5 2023-09-29 17:42:16,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:42:19,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:42:23,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:42:24,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 17:42:25,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:42:25,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 17:42:30,354 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 17:42:30,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:42:31,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:42:32,008 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:42:33,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:42:34,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 17:42:34,919 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 17:42:35,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 17:42:35,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:42:35,034 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:42:35,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 17:42:41,541 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:42:44,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:42:49,394 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:42:49,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:42:51,352 INFO [train.py:1039] (2/4) Epoch 13, batch 2350, loss[loss=0.1744, simple_loss=0.2525, pruned_loss=0.04817, over 24461.00 frames. ], tot_loss[loss=0.1942, simple_loss=0.2666, pruned_loss=0.06093, over 4734292.94 frames. ], batch size: 63, lr: 8.00e-03, grad_scale: 16.0 2023-09-29 17:42:51,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 17:42:51,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 17:42:51,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:42:51,866 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=440633.3333333333, ans=0.125 2023-09-29 17:42:53,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 17:42:53,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 17:42:59,227 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=440633.3333333333, ans=0.0 2023-09-29 17:43:00,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:43:00,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 17:43:05,131 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten.whitening_limit, batch_count=440633.3333333333, ans=22.5 2023-09-29 17:43:07,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 17:43:10,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:43:13,610 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:43:14,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:43:14,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:43:14,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:43:15,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 17:43:18,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:43:25,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 17:43:26,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:43:30,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 17:43:30,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:43:32,428 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=440766.6666666667, ans=0.125 2023-09-29 17:43:33,670 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:43:35,912 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 17:43:35,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:43:38,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:43:39,014 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:43:39,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:43:42,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:43:43,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 17:43:45,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:43:46,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:43:46,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:43:49,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 17:43:49,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:43:53,553 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.65 vs. limit=15.0 2023-09-29 17:43:54,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 17:43:54,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 17:43:59,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 17:44:04,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 17:44:04,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:44:04,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 17:44:05,805 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 17:44:05,859 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 17:44:06,373 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=440900.0, ans=0.0 2023-09-29 17:44:07,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 17:44:10,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:44:12,042 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=440900.0, ans=0.0 2023-09-29 17:44:14,111 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.15 vs. limit=15.0 2023-09-29 17:44:14,659 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.459e+02 1.807e+02 2.059e+02 2.357e+02 3.650e+02, threshold=4.118e+02, percent-clipped=0.0 2023-09-29 17:44:14,708 INFO [train.py:1039] (2/4) Epoch 13, batch 2400, loss[loss=0.2022, simple_loss=0.2779, pruned_loss=0.06323, over 23230.00 frames. ], tot_loss[loss=0.1934, simple_loss=0.2657, pruned_loss=0.06054, over 4742578.87 frames. ], batch size: 105, lr: 7.99e-03, grad_scale: 32.0 2023-09-29 17:44:14,819 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:44:17,977 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:44:20,935 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:44:21,042 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 17:44:21,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 17:44:30,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 17:44:30,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:44:32,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 17:44:32,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:44:32,477 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:44:33,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 17:44:40,720 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:44:42,960 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 17:44:47,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 17:44:50,652 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 17:44:53,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:44:55,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:44:56,008 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.68 vs. limit=22.5 2023-09-29 17:44:59,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:44:59,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 17:44:59,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 17:45:08,707 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:45:12,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:45:14,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:45:16,110 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:45:17,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 17:45:17,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:45:17,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:45:17,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:45:19,169 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 17:45:23,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:45:24,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 17:45:24,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 17:45:25,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 17:45:26,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:45:26,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:45:26,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 17:45:28,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 17:45:29,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 17:45:29,852 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 17:45:31,537 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 17:45:32,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:45:34,423 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:45:34,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:45:35,902 INFO [train.py:1039] (2/4) Epoch 13, batch 2450, loss[loss=0.177, simple_loss=0.2415, pruned_loss=0.05628, over 23713.00 frames. ], tot_loss[loss=0.1919, simple_loss=0.2633, pruned_loss=0.06026, over 4701756.54 frames. ], batch size: 135, lr: 7.99e-03, grad_scale: 32.0 2023-09-29 17:45:36,038 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 17:45:36,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:45:37,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 17:45:42,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:45:42,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:45:48,724 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:45:48,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:45:50,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 17:45:56,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:45:56,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:45:59,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:45:59,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:45:59,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:45:59,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 17:46:04,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:46:05,820 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 17:46:07,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:46:09,034 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=441433.3333333333, ans=0.125 2023-09-29 17:46:10,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 17:46:10,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:46:11,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:46:12,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:46:13,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 17:46:15,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:46:23,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:46:24,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:46:24,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:46:25,019 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:46:25,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:46:26,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:46:28,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 17:46:28,383 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=441500.0, ans=0.0 2023-09-29 17:46:31,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:46:31,284 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:46:31,607 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=441500.0, ans=0.125 2023-09-29 17:46:35,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:46:35,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:46:40,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:46:40,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 17:46:42,145 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:46:43,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:46:45,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 17:46:45,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:46:46,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:46:49,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:46:53,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:46:53,186 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:46:58,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 17:46:59,951 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.565e+02 1.935e+02 2.161e+02 2.595e+02 3.888e+02, threshold=4.322e+02, percent-clipped=0.0 2023-09-29 17:46:59,996 INFO [train.py:1039] (2/4) Epoch 13, batch 2500, loss[loss=0.1796, simple_loss=0.2482, pruned_loss=0.05554, over 23578.00 frames. ], tot_loss[loss=0.1916, simple_loss=0.2633, pruned_loss=0.05997, over 4699616.58 frames. ], batch size: 134, lr: 7.99e-03, grad_scale: 32.0 2023-09-29 17:47:00,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:47:06,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:47:15,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:47:15,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:47:17,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:47:17,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 17:47:25,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 17:47:25,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:47:28,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 17:47:29,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 17:47:29,788 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 17:47:29,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:47:29,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:47:31,994 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 17:47:32,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:47:32,144 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 17:47:33,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:47:38,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:47:39,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:47:41,628 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 17:47:41,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 17:47:41,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:47:43,592 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=441766.6666666667, ans=0.0 2023-09-29 17:47:44,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:47:47,944 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:47:51,150 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:47:54,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:47:57,099 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.68 vs. limit=6.0 2023-09-29 17:48:00,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 17:48:05,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 17:48:05,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:48:05,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 17:48:06,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:48:06,919 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 17:48:08,471 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 17:48:08,472 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 17:48:08,481 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 17:48:12,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:48:15,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 17:48:17,345 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 17:48:17,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:48:18,081 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.21 vs. limit=15.0 2023-09-29 17:48:18,935 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 17:48:22,049 INFO [train.py:1039] (2/4) Epoch 13, batch 2550, loss[loss=0.1626, simple_loss=0.234, pruned_loss=0.04557, over 24393.00 frames. ], tot_loss[loss=0.1921, simple_loss=0.2639, pruned_loss=0.06021, over 4703161.42 frames. ], batch size: 56, lr: 7.98e-03, grad_scale: 32.0 2023-09-29 17:48:22,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 17:48:25,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:48:26,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:48:26,857 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:48:28,565 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:48:30,803 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 17:48:30,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 17:48:31,120 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=441966.6666666667, ans=0.1 2023-09-29 17:48:36,043 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 17:48:39,554 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:48:43,007 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:48:43,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:48:43,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 17:48:44,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 17:48:44,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:48:44,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:48:47,913 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:48:47,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 17:48:48,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 17:48:48,010 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:48:48,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 17:48:52,969 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=442033.3333333333, ans=0.0 2023-09-29 17:48:59,220 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=442100.0, ans=0.125 2023-09-29 17:49:00,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:49:05,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:49:05,626 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:49:05,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:49:07,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 17:49:08,222 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=442100.0, ans=0.2 2023-09-29 17:49:12,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:49:16,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 17:49:17,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:49:17,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:49:17,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 17:49:19,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 17:49:22,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:49:23,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:49:25,423 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=442166.6666666667, ans=0.125 2023-09-29 17:49:26,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:49:26,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 17:49:26,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:49:28,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:49:29,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:49:31,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 17:49:32,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:49:33,015 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=442233.3333333333, ans=0.125 2023-09-29 17:49:34,659 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=442233.3333333333, ans=0.125 2023-09-29 17:49:34,659 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=442233.3333333333, ans=0.0 2023-09-29 17:49:37,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:49:41,850 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:49:44,604 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.935e+02 2.262e+02 2.614e+02 3.523e+02, threshold=4.524e+02, percent-clipped=0.0 2023-09-29 17:49:44,648 INFO [train.py:1039] (2/4) Epoch 13, batch 2600, loss[loss=0.2501, simple_loss=0.3006, pruned_loss=0.09977, over 19475.00 frames. ], tot_loss[loss=0.1928, simple_loss=0.2648, pruned_loss=0.06039, over 4710612.54 frames. ], batch size: 388, lr: 7.98e-03, grad_scale: 32.0 2023-09-29 17:49:46,258 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 17:49:49,401 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 17:49:50,802 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:49:50,854 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 17:49:50,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 17:49:51,006 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 17:49:54,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:49:54,969 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=442300.0, ans=0.125 2023-09-29 17:49:56,091 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 17:49:56,268 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 17:49:57,730 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 17:49:59,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:50:02,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 17:50:02,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 17:50:05,370 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 17:50:05,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 17:50:08,493 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 17:50:08,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 17:50:17,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:50:17,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:50:19,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:50:19,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 17:50:19,803 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=442433.3333333333, ans=0.0 2023-09-29 17:50:21,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:50:22,898 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=442433.3333333333, ans=0.1 2023-09-29 17:50:27,090 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 17:50:27,446 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=442433.3333333333, ans=0.125 2023-09-29 17:50:32,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:50:33,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:50:35,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 17:50:36,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:50:36,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:50:36,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 17:50:38,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 17:50:38,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:50:41,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:50:45,851 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 17:50:45,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:50:45,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:50:51,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:50:52,020 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.68 vs. limit=15.0 2023-09-29 17:50:52,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:50:52,798 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 17:50:54,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:50:56,258 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:50:56,905 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.26 vs. limit=15.0 2023-09-29 17:50:57,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:51:02,230 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=8.17 vs. limit=15.0 2023-09-29 17:51:03,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 17:51:04,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:51:07,438 INFO [train.py:1039] (2/4) Epoch 13, batch 2650, loss[loss=0.2183, simple_loss=0.2801, pruned_loss=0.07829, over 22746.00 frames. ], tot_loss[loss=0.1929, simple_loss=0.2653, pruned_loss=0.0603, over 4720805.36 frames. ], batch size: 322, lr: 7.98e-03, grad_scale: 16.0 2023-09-29 17:51:07,525 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 17:51:10,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 17:51:10,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:51:12,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 17:51:12,552 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 17:51:12,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:51:15,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:51:19,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 17:51:20,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:51:24,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:51:25,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 17:51:25,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 17:51:25,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:51:27,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 17:51:29,825 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 17:51:32,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:51:35,835 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 17:51:35,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:51:37,332 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 17:51:40,038 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=442766.6666666667, ans=0.07 2023-09-29 17:51:41,465 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=442766.6666666667, ans=0.1 2023-09-29 17:51:42,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:51:42,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 17:51:42,705 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:51:42,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:51:47,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 17:51:47,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 17:51:50,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 17:51:55,059 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 17:51:55,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:51:55,215 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:51:57,288 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:51:57,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:51:58,219 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=442833.3333333333, ans=0.0 2023-09-29 17:51:59,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:51:59,973 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.56 vs. limit=15.0 2023-09-29 17:52:00,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:52:03,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:52:04,014 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:52:06,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 17:52:07,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:52:07,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:52:09,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:52:11,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:52:11,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:52:12,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 17:52:16,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:52:16,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:52:16,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:52:18,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 17:52:19,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:52:22,865 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:52:24,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:52:24,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:52:25,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:52:26,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:52:29,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:52:29,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 17:52:31,154 INFO [train.py:1039] (2/4) Epoch 13, batch 2700, loss[loss=0.2054, simple_loss=0.2708, pruned_loss=0.07, over 23668.00 frames. ], tot_loss[loss=0.1939, simple_loss=0.2659, pruned_loss=0.06095, over 4718268.55 frames. ], batch size: 149, lr: 7.97e-03, grad_scale: 16.0 2023-09-29 17:52:32,536 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.954e+02 2.253e+02 2.566e+02 4.959e+02, threshold=4.505e+02, percent-clipped=1.0 2023-09-29 17:52:32,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:52:36,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 17:52:36,321 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=442966.6666666667, ans=0.125 2023-09-29 17:52:38,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:52:38,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:52:39,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:52:39,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:52:39,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:52:39,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:52:39,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 17:52:40,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 17:52:41,471 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:52:41,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 17:52:43,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 17:52:44,753 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:52:48,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 17:52:50,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 17:52:50,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:52:56,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:52:56,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:53:01,231 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:53:01,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:53:01,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:53:01,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:53:03,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:53:07,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:53:07,422 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 17:53:09,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:53:12,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:53:12,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:53:23,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:53:23,940 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:53:24,284 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=443166.6666666667, ans=0.1 2023-09-29 17:53:25,852 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=443166.6666666667, ans=0.125 2023-09-29 17:53:27,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:53:27,210 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:53:31,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:53:33,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:53:33,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:53:34,779 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:53:36,291 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:53:36,510 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=443233.3333333333, ans=0.1 2023-09-29 17:53:37,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:53:40,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:53:42,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:53:42,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:53:46,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 17:53:46,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:53:48,266 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:53:48,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 17:53:49,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 17:53:51,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:53:53,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:53:53,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:53:54,533 INFO [train.py:1039] (2/4) Epoch 13, batch 2750, loss[loss=0.2147, simple_loss=0.2592, pruned_loss=0.08511, over 19705.00 frames. ], tot_loss[loss=0.1938, simple_loss=0.2653, pruned_loss=0.06118, over 4711822.03 frames. ], batch size: 388, lr: 7.97e-03, grad_scale: 16.0 2023-09-29 17:53:57,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:53:57,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 17:53:57,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:54:01,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:54:01,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 17:54:01,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:54:01,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:54:01,343 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 17:54:01,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:54:02,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:54:02,988 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=443300.0, ans=0.125 2023-09-29 17:54:03,614 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.85 vs. limit=15.0 2023-09-29 17:54:08,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 17:54:08,973 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=443366.6666666667, ans=0.125 2023-09-29 17:54:10,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:54:12,491 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:54:13,947 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:54:14,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 17:54:15,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:54:16,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:54:17,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:54:17,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=443366.6666666667, ans=0.0 2023-09-29 17:54:18,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:54:21,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 17:54:21,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 17:54:23,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:54:23,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:54:26,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 17:54:35,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:54:37,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 17:54:37,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:54:38,190 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.71 vs. limit=15.0 2023-09-29 17:54:39,244 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=443433.3333333333, ans=0.2 2023-09-29 17:54:42,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:54:42,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 17:54:43,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:54:49,329 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=443500.0, ans=0.125 2023-09-29 17:54:51,178 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:54:51,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:54:51,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 17:54:51,537 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=443500.0, ans=0.09899494936611666 2023-09-29 17:54:57,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:54:58,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 17:55:00,770 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=443566.6666666667, ans=0.2 2023-09-29 17:55:04,708 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 17:55:06,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:55:06,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 17:55:07,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:55:09,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:55:09,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 17:55:10,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:55:14,430 INFO [train.py:1039] (2/4) Epoch 13, batch 2800, loss[loss=0.2145, simple_loss=0.2722, pruned_loss=0.07838, over 23821.00 frames. ], tot_loss[loss=0.1932, simple_loss=0.2641, pruned_loss=0.06116, over 4700981.32 frames. ], batch size: 195, lr: 7.97e-03, grad_scale: 32.0 2023-09-29 17:55:14,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 17:55:15,771 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.948e+02 2.222e+02 2.625e+02 4.530e+02, threshold=4.443e+02, percent-clipped=1.0 2023-09-29 17:55:15,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:55:15,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:55:17,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 17:55:17,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:55:17,532 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:55:19,899 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=443633.3333333333, ans=0.1 2023-09-29 17:55:21,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:55:21,115 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 17:55:21,116 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 17:55:23,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:55:26,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 17:55:26,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:55:30,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:55:33,346 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 17:55:35,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 17:55:36,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 17:55:36,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:55:38,158 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:55:38,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:55:43,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:55:43,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:55:43,263 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 17:55:44,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:55:52,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:55:56,585 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:55:58,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:55:59,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:55:59,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:56:05,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:56:06,347 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 17:56:06,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:56:07,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:56:07,980 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:56:10,105 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=443833.3333333333, ans=0.0 2023-09-29 17:56:11,341 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:56:11,646 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=443833.3333333333, ans=0.0 2023-09-29 17:56:12,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:56:17,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:56:18,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:56:18,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:56:18,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 17:56:20,917 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 17:56:21,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 17:56:21,276 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=443900.0, ans=0.2 2023-09-29 17:56:22,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:56:22,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 17:56:22,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:56:23,526 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.86 vs. limit=22.5 2023-09-29 17:56:24,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:56:24,147 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:56:25,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 17:56:27,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:56:27,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:56:27,962 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=443900.0, ans=0.0 2023-09-29 17:56:29,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:56:30,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 17:56:37,555 INFO [train.py:1039] (2/4) Epoch 13, batch 2850, loss[loss=0.1795, simple_loss=0.2502, pruned_loss=0.05438, over 24297.00 frames. ], tot_loss[loss=0.1925, simple_loss=0.2643, pruned_loss=0.06033, over 4712887.65 frames. ], batch size: 56, lr: 7.97e-03, grad_scale: 16.0 2023-09-29 17:56:37,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:56:37,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 17:56:39,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:56:41,047 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:56:44,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:56:45,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:56:46,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:56:50,500 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:56:50,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:56:52,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 17:56:52,319 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 17:56:57,696 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=444033.3333333333, ans=0.2 2023-09-29 17:57:00,338 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 17:57:00,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:57:02,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 17:57:02,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:57:04,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 17:57:05,726 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 17:57:07,836 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:57:20,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:57:22,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:57:22,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:57:23,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 17:57:23,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 17:57:23,813 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:57:25,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 17:57:25,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 17:57:27,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 17:57:27,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:57:27,458 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=444166.6666666667, ans=0.125 2023-09-29 17:57:28,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:57:30,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:57:33,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:57:33,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:57:37,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:57:38,804 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:57:40,304 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:57:41,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:57:42,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:57:44,398 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=444233.3333333333, ans=0.0 2023-09-29 17:57:45,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:57:47,557 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=444233.3333333333, ans=0.125 2023-09-29 17:57:48,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:57:49,130 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=444233.3333333333, ans=0.125 2023-09-29 17:57:50,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 17:57:50,401 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 17:57:52,088 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 17:57:53,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:57:53,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 17:57:55,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:57:55,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:57:55,146 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:57:55,177 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:57:55,178 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 17:57:56,637 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 17:57:56,642 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:57:56,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:57:59,674 INFO [train.py:1039] (2/4) Epoch 13, batch 2900, loss[loss=0.2092, simple_loss=0.2804, pruned_loss=0.06897, over 24090.00 frames. ], tot_loss[loss=0.1917, simple_loss=0.2639, pruned_loss=0.05971, over 4718927.61 frames. ], batch size: 86, lr: 7.96e-03, grad_scale: 16.0 2023-09-29 17:58:01,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 17:58:02,700 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.932e+02 2.253e+02 2.547e+02 3.848e+02, threshold=4.506e+02, percent-clipped=0.0 2023-09-29 17:58:02,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:58:02,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:58:04,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 17:58:09,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:58:10,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 17:58:11,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 17:58:11,868 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 17:58:13,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:58:13,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 17:58:16,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:58:16,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:58:20,133 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:58:21,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:58:24,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 17:58:24,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 17:58:24,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 17:58:26,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:58:29,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 17:58:30,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 17:58:33,937 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:58:33,941 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 17:58:33,991 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:58:37,028 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:58:37,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 17:58:37,183 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=444433.3333333333, ans=0.1 2023-09-29 17:58:42,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:58:42,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:58:43,148 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=444433.3333333333, ans=15.0 2023-09-29 17:58:44,249 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=444433.3333333333, ans=0.1 2023-09-29 17:58:45,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:58:49,518 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:58:49,825 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=444500.0, ans=0.125 2023-09-29 17:58:53,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 17:58:53,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 17:58:53,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:58:56,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:58:58,371 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=444500.0, ans=0.0 2023-09-29 17:58:59,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 17:58:59,769 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:59:00,315 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.19 vs. limit=12.0 2023-09-29 17:59:04,419 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=444566.6666666667, ans=0.0 2023-09-29 17:59:05,628 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:59:13,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:59:13,985 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 17:59:15,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 17:59:18,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:59:18,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 17:59:20,033 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:59:22,146 INFO [train.py:1039] (2/4) Epoch 13, batch 2950, loss[loss=0.1918, simple_loss=0.2721, pruned_loss=0.05575, over 24520.00 frames. ], tot_loss[loss=0.1931, simple_loss=0.2656, pruned_loss=0.06029, over 4719837.11 frames. ], batch size: 66, lr: 7.96e-03, grad_scale: 16.0 2023-09-29 17:59:22,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:59:29,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:59:30,914 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 17:59:31,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:59:31,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:59:33,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:59:34,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:59:35,742 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 17:59:37,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 17:59:37,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 17:59:37,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:59:41,482 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.17 vs. limit=15.0 2023-09-29 17:59:42,212 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:59:43,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:59:45,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:59:47,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:59:49,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:59:49,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:59:50,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:59:52,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:59:52,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:59:54,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 18:00:01,493 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 18:00:01,533 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 18:00:02,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:00:05,783 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 18:00:05,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 18:00:07,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:00:07,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:00:07,576 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 18:00:07,583 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 18:00:10,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 18:00:12,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:00:12,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:00:16,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:00:18,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:00:18,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:00:19,620 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 18:00:19,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:00:19,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 18:00:26,442 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:00:27,075 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1.whitening_limit, batch_count=444900.0, ans=10.0 2023-09-29 18:00:27,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 18:00:28,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 18:00:29,437 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:00:31,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 18:00:34,613 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=444900.0, ans=0.1 2023-09-29 18:00:35,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:00:36,137 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=444900.0, ans=0.125 2023-09-29 18:00:37,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:00:38,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 18:00:38,963 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:00:38,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 18:00:40,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:00:40,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:00:40,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 18:00:42,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:00:42,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:00:43,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:00:45,219 INFO [train.py:1039] (2/4) Epoch 13, batch 3000, loss[loss=0.1899, simple_loss=0.2704, pruned_loss=0.05472, over 23437.00 frames. ], tot_loss[loss=0.1935, simple_loss=0.266, pruned_loss=0.06048, over 4711996.05 frames. ], batch size: 93, lr: 7.96e-03, grad_scale: 16.0 2023-09-29 18:00:45,220 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-29 18:00:59,701 INFO [zipformer.py:1853] (2/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.0008, 4.8541, 4.6617, 4.3163], device='cuda:2') 2023-09-29 18:01:00,595 INFO [train.py:1071] (2/4) Epoch 13, validation: loss=0.3476, simple_loss=0.2869, pruned_loss=0.2041, over 1125622.00 frames. 2023-09-29 18:01:00,596 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21046MB 2023-09-29 18:01:00,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:01:00,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 18:01:02,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:01:04,369 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.477e+02 1.886e+02 2.154e+02 2.482e+02 3.380e+02, threshold=4.309e+02, percent-clipped=0.0 2023-09-29 18:01:04,595 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:01:06,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 18:01:09,649 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 18:01:09,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 18:01:11,336 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 18:01:11,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:01:12,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 18:01:14,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:01:21,707 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 18:01:27,153 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.27 vs. limit=22.5 2023-09-29 18:01:30,896 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:01:40,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 18:01:41,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:01:44,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:01:45,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:01:45,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:01:47,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:01:47,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 18:01:49,112 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 18:01:50,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:01:50,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 18:01:53,001 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.69 vs. limit=15.0 2023-09-29 18:01:53,676 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:01:53,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:01:55,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:01:55,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:01:56,795 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=445166.6666666667, ans=0.125 2023-09-29 18:01:59,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:02:01,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:02:01,076 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:02:02,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:02:04,297 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 18:02:05,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:02:05,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:02:07,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:02:09,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:02:09,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:02:11,165 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 18:02:11,212 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 18:02:12,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:02:12,690 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 18:02:12,751 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:02:16,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 18:02:16,667 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=445233.3333333333, ans=0.0 2023-09-29 18:02:19,478 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 18:02:20,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 18:02:21,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 18:02:21,565 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.10 vs. limit=15.0 2023-09-29 18:02:22,402 INFO [train.py:1039] (2/4) Epoch 13, batch 3050, loss[loss=0.2494, simple_loss=0.303, pruned_loss=0.09787, over 19623.00 frames. ], tot_loss[loss=0.1959, simple_loss=0.2677, pruned_loss=0.06203, over 4689272.41 frames. ], batch size: 388, lr: 7.95e-03, grad_scale: 16.0 2023-09-29 18:02:23,927 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 18:02:23,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 18:02:24,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:02:25,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:02:25,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 18:02:26,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:02:27,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:02:28,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 18:02:30,699 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:02:33,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:02:33,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 18:02:38,112 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:02:39,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 18:02:45,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 18:02:45,812 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 18:02:47,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:02:52,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:02:54,048 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=445433.3333333333, ans=0.0 2023-09-29 18:02:55,251 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:02:55,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:02:56,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:02:58,519 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=445433.3333333333, ans=0.125 2023-09-29 18:02:59,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:03:01,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 18:03:01,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:03:02,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:03:02,837 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:03:02,972 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:03:05,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:03:09,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:03:09,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 18:03:09,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:03:09,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:03:09,555 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=445500.0, ans=0.2 2023-09-29 18:03:12,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:03:13,947 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:03:14,041 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:03:14,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:03:19,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:03:20,693 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:03:26,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:03:28,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:03:28,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:03:28,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:03:29,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 18:03:31,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:03:31,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 18:03:32,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:03:34,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:03:34,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 18:03:36,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:03:42,181 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:03:43,544 INFO [train.py:1039] (2/4) Epoch 13, batch 3100, loss[loss=0.1927, simple_loss=0.2675, pruned_loss=0.05898, over 23333.00 frames. ], tot_loss[loss=0.196, simple_loss=0.2679, pruned_loss=0.06207, over 4689332.73 frames. ], batch size: 119, lr: 7.95e-03, grad_scale: 16.0 2023-09-29 18:03:43,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:03:45,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 18:03:46,663 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.826e+02 2.024e+02 2.314e+02 3.606e+02, threshold=4.048e+02, percent-clipped=0.0 2023-09-29 18:03:47,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 18:03:50,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 18:03:51,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 18:03:52,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:03:56,373 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:03:57,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:03:59,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 18:04:02,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:04:02,944 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=445700.0, ans=0.1 2023-09-29 18:04:07,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 18:04:07,663 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=445700.0, ans=0.125 2023-09-29 18:04:09,212 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=445700.0, ans=0.125 2023-09-29 18:04:12,146 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=445700.0, ans=0.125 2023-09-29 18:04:13,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 18:04:13,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:04:14,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:04:14,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:04:15,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 18:04:15,420 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=445766.6666666667, ans=0.125 2023-09-29 18:04:16,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:04:16,673 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 18:04:16,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:04:20,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:04:20,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 18:04:21,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:04:24,220 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=445766.6666666667, ans=0.2 2023-09-29 18:04:24,297 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=445766.6666666667, ans=0.09899494936611666 2023-09-29 18:04:25,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 18:04:27,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 18:04:29,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 18:04:29,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:04:30,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:04:31,156 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=445766.6666666667, ans=0.0 2023-09-29 18:04:32,668 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=445833.3333333333, ans=0.2 2023-09-29 18:04:33,761 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:04:33,784 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:04:33,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:04:35,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:04:35,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:04:37,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 18:04:37,164 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:04:37,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:04:37,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 18:04:41,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:04:41,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 18:04:43,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 18:04:45,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 18:04:45,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:04:45,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:04:46,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 18:04:59,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 18:05:00,179 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:05:02,093 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=445900.0, ans=0.0 2023-09-29 18:05:03,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:05:04,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:05:06,264 INFO [train.py:1039] (2/4) Epoch 13, batch 3150, loss[loss=0.2014, simple_loss=0.2752, pruned_loss=0.06378, over 23748.00 frames. ], tot_loss[loss=0.1944, simple_loss=0.2657, pruned_loss=0.06155, over 4673860.84 frames. ], batch size: 85, lr: 7.95e-03, grad_scale: 16.0 2023-09-29 18:05:06,457 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:05:06,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:05:08,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 18:05:09,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:05:09,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 18:05:11,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 18:05:14,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:05:15,726 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 18:05:18,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 18:05:18,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:05:20,459 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 18:05:23,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 18:05:24,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 18:05:25,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 18:05:25,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 18:05:25,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:05:25,122 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:05:27,161 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:05:28,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 18:05:30,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:05:30,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:05:30,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:05:34,436 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 18:05:38,649 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=446100.0, ans=0.125 2023-09-29 18:05:39,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 18:05:39,773 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:05:41,968 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 18:05:43,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:05:43,677 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=446100.0, ans=0.0 2023-09-29 18:05:44,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 18:05:47,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 18:05:48,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 18:05:49,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 18:05:49,463 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 18:05:49,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:05:49,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 18:05:51,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 18:05:51,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 18:05:51,425 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=446100.0, ans=0.1 2023-09-29 18:05:52,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 18:05:52,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:05:52,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:05:55,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:05:55,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:05:57,080 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 18:05:58,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:06:00,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 18:06:00,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:06:02,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 18:06:02,962 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten.whitening_limit, batch_count=446166.6666666667, ans=22.5 2023-09-29 18:06:03,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 18:06:05,543 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:06:05,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:06:05,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 18:06:06,356 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.54 vs. limit=15.0 2023-09-29 18:06:07,675 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 18:06:09,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:06:12,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:06:14,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:06:14,401 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:06:14,848 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:06:18,286 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:06:19,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:06:19,911 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=446233.3333333333, ans=0.125 2023-09-29 18:06:21,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 18:06:24,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:06:24,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 18:06:24,864 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=446233.3333333333, ans=0.0 2023-09-29 18:06:28,020 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=446300.0, ans=0.125 2023-09-29 18:06:29,119 INFO [train.py:1039] (2/4) Epoch 13, batch 3200, loss[loss=0.1904, simple_loss=0.2689, pruned_loss=0.05601, over 24452.00 frames. ], tot_loss[loss=0.1935, simple_loss=0.2649, pruned_loss=0.06101, over 4681036.19 frames. ], batch size: 63, lr: 7.95e-03, grad_scale: 32.0 2023-09-29 18:06:29,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:06:30,923 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:06:30,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 18:06:32,602 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.906e+02 2.221e+02 2.638e+02 3.823e+02, threshold=4.442e+02, percent-clipped=0.0 2023-09-29 18:06:34,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:06:36,587 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten.whitening_limit, batch_count=446300.0, ans=15.0 2023-09-29 18:06:39,602 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:06:44,599 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:06:51,818 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=5.84 vs. limit=12.0 2023-09-29 18:06:55,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:07:05,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 18:07:05,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:07:09,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 18:07:10,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 18:07:14,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:07:14,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 18:07:15,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:07:20,711 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 18:07:20,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 18:07:23,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 18:07:26,141 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 18:07:29,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:07:36,762 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:07:36,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 18:07:36,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:07:38,336 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 18:07:38,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 18:07:41,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:07:43,062 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 18:07:43,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 18:07:43,278 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=446566.6666666667, ans=0.125 2023-09-29 18:07:45,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 18:07:47,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 18:07:49,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:07:52,558 INFO [train.py:1039] (2/4) Epoch 13, batch 3250, loss[loss=0.2117, simple_loss=0.2678, pruned_loss=0.07782, over 23769.00 frames. ], tot_loss[loss=0.1929, simple_loss=0.2649, pruned_loss=0.06043, over 4701106.51 frames. ], batch size: 164, lr: 7.94e-03, grad_scale: 16.0 2023-09-29 18:07:52,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 18:07:52,684 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 18:07:52,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:07:52,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:07:54,233 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 18:07:58,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:07:59,200 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=446633.3333333333, ans=0.125 2023-09-29 18:08:02,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:08:09,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:08:09,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 18:08:10,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:08:12,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:08:12,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:08:13,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 18:08:13,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:08:14,571 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=6.64 vs. limit=12.0 2023-09-29 18:08:17,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:08:17,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 18:08:17,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:08:17,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:08:17,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:08:19,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:08:22,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:08:24,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 18:08:26,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:08:26,936 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:08:28,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:08:28,536 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:08:28,553 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:08:33,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 18:08:34,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:08:34,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:08:36,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:08:36,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:08:38,666 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=446766.6666666667, ans=0.1 2023-09-29 18:08:38,701 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=446766.6666666667, ans=0.0 2023-09-29 18:08:44,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:08:53,293 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:08:54,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:08:54,706 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 18:08:54,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:08:54,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 18:08:54,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:08:59,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 18:08:59,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 18:09:00,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:09:02,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:09:02,226 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=446900.0, ans=0.125 2023-09-29 18:09:03,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:09:03,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 18:09:03,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:09:06,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:09:06,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:09:08,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 18:09:08,433 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:09:11,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 18:09:11,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 18:09:14,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:09:14,559 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 18:09:16,070 INFO [train.py:1039] (2/4) Epoch 13, batch 3300, loss[loss=0.2104, simple_loss=0.2701, pruned_loss=0.07535, over 22703.00 frames. ], tot_loss[loss=0.1939, simple_loss=0.2655, pruned_loss=0.06112, over 4688209.61 frames. ], batch size: 322, lr: 7.94e-03, grad_scale: 16.0 2023-09-29 18:09:16,152 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 18:09:18,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 18:09:18,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:09:21,246 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.922e+02 2.153e+02 2.771e+02 4.428e+02, threshold=4.306e+02, percent-clipped=0.0 2023-09-29 18:09:22,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:09:24,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:09:24,364 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:09:26,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 18:09:27,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 18:09:29,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:09:31,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:09:35,677 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 18:09:37,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:09:37,237 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:09:37,958 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.94 vs. limit=10.0 2023-09-29 18:09:40,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:09:40,292 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 18:09:41,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:09:43,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 18:09:43,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 18:09:43,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:09:43,312 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 18:09:47,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:09:49,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 18:09:52,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:09:52,121 WARNING [train.py:1197] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 18:09:54,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 18:09:54,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:09:55,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:09:57,501 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 18:09:57,932 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=447100.0, ans=0.125 2023-09-29 18:09:59,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 18:09:59,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:09:59,340 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=447100.0, ans=0.125 2023-09-29 18:10:02,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 18:10:05,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:10:08,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 18:10:09,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:10:11,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:10:12,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:10:12,885 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:10:12,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 18:10:15,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:10:16,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:10:17,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 18:10:18,960 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 18:10:20,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 18:10:22,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 18:10:23,047 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.59 vs. limit=6.0 2023-09-29 18:10:23,560 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:10:23,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:10:25,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:10:25,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:10:25,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 18:10:25,832 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=447233.3333333333, ans=0.07 2023-09-29 18:10:27,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:10:27,056 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 18:10:28,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:10:31,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 18:10:34,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 18:10:34,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:10:36,356 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:10:36,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:10:36,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:10:38,576 INFO [train.py:1039] (2/4) Epoch 13, batch 3350, loss[loss=0.1841, simple_loss=0.2573, pruned_loss=0.05542, over 24558.00 frames. ], tot_loss[loss=0.1946, simple_loss=0.2665, pruned_loss=0.06137, over 4691978.21 frames. ], batch size: 60, lr: 7.94e-03, grad_scale: 16.0 2023-09-29 18:10:38,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:10:41,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:10:41,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:10:46,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:10:50,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:10:51,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:10:53,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:10:55,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:10:55,202 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=447366.6666666667, ans=0.95 2023-09-29 18:10:56,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:10:58,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:10:59,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 18:11:01,135 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 18:11:02,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:11:03,304 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.79 vs. limit=15.0 2023-09-29 18:11:04,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 18:11:04,357 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 18:11:06,391 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:11:06,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:11:07,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:11:08,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 18:11:08,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:11:08,420 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=447366.6666666667, ans=0.0 2023-09-29 18:11:09,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:11:11,026 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:11:12,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:11:14,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:11:14,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:11:15,467 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.22 vs. limit=22.5 2023-09-29 18:11:18,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:11:21,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:11:21,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:11:21,765 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=447433.3333333333, ans=0.125 2023-09-29 18:11:26,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:11:28,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:11:29,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:11:29,658 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:11:31,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:11:33,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 18:11:33,068 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 18:11:33,119 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 18:11:34,582 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:11:34,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 18:11:36,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:11:37,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:11:44,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:11:46,001 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 18:11:46,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 18:11:48,935 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:11:50,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:11:57,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:11:58,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 18:12:00,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 18:12:01,650 INFO [train.py:1039] (2/4) Epoch 13, batch 3400, loss[loss=0.1766, simple_loss=0.2557, pruned_loss=0.04869, over 24410.00 frames. ], tot_loss[loss=0.1949, simple_loss=0.2672, pruned_loss=0.0613, over 4700571.59 frames. ], batch size: 63, lr: 7.93e-03, grad_scale: 16.0 2023-09-29 18:12:01,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:12:03,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:12:03,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 18:12:03,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:12:03,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 18:12:06,368 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.641e+02 1.928e+02 2.132e+02 2.448e+02 3.305e+02, threshold=4.265e+02, percent-clipped=0.0 2023-09-29 18:12:06,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:12:06,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:12:08,011 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 18:12:08,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:12:08,199 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 18:12:13,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 18:12:13,260 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 18:12:13,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:12:18,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:12:18,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 18:12:20,025 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:12:20,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 18:12:25,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:12:28,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 18:12:30,924 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=447700.0, ans=0.125 2023-09-29 18:12:32,881 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=447700.0, ans=0.09899494936611666 2023-09-29 18:12:34,208 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:12:37,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:12:37,255 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:12:38,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 18:12:43,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:12:49,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 18:12:54,240 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:12:54,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:12:56,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 18:12:56,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:12:56,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:12:58,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:12:59,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 18:13:02,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:13:05,152 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.79 vs. limit=22.5 2023-09-29 18:13:06,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 18:13:06,561 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:13:11,326 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:13:14,288 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 18:13:19,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 18:13:23,692 INFO [train.py:1039] (2/4) Epoch 13, batch 3450, loss[loss=0.1822, simple_loss=0.2662, pruned_loss=0.04911, over 24658.00 frames. ], tot_loss[loss=0.1961, simple_loss=0.2687, pruned_loss=0.06174, over 4700634.42 frames. ], batch size: 68, lr: 7.93e-03, grad_scale: 16.0 2023-09-29 18:13:23,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 18:13:28,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 18:13:28,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:13:29,673 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.56 vs. limit=22.5 2023-09-29 18:13:30,470 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:13:30,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 18:13:32,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:13:37,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 18:13:40,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:13:41,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:13:43,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:13:43,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:13:45,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:13:52,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 18:13:58,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 18:13:58,862 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 18:14:00,231 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:14:00,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:14:09,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 18:14:09,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 18:14:12,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:14:12,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:14:15,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 18:14:16,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:14:17,015 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=448166.6666666667, ans=0.125 2023-09-29 18:14:19,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 18:14:19,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:14:19,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:14:22,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:14:25,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 18:14:27,787 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=448166.6666666667, ans=0.1 2023-09-29 18:14:28,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:14:32,261 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=448233.3333333333, ans=0.125 2023-09-29 18:14:33,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:14:33,965 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.45 vs. limit=15.0 2023-09-29 18:14:34,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:14:37,355 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:14:37,696 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=448233.3333333333, ans=0.0 2023-09-29 18:14:42,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:14:42,950 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:14:43,275 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=448233.3333333333, ans=0.2 2023-09-29 18:14:44,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:14:44,522 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:14:48,036 INFO [train.py:1039] (2/4) Epoch 13, batch 3500, loss[loss=0.1821, simple_loss=0.2461, pruned_loss=0.05905, over 23964.00 frames. ], tot_loss[loss=0.1945, simple_loss=0.2664, pruned_loss=0.06129, over 4697807.05 frames. ], batch size: 196, lr: 7.93e-03, grad_scale: 16.0 2023-09-29 18:14:49,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:14:52,151 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=15.24 vs. limit=15.0 2023-09-29 18:14:52,616 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.904e+02 2.170e+02 2.519e+02 3.488e+02, threshold=4.340e+02, percent-clipped=0.0 2023-09-29 18:14:52,806 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:14:54,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 18:14:55,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 18:14:59,011 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 18:15:02,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:15:02,036 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 18:15:08,145 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:15:09,701 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:15:10,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:15:10,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:15:10,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 18:15:11,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:15:12,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:15:12,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 18:15:15,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:15:15,865 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 18:15:17,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:15:20,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:15:22,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 18:15:22,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:15:25,776 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:15:27,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:15:27,667 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:15:27,731 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=448433.3333333333, ans=0.125 2023-09-29 18:15:28,795 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:15:30,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:15:31,709 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:15:33,238 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 18:15:33,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 18:15:34,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 18:15:34,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:15:37,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:15:37,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:15:37,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 18:15:41,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 18:15:41,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:15:47,154 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:15:49,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 18:15:49,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 18:15:49,112 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:15:52,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:15:52,828 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:15:54,288 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:15:57,322 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 18:15:57,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:16:00,346 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:16:01,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 18:16:03,691 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 18:16:05,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:16:06,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:16:06,897 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:16:06,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:16:09,893 INFO [train.py:1039] (2/4) Epoch 13, batch 3550, loss[loss=0.1854, simple_loss=0.2469, pruned_loss=0.06197, over 23608.00 frames. ], tot_loss[loss=0.1931, simple_loss=0.2644, pruned_loss=0.06086, over 4691823.61 frames. ], batch size: 256, lr: 7.92e-03, grad_scale: 16.0 2023-09-29 18:16:10,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:16:20,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:16:22,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 18:16:23,197 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.79 vs. limit=15.0 2023-09-29 18:16:26,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:16:27,930 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:16:29,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:16:29,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=448700.0, ans=0.125 2023-09-29 18:16:30,334 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.55 vs. limit=15.0 2023-09-29 18:16:31,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:16:31,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:16:35,524 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:16:35,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:16:35,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:16:35,680 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 18:16:37,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 18:16:37,587 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=448700.0, ans=0.2 2023-09-29 18:16:40,740 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=448700.0, ans=0.1 2023-09-29 18:16:43,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:16:43,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:16:45,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:16:45,187 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:16:45,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:16:46,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 18:16:46,684 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:16:49,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:16:51,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 18:16:57,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:16:58,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:17:00,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:17:02,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 18:17:02,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:17:02,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 18:17:03,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:17:05,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 18:17:05,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:17:07,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=448833.3333333333, ans=0.125 2023-09-29 18:17:08,711 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 18:17:10,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:17:16,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:17:16,667 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 18:17:18,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:17:18,706 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.24 vs. limit=15.0 2023-09-29 18:17:21,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:17:24,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 18:17:26,011 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=448900.0, ans=0.125 2023-09-29 18:17:30,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 18:17:30,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:17:31,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:17:33,852 INFO [train.py:1039] (2/4) Epoch 13, batch 3600, loss[loss=0.1749, simple_loss=0.2434, pruned_loss=0.05322, over 24291.00 frames. ], tot_loss[loss=0.1929, simple_loss=0.2646, pruned_loss=0.06062, over 4695799.52 frames. ], batch size: 61, lr: 7.92e-03, grad_scale: 32.0 2023-09-29 18:17:35,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:17:37,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:17:37,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:17:39,127 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 1.817e+02 2.056e+02 2.414e+02 4.361e+02, threshold=4.112e+02, percent-clipped=1.0 2023-09-29 18:17:40,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:17:43,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:17:44,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:17:45,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:17:45,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:17:45,599 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 18:17:48,703 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 18:17:48,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:17:51,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:17:55,124 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:17:56,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:17:56,640 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:17:58,708 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 18:17:58,816 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:18:01,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:18:01,956 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:18:03,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:18:03,897 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=449033.3333333333, ans=0.0 2023-09-29 18:18:07,175 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:18:07,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:18:07,558 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=449100.0, ans=0.125 2023-09-29 18:18:08,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 18:18:14,568 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.23 vs. limit=12.0 2023-09-29 18:18:16,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:18:18,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 18:18:18,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 18:18:23,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:18:25,553 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.70 vs. limit=15.0 2023-09-29 18:18:28,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:18:30,584 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=449166.6666666667, ans=0.125 2023-09-29 18:18:30,631 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=449166.6666666667, ans=0.2 2023-09-29 18:18:31,691 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:18:37,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 18:18:37,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:18:37,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 18:18:38,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 18:18:40,482 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 18:18:42,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:18:44,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:18:45,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 18:18:45,960 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:18:47,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:18:47,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:18:47,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 18:18:49,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 18:18:52,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:18:53,706 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 18:18:56,386 INFO [train.py:1039] (2/4) Epoch 13, batch 3650, loss[loss=0.2186, simple_loss=0.2804, pruned_loss=0.07838, over 23888.00 frames. ], tot_loss[loss=0.194, simple_loss=0.2655, pruned_loss=0.06125, over 4693052.26 frames. ], batch size: 196, lr: 7.92e-03, grad_scale: 32.0 2023-09-29 18:18:58,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 18:18:59,715 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:19:03,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 18:19:05,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 18:19:11,912 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:19:11,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 18:19:13,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:19:15,332 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=449366.6666666667, ans=0.125 2023-09-29 18:19:16,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 18:19:16,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:19:16,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 18:19:18,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:19:18,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:19:20,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 18:19:20,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 18:19:20,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:19:22,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:19:24,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:19:28,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 18:19:28,484 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 18:19:29,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:19:31,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 18:19:33,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:19:33,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:19:34,891 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=449433.3333333333, ans=0.0 2023-09-29 18:19:39,060 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=3.41 vs. limit=12.0 2023-09-29 18:19:39,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 18:19:41,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:19:41,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 18:19:43,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:19:43,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:19:45,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:19:48,452 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:19:49,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:19:51,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:19:53,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 18:19:53,732 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=449500.0, ans=0.0 2023-09-29 18:19:54,962 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:19:56,455 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:20:02,029 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 18:20:05,031 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:20:05,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:20:06,639 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 18:20:06,731 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:20:08,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 18:20:09,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:20:11,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 18:20:11,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:20:15,318 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 18:20:16,963 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:20:17,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:20:20,446 INFO [train.py:1039] (2/4) Epoch 13, batch 3700, loss[loss=0.2036, simple_loss=0.2846, pruned_loss=0.06128, over 24462.00 frames. ], tot_loss[loss=0.1944, simple_loss=0.2665, pruned_loss=0.06119, over 4700298.00 frames. ], batch size: 69, lr: 7.92e-03, grad_scale: 32.0 2023-09-29 18:20:20,606 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:20:20,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 18:20:20,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:20:22,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 18:20:22,128 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 18:20:25,573 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.493e+02 1.943e+02 2.154e+02 2.473e+02 4.046e+02, threshold=4.307e+02, percent-clipped=0.0 2023-09-29 18:20:25,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 18:20:30,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:20:32,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:20:33,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:20:33,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:20:34,303 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.85 vs. limit=15.0 2023-09-29 18:20:35,180 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 18:20:38,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:20:39,779 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 18:20:49,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:20:49,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 18:20:50,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 18:20:50,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 18:20:51,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:20:54,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:20:56,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 18:20:57,820 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:20:58,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:20:58,526 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.25 vs. limit=15.0 2023-09-29 18:21:01,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:21:02,227 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.96 vs. limit=15.0 2023-09-29 18:21:02,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 18:21:04,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 18:21:08,430 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:21:08,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 18:21:09,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:21:10,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 18:21:13,045 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=449833.3333333333, ans=0.05 2023-09-29 18:21:16,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:21:17,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:21:17,874 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=449833.3333333333, ans=0.0 2023-09-29 18:21:20,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:21:20,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 18:21:23,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:21:23,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 18:21:23,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:21:23,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:21:24,313 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=449900.0, ans=0.125 2023-09-29 18:21:27,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:21:29,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 18:21:31,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 18:21:31,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:21:31,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:21:31,356 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=449900.0, ans=0.0 2023-09-29 18:21:32,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:21:34,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:21:37,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:21:39,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:21:40,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:21:42,135 INFO [train.py:1039] (2/4) Epoch 13, batch 3750, loss[loss=0.1937, simple_loss=0.2686, pruned_loss=0.05939, over 23366.00 frames. ], tot_loss[loss=0.1943, simple_loss=0.2668, pruned_loss=0.06089, over 4713967.73 frames. ], batch size: 106, lr: 7.91e-03, grad_scale: 32.0 2023-09-29 18:21:42,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 18:21:44,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 18:21:45,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 18:21:47,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 18:21:47,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:21:49,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:21:50,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:21:52,233 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=449966.6666666667, ans=0.2 2023-09-29 18:21:53,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:21:57,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:22:01,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:22:01,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 18:22:04,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:22:08,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:22:08,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 18:22:10,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:22:12,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:22:12,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:22:15,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 18:22:18,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 18:22:20,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:22:21,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:22:23,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:22:29,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:22:31,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 18:22:34,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 18:22:39,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:22:43,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:22:44,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:22:47,840 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:22:51,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 18:22:52,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 18:22:55,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:22:57,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:22:59,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 18:23:03,487 INFO [train.py:1039] (2/4) Epoch 13, batch 3800, loss[loss=0.1886, simple_loss=0.2612, pruned_loss=0.05799, over 24465.00 frames. ], tot_loss[loss=0.1935, simple_loss=0.266, pruned_loss=0.06046, over 4712385.31 frames. ], batch size: 63, lr: 7.91e-03, grad_scale: 32.0 2023-09-29 18:23:06,899 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:23:08,363 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.674e+02 1.938e+02 2.125e+02 2.387e+02 3.006e+02, threshold=4.251e+02, percent-clipped=0.0 2023-09-29 18:23:12,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:23:12,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 18:23:13,589 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 18:23:13,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:23:17,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:23:19,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 18:23:22,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 18:23:22,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:23:23,576 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:23:25,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:23:25,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:23:25,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:23:26,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 18:23:28,492 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=450366.6666666667, ans=0.2 2023-09-29 18:23:29,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 18:23:31,342 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:23:36,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:23:39,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:23:39,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 18:23:41,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 18:23:41,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:23:42,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:23:43,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:23:44,850 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=450433.3333333333, ans=0.1 2023-09-29 18:23:48,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 18:23:48,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 18:23:51,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:23:57,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:23:58,589 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=7.04 vs. limit=15.0 2023-09-29 18:24:03,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:24:05,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 18:24:05,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 18:24:07,329 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:24:08,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:24:10,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:24:10,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 18:24:12,407 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=450566.6666666667, ans=0.0 2023-09-29 18:24:13,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 18:24:13,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 18:24:15,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:24:15,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:24:17,786 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=450566.6666666667, ans=0.0 2023-09-29 18:24:23,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:24:24,490 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 18:24:24,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=450633.3333333333, ans=0.2 2023-09-29 18:24:25,913 INFO [train.py:1039] (2/4) Epoch 13, batch 3850, loss[loss=0.2022, simple_loss=0.279, pruned_loss=0.06273, over 24424.00 frames. ], tot_loss[loss=0.1929, simple_loss=0.2651, pruned_loss=0.06037, over 4703785.57 frames. ], batch size: 77, lr: 7.91e-03, grad_scale: 16.0 2023-09-29 18:24:26,396 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=450633.3333333333, ans=0.0 2023-09-29 18:24:27,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:24:29,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 18:24:29,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:24:30,973 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:24:35,945 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 18:24:37,969 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:24:40,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 18:24:42,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 18:24:48,759 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:24:50,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:24:52,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:24:53,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:24:55,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:24:57,140 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:24:57,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:24:57,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:24:58,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:25:01,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:25:03,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:25:03,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:25:03,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 18:25:03,456 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 18:25:04,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:25:05,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:25:08,297 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=450766.6666666667, ans=0.0 2023-09-29 18:25:09,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:25:09,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:25:09,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 18:25:11,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 18:25:13,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:25:16,294 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 18:25:19,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 18:25:22,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:25:24,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:25:29,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:25:29,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 18:25:31,531 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=450900.0, ans=0.2 2023-09-29 18:25:32,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 18:25:35,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:25:35,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:25:38,088 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=6.94 vs. limit=12.0 2023-09-29 18:25:38,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 18:25:38,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 18:25:40,405 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:25:40,529 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:25:40,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:25:40,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 18:25:42,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:25:43,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 18:25:43,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:25:43,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:25:47,078 INFO [train.py:1039] (2/4) Epoch 13, batch 3900, loss[loss=0.1967, simple_loss=0.2752, pruned_loss=0.05907, over 24453.00 frames. ], tot_loss[loss=0.1924, simple_loss=0.2643, pruned_loss=0.06026, over 4716578.17 frames. ], batch size: 69, lr: 7.90e-03, grad_scale: 8.0 2023-09-29 18:25:47,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:25:47,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:25:48,762 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:25:50,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:25:50,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:25:51,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:25:51,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 18:25:53,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:25:54,605 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.889e+02 2.168e+02 2.543e+02 3.582e+02, threshold=4.337e+02, percent-clipped=0.0 2023-09-29 18:25:56,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:25:57,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 18:25:57,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:25:57,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:26:03,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 18:26:04,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:26:06,394 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 18:26:07,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 18:26:07,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:26:08,048 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=451033.3333333333, ans=0.1 2023-09-29 18:26:09,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 18:26:11,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:26:11,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 18:26:12,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 18:26:18,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:26:20,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:26:20,651 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:26:22,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 18:26:23,871 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=451100.0, ans=0.1 2023-09-29 18:26:24,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=451100.0, ans=0.07 2023-09-29 18:26:25,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:26:26,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:26:29,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:26:29,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:26:30,106 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=451100.0, ans=0.1 2023-09-29 18:26:31,338 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:26:31,842 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=451100.0, ans=0.125 2023-09-29 18:26:33,212 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=451166.6666666667, ans=0.04949747468305833 2023-09-29 18:26:34,795 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=451166.6666666667, ans=0.125 2023-09-29 18:26:36,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:26:36,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:26:42,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 18:26:44,457 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:26:47,632 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=451166.6666666667, ans=0.125 2023-09-29 18:26:51,260 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=451233.3333333333, ans=0.125 2023-09-29 18:26:55,760 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=451233.3333333333, ans=0.2 2023-09-29 18:26:57,154 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:27:00,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 18:27:00,400 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 18:27:01,767 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 18:27:01,789 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 18:27:02,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 18:27:03,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:27:06,303 INFO [train.py:1039] (2/4) Epoch 13, batch 3950, loss[loss=0.1829, simple_loss=0.2499, pruned_loss=0.05794, over 23457.00 frames. ], tot_loss[loss=0.1917, simple_loss=0.2642, pruned_loss=0.05961, over 4716751.37 frames. ], batch size: 285, lr: 7.90e-03, grad_scale: 8.0 2023-09-29 18:27:06,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 18:27:12,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:27:14,150 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 18:27:14,830 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.16 vs. limit=15.0 2023-09-29 18:27:15,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:27:17,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:27:18,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:27:19,229 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=451300.0, ans=0.2 2023-09-29 18:27:23,636 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 18:27:25,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:27:25,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 18:27:25,152 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 18:27:25,203 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:27:28,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:27:29,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 18:27:29,015 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:27:32,101 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 18:27:35,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:27:35,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:27:35,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:27:36,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 18:27:36,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:27:50,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:27:50,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:27:57,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 18:28:05,050 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 18:28:05,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 18:28:05,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:28:05,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:28:10,043 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=451500.0, ans=0.015 2023-09-29 18:28:13,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:28:13,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 18:28:13,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:28:13,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:28:14,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 18:28:20,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:28:22,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:28:25,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 18:28:27,789 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=451566.6666666667, ans=0.1 2023-09-29 18:28:30,266 INFO [train.py:1039] (2/4) Epoch 13, batch 4000, loss[loss=0.2015, simple_loss=0.2685, pruned_loss=0.06727, over 23752.00 frames. ], tot_loss[loss=0.1924, simple_loss=0.2649, pruned_loss=0.05998, over 4707887.17 frames. ], batch size: 149, lr: 7.90e-03, grad_scale: 16.0 2023-09-29 18:28:32,307 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=451633.3333333333, ans=0.125 2023-09-29 18:28:37,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:28:38,699 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.910e+02 2.120e+02 2.727e+02 3.930e+02, threshold=4.239e+02, percent-clipped=0.0 2023-09-29 18:28:43,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:28:48,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:28:49,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:28:49,823 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:28:51,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 18:28:51,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 18:28:53,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 18:28:53,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:28:53,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 18:28:55,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:28:58,228 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.83 vs. limit=15.0 2023-09-29 18:28:58,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:28:58,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:28:58,832 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:29:00,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:29:00,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 18:29:01,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:29:03,295 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 18:29:04,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:29:04,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:29:07,962 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 18:29:09,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 18:29:09,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:29:16,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 18:29:17,742 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:29:19,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:29:20,906 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 18:29:23,807 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:29:23,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 18:29:25,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:29:27,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:29:27,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:29:28,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:29:30,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 18:29:30,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:29:32,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 18:29:33,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:29:35,552 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 18:29:35,960 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=451900.0, ans=0.0 2023-09-29 18:29:40,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 18:29:43,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 18:29:45,281 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=451900.0, ans=0.2 2023-09-29 18:29:46,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:29:46,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:29:48,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:29:48,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:29:48,473 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=451900.0, ans=0.1 2023-09-29 18:29:50,492 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=451966.6666666667, ans=0.125 2023-09-29 18:29:51,470 INFO [train.py:1039] (2/4) Epoch 13, batch 4050, loss[loss=0.1986, simple_loss=0.2783, pruned_loss=0.05944, over 24553.00 frames. ], tot_loss[loss=0.1926, simple_loss=0.2651, pruned_loss=0.06007, over 4708855.71 frames. ], batch size: 71, lr: 7.90e-03, grad_scale: 16.0 2023-09-29 18:29:54,564 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:29:56,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 18:29:56,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 18:29:59,559 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:29:59,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:30:01,632 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:30:03,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 18:30:03,381 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=451966.6666666667, ans=0.2 2023-09-29 18:30:04,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:30:08,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:30:10,856 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.95 vs. limit=15.0 2023-09-29 18:30:11,774 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:30:13,148 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 18:30:14,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:30:16,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:30:19,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:30:19,555 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=452033.3333333333, ans=0.035 2023-09-29 18:30:20,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 18:30:22,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 18:30:22,835 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=452100.0, ans=0.0 2023-09-29 18:30:25,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 18:30:25,564 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 18:30:28,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:30:35,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 18:30:37,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:30:40,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:30:43,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:30:44,607 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:30:44,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:30:46,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:30:50,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 18:30:50,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 18:30:52,459 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:30:54,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 18:30:58,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:30:59,464 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.79 vs. limit=6.0 2023-09-29 18:31:00,624 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=452233.3333333333, ans=0.125 2023-09-29 18:31:07,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 18:31:07,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:31:07,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:31:07,960 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=452233.3333333333, ans=0.125 2023-09-29 18:31:10,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 18:31:12,609 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 18:31:12,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:31:14,150 INFO [train.py:1039] (2/4) Epoch 13, batch 4100, loss[loss=0.1649, simple_loss=0.2428, pruned_loss=0.04351, over 24633.00 frames. ], tot_loss[loss=0.1935, simple_loss=0.266, pruned_loss=0.06057, over 4709472.41 frames. ], batch size: 60, lr: 7.89e-03, grad_scale: 16.0 2023-09-29 18:31:14,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:31:14,432 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:31:16,417 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:31:19,613 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=452300.0, ans=0.125 2023-09-29 18:31:22,209 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 1.980e+02 2.231e+02 2.743e+02 3.910e+02, threshold=4.461e+02, percent-clipped=0.0 2023-09-29 18:31:22,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 18:31:22,788 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=452300.0, ans=0.1 2023-09-29 18:31:25,464 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 18:31:27,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 18:31:28,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 18:31:28,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:31:29,976 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:31:30,020 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:31:30,041 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:31:30,160 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 18:31:33,389 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:31:34,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:31:34,861 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:31:38,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:31:38,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=452366.6666666667, ans=0.125 2023-09-29 18:31:41,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 18:31:43,326 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:31:43,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:31:43,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 18:31:43,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:31:43,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:31:43,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:31:45,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:31:45,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 18:31:48,824 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:31:49,047 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=452433.3333333333, ans=0.125 2023-09-29 18:31:50,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 18:31:52,430 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:31:55,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:31:55,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 18:31:56,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:31:57,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:31:58,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:32:00,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 18:32:00,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 18:32:01,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:32:04,692 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 18:32:04,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:32:06,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:32:06,628 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=452500.0, ans=0.125 2023-09-29 18:32:09,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:32:15,200 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:32:16,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:32:19,029 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:32:27,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:32:27,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:32:31,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:32:34,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:32:36,416 INFO [train.py:1039] (2/4) Epoch 13, batch 4150, loss[loss=0.1915, simple_loss=0.2774, pruned_loss=0.05276, over 24556.00 frames. ], tot_loss[loss=0.1933, simple_loss=0.266, pruned_loss=0.06032, over 4711193.02 frames. ], batch size: 71, lr: 7.89e-03, grad_scale: 16.0 2023-09-29 18:32:38,066 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:32:38,217 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 18:32:39,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:32:39,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:32:41,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 18:32:42,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:32:44,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 18:32:46,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 18:32:46,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 18:32:48,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:32:53,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:32:53,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:32:57,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:32:59,163 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:33:00,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 18:33:02,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 18:33:02,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:33:02,585 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:33:03,857 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 18:33:08,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:33:13,288 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:33:14,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 18:33:15,445 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.06 vs. limit=15.0 2023-09-29 18:33:16,012 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.92 vs. limit=8.0 2023-09-29 18:33:16,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 18:33:16,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:33:18,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 18:33:18,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:33:18,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:33:21,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:33:21,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:33:25,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 18:33:26,029 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=452833.3333333333, ans=0.1 2023-09-29 18:33:26,588 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.97 vs. limit=15.0 2023-09-29 18:33:29,989 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 18:33:31,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:33:31,725 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 18:33:33,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:33:33,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 18:33:37,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:33:39,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:33:40,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:33:42,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 18:33:42,306 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:33:42,309 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 18:33:43,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 18:33:45,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 18:33:45,730 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:33:45,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:33:47,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 18:33:48,693 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 18:33:48,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:33:48,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 18:33:50,171 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:33:51,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:33:51,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 18:33:51,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 18:33:59,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:34:00,822 INFO [train.py:1039] (2/4) Epoch 13, batch 4200, loss[loss=0.2102, simple_loss=0.2738, pruned_loss=0.07325, over 23854.00 frames. ], tot_loss[loss=0.1935, simple_loss=0.2653, pruned_loss=0.06083, over 4695160.56 frames. ], batch size: 179, lr: 7.89e-03, grad_scale: 16.0 2023-09-29 18:34:01,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 18:34:04,601 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 18:34:06,161 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:34:07,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:34:09,028 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.933e+02 2.202e+02 2.518e+02 4.955e+02, threshold=4.404e+02, percent-clipped=0.0 2023-09-29 18:34:09,137 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:34:09,139 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:34:10,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 18:34:15,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 18:34:15,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:34:17,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:34:21,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:34:24,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 18:34:26,241 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:34:26,289 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:34:28,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 18:34:28,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:34:28,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:34:30,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:34:30,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:34:31,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 18:34:35,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 18:34:35,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:34:37,023 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=453100.0, ans=0.125 2023-09-29 18:34:39,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 18:34:41,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:34:44,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:34:44,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:34:46,136 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=453100.0, ans=0.0 2023-09-29 18:34:47,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:34:47,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 18:34:47,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:34:49,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:34:55,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 18:34:56,612 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:34:58,613 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer_ff3.min_abs, batch_count=453166.6666666667, ans=0.2 2023-09-29 18:35:02,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:35:05,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 18:35:07,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:35:13,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 18:35:14,092 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=5.11 vs. limit=5.0 2023-09-29 18:35:14,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:35:17,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 18:35:19,241 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=453233.3333333333, ans=0.0 2023-09-29 18:35:22,172 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 18:35:23,568 INFO [train.py:1039] (2/4) Epoch 13, batch 4250, loss[loss=0.1514, simple_loss=0.2317, pruned_loss=0.03549, over 24308.00 frames. ], tot_loss[loss=0.1922, simple_loss=0.2641, pruned_loss=0.06017, over 4703539.09 frames. ], batch size: 56, lr: 7.88e-03, grad_scale: 16.0 2023-09-29 18:35:25,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:35:25,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 18:35:27,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:35:37,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 18:35:37,062 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 18:35:37,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:35:40,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:35:43,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:35:48,371 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.55 vs. limit=15.0 2023-09-29 18:35:49,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:35:49,324 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:35:51,011 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:35:51,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:35:52,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:35:52,648 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:35:54,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:35:57,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:35:58,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:36:00,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 18:36:03,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 18:36:03,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:36:04,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:36:04,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:36:06,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:36:06,237 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:36:06,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:36:11,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 18:36:11,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:36:18,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:36:18,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:36:21,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 18:36:21,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 18:36:22,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 18:36:24,082 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:36:25,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 18:36:28,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:36:28,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:36:28,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 18:36:30,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 18:36:31,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 18:36:36,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:36:38,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:36:39,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:36:42,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:36:42,221 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:36:43,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:36:45,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:36:45,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 18:36:46,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:36:47,039 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=453566.6666666667, ans=0.0 2023-09-29 18:36:50,400 INFO [train.py:1039] (2/4) Epoch 13, batch 4300, loss[loss=0.1971, simple_loss=0.2828, pruned_loss=0.05575, over 24674.00 frames. ], tot_loss[loss=0.1918, simple_loss=0.2638, pruned_loss=0.05991, over 4708924.74 frames. ], batch size: 68, lr: 7.88e-03, grad_scale: 16.0 2023-09-29 18:36:52,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:36:52,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:36:54,841 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=5.43 vs. limit=15.0 2023-09-29 18:36:56,413 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.67 vs. limit=6.0 2023-09-29 18:36:57,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:36:58,705 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.604e+02 2.045e+02 2.305e+02 2.959e+02 4.581e+02, threshold=4.610e+02, percent-clipped=2.0 2023-09-29 18:37:04,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:37:04,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 18:37:06,516 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:37:08,163 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 18:37:08,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:37:08,215 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 18:37:11,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 18:37:14,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 18:37:19,136 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 18:37:19,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:37:20,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 18:37:22,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 18:37:23,818 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:37:29,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:37:29,368 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:37:31,457 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:37:31,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:37:33,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:37:33,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 18:37:34,748 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 18:37:35,038 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=453766.6666666667, ans=0.125 2023-09-29 18:37:37,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:37:38,814 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.00 vs. limit=10.0 2023-09-29 18:37:41,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:37:41,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 18:37:41,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:37:41,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:37:41,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 18:37:41,143 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 18:37:42,703 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 18:37:42,891 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:37:44,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:37:44,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 18:37:44,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 18:37:47,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:37:48,982 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 18:37:50,628 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:37:52,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:37:52,139 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:37:54,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 18:37:55,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 18:37:55,600 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:37:55,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:37:55,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:37:57,226 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:37:58,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:38:02,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:38:04,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:38:04,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:38:09,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 18:38:10,488 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 18:38:11,800 INFO [train.py:1039] (2/4) Epoch 13, batch 4350, loss[loss=0.2043, simple_loss=0.2908, pruned_loss=0.05891, over 24335.00 frames. ], tot_loss[loss=0.1929, simple_loss=0.265, pruned_loss=0.06038, over 4703025.30 frames. ], batch size: 74, lr: 7.88e-03, grad_scale: 16.0 2023-09-29 18:38:15,128 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:38:18,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:38:19,910 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=453966.6666666667, ans=0.2 2023-09-29 18:38:21,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:38:21,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:38:25,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:38:31,110 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:38:32,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:38:32,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:38:36,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 18:38:39,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:38:40,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 18:38:46,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 18:38:48,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:38:48,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:38:48,658 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=454100.0, ans=0.125 2023-09-29 18:38:53,199 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=454100.0, ans=0.05 2023-09-29 18:38:54,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:38:57,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 18:39:00,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:39:02,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 18:39:03,350 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.88 vs. limit=15.0 2023-09-29 18:39:08,797 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 18:39:08,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:39:10,376 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:39:10,503 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 18:39:12,606 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 18:39:12,615 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:39:14,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:39:16,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:39:16,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:39:19,129 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:39:19,223 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:39:22,220 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 18:39:22,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:39:22,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:39:22,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:39:23,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 18:39:25,284 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 18:39:25,291 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 18:39:25,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 18:39:28,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:39:29,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 18:39:29,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:39:31,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:39:32,961 INFO [train.py:1039] (2/4) Epoch 13, batch 4400, loss[loss=0.2203, simple_loss=0.2993, pruned_loss=0.07069, over 23980.00 frames. ], tot_loss[loss=0.1938, simple_loss=0.266, pruned_loss=0.06075, over 4710235.82 frames. ], batch size: 80, lr: 7.88e-03, grad_scale: 32.0 2023-09-29 18:39:33,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 18:39:35,985 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 18:39:35,996 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:39:39,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:39:39,181 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:39:39,494 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=454300.0, ans=0.0 2023-09-29 18:39:41,132 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.590e+02 1.889e+02 2.108e+02 2.568e+02 3.749e+02, threshold=4.217e+02, percent-clipped=0.0 2023-09-29 18:39:42,930 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:39:44,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 18:39:44,536 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 18:39:44,595 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 18:39:46,672 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 18:39:46,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 18:39:46,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:39:50,079 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 18:39:51,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:39:51,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:39:51,781 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 18:39:55,323 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:39:55,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 18:39:55,385 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 18:39:58,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 18:39:59,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 18:39:59,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 18:39:59,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:40:01,501 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:40:02,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:40:02,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:40:04,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 18:40:04,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 18:40:04,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:40:06,666 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=454433.3333333333, ans=0.0 2023-09-29 18:40:07,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:40:07,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:40:09,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:40:09,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:40:09,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 18:40:10,768 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 18:40:13,379 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=454433.3333333333, ans=0.05 2023-09-29 18:40:14,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:40:24,835 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:40:28,493 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 18:40:33,120 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:40:34,156 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.97 vs. limit=22.5 2023-09-29 18:40:34,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:40:37,818 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:40:37,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 18:40:37,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:40:37,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 18:40:37,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:40:39,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 18:40:41,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 18:40:44,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 18:40:45,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 18:40:45,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:40:45,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 18:40:47,434 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:40:51,119 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:40:54,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 18:40:56,114 INFO [train.py:1039] (2/4) Epoch 13, batch 4450, loss[loss=0.1922, simple_loss=0.2806, pruned_loss=0.05193, over 24670.00 frames. ], tot_loss[loss=0.1956, simple_loss=0.2675, pruned_loss=0.06188, over 4703639.82 frames. ], batch size: 73, lr: 7.87e-03, grad_scale: 32.0 2023-09-29 18:40:56,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:40:58,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:40:58,085 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:41:01,857 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=454633.3333333333, ans=0.125 2023-09-29 18:41:06,305 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:41:06,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:41:12,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:41:13,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:41:16,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:41:17,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:41:18,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 18:41:18,552 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:41:18,676 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:41:18,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:41:18,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 18:41:20,365 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=454700.0, ans=0.1 2023-09-29 18:41:21,583 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 18:41:27,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:41:28,058 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:41:29,680 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:41:31,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:41:31,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:41:35,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 18:41:37,904 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 18:41:37,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 18:41:37,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:41:41,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:41:42,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 18:41:45,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 18:41:50,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:41:50,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 18:41:50,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:41:50,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:41:50,328 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:41:50,339 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:41:53,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:41:55,613 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 18:41:57,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 18:41:59,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 18:42:02,876 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:42:04,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:42:05,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:42:07,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 18:42:07,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 18:42:10,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 18:42:10,979 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=454900.0, ans=0.2 2023-09-29 18:42:13,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:42:17,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:42:18,826 INFO [train.py:1039] (2/4) Epoch 13, batch 4500, loss[loss=0.2243, simple_loss=0.2926, pruned_loss=0.07798, over 23619.00 frames. ], tot_loss[loss=0.1968, simple_loss=0.2681, pruned_loss=0.06276, over 4683141.18 frames. ], batch size: 93, lr: 7.87e-03, grad_scale: 32.0 2023-09-29 18:42:18,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 18:42:18,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 18:42:19,290 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=454966.6666666667, ans=0.125 2023-09-29 18:42:20,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:42:20,887 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=454966.6666666667, ans=0.0 2023-09-29 18:42:25,143 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:42:25,238 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:42:26,526 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.493e+02 2.036e+02 2.219e+02 2.497e+02 4.181e+02, threshold=4.438e+02, percent-clipped=0.0 2023-09-29 18:42:28,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 18:42:28,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:42:30,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:42:30,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:42:43,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:42:43,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:42:47,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:42:49,888 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:42:51,478 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 18:42:57,524 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 18:43:02,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:43:07,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 18:43:08,885 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.39 vs. limit=10.0 2023-09-29 18:43:11,550 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:43:12,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 18:43:13,073 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:43:13,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:43:14,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:43:14,702 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:43:17,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:43:17,844 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 18:43:17,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 18:43:17,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:43:22,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:43:22,626 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:43:26,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:43:29,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 18:43:29,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:43:31,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 18:43:32,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 18:43:32,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 18:43:38,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 18:43:40,819 INFO [train.py:1039] (2/4) Epoch 13, batch 4550, loss[loss=0.1795, simple_loss=0.2542, pruned_loss=0.05242, over 24304.00 frames. ], tot_loss[loss=0.1949, simple_loss=0.2669, pruned_loss=0.06146, over 4702372.44 frames. ], batch size: 56, lr: 7.87e-03, grad_scale: 32.0 2023-09-29 18:43:41,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 18:43:44,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:43:46,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:43:47,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:43:49,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:43:54,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:43:55,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:43:58,641 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 18:43:58,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:43:58,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:44:00,885 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:44:02,324 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:44:02,793 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=455366.6666666667, ans=0.09899494936611666 2023-09-29 18:44:04,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:44:07,142 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 18:44:08,555 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 18:44:10,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:44:11,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 18:44:13,941 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=455433.3333333333, ans=0.2 2023-09-29 18:44:16,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 18:44:16,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:44:16,923 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=455433.3333333333, ans=0.0 2023-09-29 18:44:20,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 18:44:23,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 18:44:25,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:44:25,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:44:25,354 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:44:25,665 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=455433.3333333333, ans=0.0 2023-09-29 18:44:28,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 18:44:31,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:44:33,540 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=455500.0, ans=0.09899494936611666 2023-09-29 18:44:34,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:44:34,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:44:37,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 18:44:37,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 18:44:39,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 18:44:39,352 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:44:39,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 18:44:43,913 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 18:44:43,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 18:44:45,477 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:44:45,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:44:46,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:44:47,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:44:48,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 18:44:50,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 18:44:52,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:44:54,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 18:44:54,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 18:44:54,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:44:54,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 18:44:54,563 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=455566.6666666667, ans=0.0 2023-09-29 18:44:57,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:44:57,320 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:45:00,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:45:00,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:45:01,774 INFO [train.py:1039] (2/4) Epoch 13, batch 4600, loss[loss=0.2039, simple_loss=0.2833, pruned_loss=0.06223, over 24621.00 frames. ], tot_loss[loss=0.1934, simple_loss=0.2656, pruned_loss=0.06063, over 4721496.99 frames. ], batch size: 68, lr: 7.86e-03, grad_scale: 32.0 2023-09-29 18:45:01,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 18:45:03,405 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:45:03,791 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=455633.3333333333, ans=0.1 2023-09-29 18:45:05,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 18:45:07,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:45:07,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:45:10,372 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.517e+02 1.841e+02 2.065e+02 2.321e+02 3.867e+02, threshold=4.130e+02, percent-clipped=0.0 2023-09-29 18:45:10,558 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:45:10,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 18:45:12,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:45:13,563 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 18:45:13,861 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=455633.3333333333, ans=0.125 2023-09-29 18:45:15,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:45:19,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:45:21,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:45:22,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:45:30,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 18:45:31,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:45:34,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:45:38,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:45:38,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:45:44,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 18:45:44,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 18:45:44,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:45:49,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:45:49,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 18:45:51,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:45:54,762 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 18:45:55,066 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=455833.3333333333, ans=0.0 2023-09-29 18:45:57,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 18:46:00,388 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=455833.3333333333, ans=0.2 2023-09-29 18:46:01,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:46:03,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:46:05,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:46:05,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 18:46:06,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:46:06,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 18:46:06,629 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:46:08,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:46:09,825 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:46:09,954 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:46:11,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:46:11,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 18:46:13,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 18:46:13,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 18:46:13,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:46:15,771 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.87 vs. limit=22.5 2023-09-29 18:46:16,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:46:16,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:46:16,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:46:24,434 INFO [train.py:1039] (2/4) Epoch 13, batch 4650, loss[loss=0.1987, simple_loss=0.243, pruned_loss=0.07724, over 18882.00 frames. ], tot_loss[loss=0.1928, simple_loss=0.2647, pruned_loss=0.06046, over 4711720.24 frames. ], batch size: 389, lr: 7.86e-03, grad_scale: 32.0 2023-09-29 18:46:24,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:46:28,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:46:28,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:46:28,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:46:28,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:46:30,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:46:30,514 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:46:33,964 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=455966.6666666667, ans=0.125 2023-09-29 18:46:35,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 18:46:39,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:46:43,054 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 18:46:43,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:46:43,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 18:46:43,246 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:46:44,726 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 18:46:44,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 18:46:46,220 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:46:46,321 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:46:46,606 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=456033.3333333333, ans=0.125 2023-09-29 18:46:49,415 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:46:50,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:46:52,425 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 18:46:54,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:46:55,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 18:46:58,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:46:58,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:47:00,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 18:47:00,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:47:04,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:47:09,255 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:47:11,046 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=456100.0, ans=0.125 2023-09-29 18:47:13,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:47:15,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:47:16,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:47:18,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 18:47:20,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 18:47:21,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 18:47:22,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 18:47:22,044 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 18:47:23,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:47:29,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:47:29,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:47:31,240 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 18:47:31,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:47:31,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:47:31,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 18:47:33,506 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:47:35,846 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=456233.3333333333, ans=0.125 2023-09-29 18:47:36,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:47:37,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:47:39,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:47:40,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:47:42,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:47:42,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 18:47:42,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 18:47:43,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:47:45,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 18:47:46,862 INFO [train.py:1039] (2/4) Epoch 13, batch 4700, loss[loss=0.1911, simple_loss=0.2723, pruned_loss=0.05493, over 24358.00 frames. ], tot_loss[loss=0.1932, simple_loss=0.2653, pruned_loss=0.06052, over 4715962.55 frames. ], batch size: 77, lr: 7.86e-03, grad_scale: 32.0 2023-09-29 18:47:52,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:47:53,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:47:54,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:47:55,240 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 2.032e+02 2.349e+02 2.827e+02 4.344e+02, threshold=4.699e+02, percent-clipped=1.0 2023-09-29 18:47:55,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:47:56,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 18:48:02,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 18:48:02,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 18:48:06,487 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:48:08,018 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:48:08,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:48:11,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:48:17,118 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=456366.6666666667, ans=0.125 2023-09-29 18:48:19,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 18:48:20,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 18:48:23,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:48:31,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 18:48:33,308 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 18:48:36,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:48:39,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 18:48:41,291 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:48:46,666 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:48:46,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 18:48:48,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:48:48,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:48:51,523 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=456500.0, ans=0.2 2023-09-29 18:48:52,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:48:52,794 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:48:54,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 18:48:55,736 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 18:48:55,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:48:59,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:48:59,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:48:59,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 18:49:00,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:49:04,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 18:49:06,219 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=456566.6666666667, ans=0.2 2023-09-29 18:49:07,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:49:07,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:49:10,444 INFO [train.py:1039] (2/4) Epoch 13, batch 4750, loss[loss=0.198, simple_loss=0.2684, pruned_loss=0.06381, over 23720.00 frames. ], tot_loss[loss=0.1934, simple_loss=0.2659, pruned_loss=0.06046, over 4721503.82 frames. ], batch size: 232, lr: 7.86e-03, grad_scale: 32.0 2023-09-29 18:49:12,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:49:13,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:49:15,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 18:49:15,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:49:18,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 18:49:20,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:49:20,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:49:22,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:49:29,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 18:49:33,261 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:49:34,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 18:49:36,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:49:39,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:49:39,379 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:49:39,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:49:40,930 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 18:49:40,934 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 18:49:48,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 18:49:51,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:49:53,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:49:56,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:49:56,704 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 18:49:56,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:49:57,102 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=456766.6666666667, ans=0.125 2023-09-29 18:49:59,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:50:02,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:50:02,588 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=456833.3333333333, ans=0.125 2023-09-29 18:50:04,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 18:50:05,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 18:50:06,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:50:06,112 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:50:07,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:50:07,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 18:50:07,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 18:50:11,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 18:50:14,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:50:17,312 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:50:17,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 18:50:17,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:50:18,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:50:19,165 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=456900.0, ans=0.07 2023-09-29 18:50:21,922 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 18:50:21,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:50:23,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 18:50:25,617 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:50:25,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 18:50:27,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 18:50:28,751 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 18:50:29,066 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=456900.0, ans=0.125 2023-09-29 18:50:31,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 18:50:31,656 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:50:33,040 INFO [train.py:1039] (2/4) Epoch 13, batch 4800, loss[loss=0.1968, simple_loss=0.277, pruned_loss=0.05828, over 24484.00 frames. ], tot_loss[loss=0.1946, simple_loss=0.2671, pruned_loss=0.0611, over 4714922.58 frames. ], batch size: 69, lr: 7.85e-03, grad_scale: 32.0 2023-09-29 18:50:33,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 18:50:33,383 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=456966.6666666667, ans=0.125 2023-09-29 18:50:40,689 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:50:40,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:50:41,131 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=456966.6666666667, ans=0.125 2023-09-29 18:50:43,547 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.667e+02 1.965e+02 2.180e+02 2.463e+02 4.053e+02, threshold=4.360e+02, percent-clipped=0.0 2023-09-29 18:50:47,372 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:50:48,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:50:49,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:50:50,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 18:50:50,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:50:50,808 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=457033.3333333333, ans=0.125 2023-09-29 18:50:51,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:50:53,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:50:56,818 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=457033.3333333333, ans=0.1 2023-09-29 18:50:58,017 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:50:59,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:50:59,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:50:59,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:50:59,806 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 18:50:59,827 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:51:02,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:51:03,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:51:06,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:51:08,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:51:08,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 18:51:09,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 18:51:12,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:51:12,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 18:51:14,109 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 18:51:14,248 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:51:15,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:51:15,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:51:15,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:51:15,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:51:20,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:51:20,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:51:25,282 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:51:28,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:51:30,268 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=457166.6666666667, ans=0.125 2023-09-29 18:51:31,431 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:51:36,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 18:51:36,827 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:51:36,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:51:38,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:51:39,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:51:42,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:51:44,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 18:51:44,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:51:46,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:51:46,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 18:51:47,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:51:50,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:51:51,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:51:51,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:51:53,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 18:51:54,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 18:51:54,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:51:54,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:51:56,762 INFO [train.py:1039] (2/4) Epoch 13, batch 4850, loss[loss=0.1733, simple_loss=0.2518, pruned_loss=0.04737, over 24693.00 frames. ], tot_loss[loss=0.1938, simple_loss=0.2666, pruned_loss=0.06048, over 4720351.58 frames. ], batch size: 65, lr: 7.85e-03, grad_scale: 32.0 2023-09-29 18:51:56,842 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:51:56,844 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:51:58,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=457300.0, ans=0.04949747468305833 2023-09-29 18:52:01,315 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:52:03,769 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.10 vs. limit=15.0 2023-09-29 18:52:07,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 18:52:08,006 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:52:13,284 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:52:14,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 18:52:14,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:52:20,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:52:21,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:52:23,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:52:23,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 18:52:27,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:52:29,221 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:52:29,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 18:52:30,725 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:52:30,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 18:52:35,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:52:35,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:52:39,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:52:39,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 18:52:39,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 18:52:40,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 18:52:49,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:52:49,160 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 18:52:50,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:52:50,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:52:52,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 18:52:54,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 18:52:54,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:52:56,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 18:52:56,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:52:58,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:52:58,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 18:53:07,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:53:14,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:53:14,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:53:18,648 INFO [train.py:1039] (2/4) Epoch 13, batch 4900, loss[loss=0.1945, simple_loss=0.2678, pruned_loss=0.06055, over 23757.00 frames. ], tot_loss[loss=0.1931, simple_loss=0.2653, pruned_loss=0.06044, over 4709867.20 frames. ], batch size: 85, lr: 7.85e-03, grad_scale: 16.0 2023-09-29 18:53:21,121 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=457633.3333333333, ans=0.1 2023-09-29 18:53:22,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 18:53:22,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:53:27,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:53:29,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:53:29,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:53:32,646 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 2.076e+02 2.390e+02 2.815e+02 4.365e+02, threshold=4.780e+02, percent-clipped=1.0 2023-09-29 18:53:32,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 18:53:38,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 18:53:39,170 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=457700.0, ans=0.125 2023-09-29 18:53:39,591 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.25 vs. limit=15.0 2023-09-29 18:53:43,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 18:53:43,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 18:53:45,202 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 18:53:45,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:53:45,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:53:46,640 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:53:46,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:53:46,767 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 18:53:50,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 18:53:51,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 18:53:52,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 18:53:53,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 18:53:54,260 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.69 vs. limit=22.5 2023-09-29 18:53:55,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:53:56,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:53:58,654 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:53:58,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 18:54:00,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:54:00,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:54:00,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 18:54:00,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 18:54:06,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 18:54:06,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:54:08,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:54:08,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:54:09,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:54:09,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 18:54:09,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:54:11,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 18:54:11,808 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.40 vs. limit=6.0 2023-09-29 18:54:14,129 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:54:15,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 18:54:17,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:54:20,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 18:54:21,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:54:23,204 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 18:54:23,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 18:54:23,564 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=457900.0, ans=0.0 2023-09-29 18:54:27,420 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:54:31,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:54:33,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 18:54:35,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 18:54:35,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 18:54:35,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:54:37,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:54:40,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:54:40,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:54:40,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:54:41,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 18:54:41,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 18:54:42,626 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.85 vs. limit=12.0 2023-09-29 18:54:43,224 INFO [train.py:1039] (2/4) Epoch 13, batch 4950, loss[loss=0.1906, simple_loss=0.255, pruned_loss=0.06311, over 23849.00 frames. ], tot_loss[loss=0.1918, simple_loss=0.2641, pruned_loss=0.05974, over 4718332.62 frames. ], batch size: 212, lr: 7.84e-03, grad_scale: 8.0 2023-09-29 18:54:46,367 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:54:46,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 18:54:49,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 18:54:50,882 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 18:54:50,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 18:54:52,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 18:54:52,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:54:52,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:54:52,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:54:52,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:54:55,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:54:57,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:54:57,305 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:54:58,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:54:58,871 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=458033.3333333333, ans=0.125 2023-09-29 18:54:59,032 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=458033.3333333333, ans=0.0 2023-09-29 18:54:59,293 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.94 vs. limit=15.0 2023-09-29 18:55:02,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:55:04,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:55:08,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 18:55:13,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:55:13,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 18:55:15,325 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:55:16,787 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:55:19,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:55:19,908 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 18:55:20,212 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=458100.0, ans=0.0 2023-09-29 18:55:21,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 18:55:23,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:55:26,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 18:55:26,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:55:27,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:55:27,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:55:29,148 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:55:29,384 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=458100.0, ans=0.125 2023-09-29 18:55:30,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:55:33,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 18:55:34,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:55:35,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:55:35,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:55:38,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 18:55:38,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 18:55:41,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:55:44,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:55:44,354 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=458166.6666666667, ans=0.125 2023-09-29 18:55:45,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:55:45,648 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:55:45,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:55:47,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:55:47,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:55:48,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:55:50,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:55:51,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:55:51,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 18:55:56,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:56:01,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 18:56:02,385 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 18:56:02,836 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=458233.3333333333, ans=0.125 2023-09-29 18:56:05,432 INFO [train.py:1039] (2/4) Epoch 13, batch 5000, loss[loss=0.187, simple_loss=0.273, pruned_loss=0.05049, over 24531.00 frames. ], tot_loss[loss=0.1914, simple_loss=0.2638, pruned_loss=0.05946, over 4721753.97 frames. ], batch size: 71, lr: 7.84e-03, grad_scale: 8.0 2023-09-29 18:56:07,223 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:56:07,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 18:56:09,310 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 18:56:10,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 18:56:13,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:56:14,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 18:56:14,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:56:14,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 18:56:16,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 18:56:16,253 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:56:17,753 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:56:19,075 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.946e+02 2.294e+02 2.903e+02 4.132e+02, threshold=4.587e+02, percent-clipped=0.0 2023-09-29 18:56:19,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 18:56:19,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:56:19,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:56:20,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 18:56:20,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 18:56:21,547 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.21 vs. limit=22.5 2023-09-29 18:56:22,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:56:23,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 18:56:23,821 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 18:56:23,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:56:25,399 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 18:56:25,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 18:56:25,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 18:56:27,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 18:56:28,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:56:28,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:56:30,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 18:56:30,172 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 18:56:31,690 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:56:31,821 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:56:33,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 18:56:33,644 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=458366.6666666667, ans=0.2 2023-09-29 18:56:36,237 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 18:56:36,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:56:36,490 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=458433.3333333333, ans=0.1 2023-09-29 18:56:39,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:56:43,392 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 18:56:46,429 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=15.02 vs. limit=22.5 2023-09-29 18:56:47,537 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:56:49,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:56:49,020 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:56:52,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 18:56:52,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:56:53,670 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:56:53,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:56:55,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 18:56:56,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:56:59,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:57:01,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:57:07,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 18:57:11,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:57:23,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:57:23,214 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:57:23,226 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 18:57:25,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:57:25,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 18:57:25,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 18:57:25,300 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:57:28,277 INFO [train.py:1039] (2/4) Epoch 13, batch 5050, loss[loss=0.197, simple_loss=0.283, pruned_loss=0.05546, over 24030.00 frames. ], tot_loss[loss=0.1923, simple_loss=0.2644, pruned_loss=0.06006, over 4693587.06 frames. ], batch size: 80, lr: 7.84e-03, grad_scale: 8.0 2023-09-29 18:57:29,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:57:29,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 18:57:31,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:57:32,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:57:34,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:57:34,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 18:57:36,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:57:38,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:57:39,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 18:57:40,190 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=458633.3333333333, ans=0.0 2023-09-29 18:57:41,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:57:42,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 18:57:45,135 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.73 vs. limit=15.0 2023-09-29 18:57:46,227 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=458700.0, ans=0.2 2023-09-29 18:57:49,420 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=458700.0, ans=0.0 2023-09-29 18:57:54,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 18:57:54,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 18:57:54,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:57:55,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 18:57:56,017 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=458700.0, ans=0.1 2023-09-29 18:57:57,144 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:57:58,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:57:58,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:57:58,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:57:58,818 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 18:58:00,316 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 18:58:01,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:58:03,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:58:05,704 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys.whitening_limit, batch_count=458766.6666666667, ans=6.0 2023-09-29 18:58:07,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:58:07,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 18:58:10,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:58:12,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 18:58:12,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:58:12,724 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=458766.6666666667, ans=0.0 2023-09-29 18:58:13,330 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.10 vs. limit=22.5 2023-09-29 18:58:13,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:58:14,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:58:15,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:58:17,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:58:20,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:58:20,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:58:20,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:58:21,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:58:21,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 18:58:23,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:58:26,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:58:32,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:58:33,440 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 18:58:33,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 18:58:33,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:58:33,694 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:58:35,788 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 18:58:40,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:58:40,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 18:58:40,343 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:58:43,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:58:43,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:58:43,715 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=458900.0, ans=0.0 2023-09-29 18:58:44,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 18:58:46,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 18:58:48,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:58:48,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:58:49,635 INFO [train.py:1039] (2/4) Epoch 13, batch 5100, loss[loss=0.1713, simple_loss=0.245, pruned_loss=0.04878, over 24320.00 frames. ], tot_loss[loss=0.1937, simple_loss=0.2655, pruned_loss=0.06093, over 4688701.34 frames. ], batch size: 56, lr: 7.84e-03, grad_scale: 8.0 2023-09-29 18:58:49,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:58:51,689 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=458966.6666666667, ans=0.125 2023-09-29 18:58:52,933 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 18:58:54,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:58:57,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 18:58:57,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 18:58:59,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:59:00,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:59:01,752 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=458966.6666666667, ans=0.125 2023-09-29 18:59:02,846 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.978e+02 2.231e+02 2.583e+02 5.581e+02, threshold=4.463e+02, percent-clipped=1.0 2023-09-29 18:59:03,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:59:04,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 18:59:04,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 18:59:09,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:59:11,843 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:59:15,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:59:17,008 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=459033.3333333333, ans=0.125 2023-09-29 18:59:19,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 18:59:19,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:59:22,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:59:22,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 18:59:24,782 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.78 vs. limit=22.5 2023-09-29 18:59:25,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:59:25,615 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:59:25,898 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=459100.0, ans=0.0 2023-09-29 18:59:26,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 18:59:28,611 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 18:59:28,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:59:30,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 18:59:30,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 18:59:34,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:59:42,746 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:59:43,863 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:59:46,115 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=459166.6666666667, ans=0.125 2023-09-29 18:59:47,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 18:59:47,400 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 18:59:47,413 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 18:59:48,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 18:59:48,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:59:53,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 18:59:56,692 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 18:59:59,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 18:59:59,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 19:00:02,652 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 19:00:04,197 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 19:00:04,256 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 19:00:10,217 INFO [train.py:1039] (2/4) Epoch 13, batch 5150, loss[loss=0.2096, simple_loss=0.2825, pruned_loss=0.06832, over 23912.00 frames. ], tot_loss[loss=0.1944, simple_loss=0.2659, pruned_loss=0.06145, over 4692120.30 frames. ], batch size: 86, lr: 7.83e-03, grad_scale: 8.0 2023-09-29 19:00:10,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:00:10,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:00:10,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:00:11,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:00:11,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 19:00:11,545 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=459300.0, ans=0.125 2023-09-29 19:00:12,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:00:13,030 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=459300.0, ans=0.0 2023-09-29 19:00:14,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 19:00:14,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 19:00:16,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 19:00:16,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:00:16,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 19:00:17,811 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:00:17,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 19:00:20,260 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:00:23,162 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:00:27,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 19:00:27,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 19:00:29,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:00:30,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:00:32,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 19:00:32,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:00:32,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:00:32,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:00:32,453 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 19:00:33,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 19:00:34,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:00:34,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 19:00:37,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 19:00:38,865 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 19:00:39,052 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=459366.6666666667, ans=0.0 2023-09-29 19:00:40,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:00:47,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:00:49,300 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.61 vs. limit=15.0 2023-09-29 19:00:49,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 19:00:50,150 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=459433.3333333333, ans=0.1 2023-09-29 19:00:51,650 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:00:53,457 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=459433.3333333333, ans=0.2 2023-09-29 19:00:59,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:00:59,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:01:04,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:01:05,840 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:01:07,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 19:01:09,209 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=459500.0, ans=0.0 2023-09-29 19:01:12,054 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:01:12,329 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=459500.0, ans=0.1 2023-09-29 19:01:13,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:01:13,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 19:01:16,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:01:18,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:01:18,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 19:01:22,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:01:25,696 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 19:01:28,733 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:01:28,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:01:30,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 19:01:30,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 19:01:30,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:01:32,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:01:33,938 INFO [train.py:1039] (2/4) Epoch 13, batch 5200, loss[loss=0.2071, simple_loss=0.2817, pruned_loss=0.06624, over 23408.00 frames. ], tot_loss[loss=0.1947, simple_loss=0.2667, pruned_loss=0.0614, over 4706906.79 frames. ], batch size: 93, lr: 7.83e-03, grad_scale: 16.0 2023-09-29 19:01:35,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:01:36,389 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=23.22 vs. limit=22.5 2023-09-29 19:01:37,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:01:40,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:01:43,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 19:01:46,060 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.634e+02 2.009e+02 2.232e+02 2.701e+02 3.997e+02, threshold=4.463e+02, percent-clipped=0.0 2023-09-29 19:01:46,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:01:46,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:01:46,605 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=459633.3333333333, ans=0.0 2023-09-29 19:01:49,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:01:50,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:01:51,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:01:53,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 19:01:53,720 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=459700.0, ans=0.125 2023-09-29 19:01:55,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 19:01:56,760 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.71 vs. limit=10.0 2023-09-29 19:01:57,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:01:59,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 19:02:00,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 19:02:02,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 19:02:02,552 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 19:02:03,898 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 19:02:05,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 19:02:05,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:02:05,608 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 19:02:05,618 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:02:09,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:02:10,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:02:10,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 19:02:10,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:02:12,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:02:16,764 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 19:02:16,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 19:02:18,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 19:02:22,001 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.33 vs. limit=15.0 2023-09-29 19:02:23,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 19:02:23,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 19:02:30,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:02:30,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:02:32,417 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.07 vs. limit=22.5 2023-09-29 19:02:33,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 19:02:33,407 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:02:34,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 19:02:34,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:02:34,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 19:02:39,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:02:40,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:02:42,906 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=459900.0, ans=0.125 2023-09-29 19:02:44,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:02:44,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:02:44,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:02:45,647 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.93 vs. limit=12.0 2023-09-29 19:02:49,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:02:50,878 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 19:02:50,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:02:52,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:02:52,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:02:54,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 19:02:54,424 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=459966.6666666667, ans=0.125 2023-09-29 19:02:55,509 INFO [train.py:1039] (2/4) Epoch 13, batch 5250, loss[loss=0.1659, simple_loss=0.2433, pruned_loss=0.04421, over 24614.00 frames. ], tot_loss[loss=0.1938, simple_loss=0.2654, pruned_loss=0.06112, over 4694896.24 frames. ], batch size: 60, lr: 7.83e-03, grad_scale: 16.0 2023-09-29 19:02:55,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:02:57,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:03:01,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:03:01,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:03:03,259 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:03:08,750 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=459966.6666666667, ans=0.0 2023-09-29 19:03:10,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:03:11,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 19:03:14,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:03:14,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 19:03:18,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 19:03:18,187 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:03:19,752 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:03:39,972 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 19:03:40,013 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=460100.0, ans=0.0 2023-09-29 19:03:51,517 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=460166.6666666667, ans=0.0 2023-09-29 19:03:56,185 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=460233.3333333333, ans=0.2 2023-09-29 19:04:11,484 INFO [train.py:1039] (2/4) Epoch 13, batch 5300, loss[loss=0.2098, simple_loss=0.2827, pruned_loss=0.0685, over 24635.00 frames. ], tot_loss[loss=0.1928, simple_loss=0.2642, pruned_loss=0.06067, over 4692691.38 frames. ], batch size: 65, lr: 7.82e-03, grad_scale: 16.0 2023-09-29 19:04:18,768 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=460300.0, ans=0.125 2023-09-29 19:04:22,459 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 1.900e+02 2.152e+02 2.840e+02 4.256e+02, threshold=4.304e+02, percent-clipped=0.0 2023-09-29 19:04:26,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:04:26,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 19:04:26,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 19:04:26,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:04:26,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:04:26,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:04:27,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:04:27,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:04:27,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:04:27,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:04:27,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 19:04:27,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:04:27,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 19:04:28,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 19:04:28,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 19:04:28,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 19:04:28,347 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 19:04:28,847 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 19:04:29,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:04:29,510 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:04:29,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:04:29,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:04:29,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:04:30,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:04:30,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:04:30,477 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:04:30,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:04:30,665 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:04:30,672 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:04:30,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:04:30,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:04:31,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 19:04:31,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:04:32,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:04:32,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 19:04:32,719 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 19:04:32,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:04:32,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:04:32,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 19:04:33,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 19:04:33,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 19:04:33,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 19:04:34,120 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:04:34,278 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 19:04:34,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 19:04:34,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 19:04:34,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:04:34,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 19:04:34,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 19:04:34,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 19:04:35,125 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 19:04:43,522 INFO [train.py:1039] (2/4) Epoch 14, batch 0, loss[loss=0.1795, simple_loss=0.2643, pruned_loss=0.04737, over 24650.00 frames. ], tot_loss[loss=0.1795, simple_loss=0.2643, pruned_loss=0.04737, over 24650.00 frames. ], batch size: 68, lr: 7.54e-03, grad_scale: 32.0 2023-09-29 19:04:43,523 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-29 19:04:58,062 INFO [train.py:1071] (2/4) Epoch 14, validation: loss=0.2893, simple_loss=0.2709, pruned_loss=0.1538, over 1125622.00 frames. 2023-09-29 19:04:58,063 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21076MB 2023-09-29 19:05:00,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 19:05:01,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:05:03,140 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:05:04,929 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 19:05:09,215 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:05:09,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:05:10,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:05:10,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 19:05:12,534 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=460446.6666666667, ans=0.0 2023-09-29 19:05:13,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 19:05:17,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:05:18,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:05:20,940 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=460446.6666666667, ans=0.125 2023-09-29 19:05:22,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:05:24,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:05:24,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:05:24,146 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:05:26,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 19:05:28,613 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:05:37,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 19:05:37,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:05:40,729 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 19:05:45,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:05:45,327 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:05:46,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:05:50,314 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:05:55,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:06:00,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 19:06:05,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 19:06:05,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:06:05,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:06:07,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:06:07,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:06:10,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 19:06:13,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:06:14,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:06:17,198 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 19:06:20,154 INFO [train.py:1039] (2/4) Epoch 14, batch 50, loss[loss=0.2058, simple_loss=0.2689, pruned_loss=0.07133, over 22768.00 frames. ], tot_loss[loss=0.1924, simple_loss=0.2652, pruned_loss=0.0598, over 1064158.96 frames. ], batch size: 322, lr: 7.54e-03, grad_scale: 32.0 2023-09-29 19:06:20,395 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 19:06:21,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:06:24,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:06:26,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:06:26,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 19:06:28,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 19:06:28,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:06:29,063 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=460713.3333333333, ans=0.0 2023-09-29 19:06:31,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:06:33,967 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:06:35,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:06:38,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 19:06:38,612 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:06:38,848 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=460780.0, ans=0.125 2023-09-29 19:06:45,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 19:06:47,463 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=460780.0, ans=0.0 2023-09-29 19:06:48,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 19:06:50,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 19:06:50,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:06:52,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:06:52,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:06:52,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:06:53,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 19:06:55,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 19:06:55,124 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:07:00,420 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.52 vs. limit=15.0 2023-09-29 19:07:01,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:07:04,871 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 19:07:04,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 19:07:06,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 19:07:09,671 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:07:11,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 19:07:11,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 19:07:11,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:07:12,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 19:07:20,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:07:20,547 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:07:20,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:07:22,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:07:22,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:07:25,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 19:07:26,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 19:07:26,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:07:26,974 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:07:27,244 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=460980.0, ans=0.5 2023-09-29 19:07:29,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:07:29,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:07:30,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 19:07:31,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 19:07:33,143 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 19:07:35,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:07:35,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:07:35,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 19:07:35,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 19:07:37,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:07:38,445 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.448e+02 1.985e+02 2.220e+02 2.670e+02 4.594e+02, threshold=4.441e+02, percent-clipped=1.0 2023-09-29 19:07:38,610 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 19:07:40,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 19:07:40,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:07:43,617 INFO [train.py:1039] (2/4) Epoch 14, batch 100, loss[loss=0.204, simple_loss=0.2656, pruned_loss=0.07124, over 23763.00 frames. ], tot_loss[loss=0.1946, simple_loss=0.2667, pruned_loss=0.06129, over 1868095.18 frames. ], batch size: 212, lr: 7.53e-03, grad_scale: 16.0 2023-09-29 19:07:43,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:07:45,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:07:50,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:07:52,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 19:07:52,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:07:57,119 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:07:58,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:07:58,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 19:07:58,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:07:58,574 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:08:00,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 19:08:00,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 19:08:01,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:08:01,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:08:01,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:08:04,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 19:08:07,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:08:07,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:08:08,708 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 19:08:10,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 19:08:10,555 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=461113.3333333333, ans=0.1 2023-09-29 19:08:14,002 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 19:08:15,217 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 19:08:16,824 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:08:16,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 19:08:19,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 19:08:22,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:08:24,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:08:30,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:08:31,726 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 19:08:32,038 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=461246.6666666667, ans=0.125 2023-09-29 19:08:33,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 19:08:37,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:08:39,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:08:41,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:08:44,655 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:08:47,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:08:48,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:08:50,352 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=461313.3333333333, ans=0.125 2023-09-29 19:08:51,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:08:53,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:08:54,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:08:54,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:08:55,009 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:08:56,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 19:08:56,485 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 19:08:57,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:08:58,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 19:09:00,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:09:00,088 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:09:00,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 19:09:00,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 19:09:01,602 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 19:09:01,612 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:09:03,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:09:05,178 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:09:05,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:09:05,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:09:06,634 INFO [train.py:1039] (2/4) Epoch 14, batch 150, loss[loss=0.2537, simple_loss=0.3077, pruned_loss=0.09984, over 19585.00 frames. ], tot_loss[loss=0.1955, simple_loss=0.267, pruned_loss=0.06193, over 2488251.49 frames. ], batch size: 388, lr: 7.53e-03, grad_scale: 16.0 2023-09-29 19:09:08,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:09:10,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:09:10,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:09:11,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:09:13,273 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=461380.0, ans=0.5 2023-09-29 19:09:14,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:09:14,709 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=461380.0, ans=0.0 2023-09-29 19:09:16,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:09:19,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:09:19,646 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:09:24,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 19:09:24,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 19:09:24,795 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 19:09:27,828 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:09:27,835 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 19:09:29,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:09:29,532 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:09:29,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:09:30,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:09:31,066 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:09:32,665 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-29 19:09:34,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:09:40,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:09:40,310 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=461513.3333333333, ans=0.2 2023-09-29 19:09:43,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 19:09:44,602 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 19:09:47,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:09:47,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:09:47,742 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:09:51,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:09:53,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:09:54,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:09:56,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:09:56,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 19:10:03,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:10:03,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:10:04,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:10:04,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:10:07,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:10:08,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 19:10:10,861 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=461646.6666666667, ans=0.025 2023-09-29 19:10:12,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:10:12,622 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=461646.6666666667, ans=0.2 2023-09-29 19:10:13,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 19:10:13,974 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:10:15,616 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:10:15,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 19:10:15,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:10:15,712 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 19:10:18,133 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=461646.6666666667, ans=0.125 2023-09-29 19:10:21,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:10:21,926 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.52 vs. limit=15.0 2023-09-29 19:10:23,771 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.853e+02 2.115e+02 2.469e+02 4.470e+02, threshold=4.229e+02, percent-clipped=1.0 2023-09-29 19:10:26,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:10:26,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:10:29,879 INFO [train.py:1039] (2/4) Epoch 14, batch 200, loss[loss=0.1945, simple_loss=0.2691, pruned_loss=0.05988, over 24374.00 frames. ], tot_loss[loss=0.197, simple_loss=0.2676, pruned_loss=0.06321, over 2983925.40 frames. ], batch size: 77, lr: 7.53e-03, grad_scale: 16.0 2023-09-29 19:10:30,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 19:10:30,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:10:30,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:10:34,804 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 19:10:36,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 19:10:37,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:10:38,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:10:43,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:10:43,395 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:10:43,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:10:51,714 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=461780.0, ans=0.125 2023-09-29 19:10:53,725 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=461780.0, ans=0.125 2023-09-29 19:11:01,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:11:03,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:11:04,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:11:05,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:11:05,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 19:11:05,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 19:11:06,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:11:08,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 19:11:08,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:11:09,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:11:11,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 19:11:12,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 19:11:12,660 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:11:16,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:11:23,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:11:33,540 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:11:35,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:11:38,167 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.79 vs. limit=15.0 2023-09-29 19:11:42,031 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:11:45,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 19:11:46,060 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.07 vs. limit=6.0 2023-09-29 19:11:46,453 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:11:46,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:11:47,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:11:47,943 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 19:11:49,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 19:11:49,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:11:49,531 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 19:11:50,902 INFO [train.py:1039] (2/4) Epoch 14, batch 250, loss[loss=0.1866, simple_loss=0.2568, pruned_loss=0.05818, over 23571.00 frames. ], tot_loss[loss=0.1959, simple_loss=0.2665, pruned_loss=0.06269, over 3367121.74 frames. ], batch size: 120, lr: 7.53e-03, grad_scale: 16.0 2023-09-29 19:11:52,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:11:54,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:11:56,158 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:11:56,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:11:57,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:11:57,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:11:59,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:12:03,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:12:17,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:12:19,025 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:12:20,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:12:26,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 19:12:28,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 19:12:28,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:12:28,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:12:30,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 19:12:30,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:12:30,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:12:32,209 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:12:35,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 19:12:35,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:12:36,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:12:36,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:12:36,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:12:38,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 19:12:38,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 19:12:38,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:12:42,059 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:12:42,881 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:12:44,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:12:45,061 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.32 vs. limit=15.0 2023-09-29 19:12:45,999 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=462246.6666666667, ans=0.125 2023-09-29 19:12:47,977 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:12:51,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:12:55,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:12:59,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=462313.3333333333, ans=0.1 2023-09-29 19:13:00,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:13:01,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:13:05,585 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 19:13:07,102 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:13:07,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 19:13:10,063 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.958e+02 2.110e+02 2.520e+02 4.183e+02, threshold=4.220e+02, percent-clipped=0.0 2023-09-29 19:13:10,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 19:13:10,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 19:13:11,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:13:11,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 19:13:13,270 INFO [train.py:1039] (2/4) Epoch 14, batch 300, loss[loss=0.2076, simple_loss=0.2663, pruned_loss=0.07442, over 23771.00 frames. ], tot_loss[loss=0.1942, simple_loss=0.2641, pruned_loss=0.06212, over 3654848.16 frames. ], batch size: 164, lr: 7.52e-03, grad_scale: 8.0 2023-09-29 19:13:14,268 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.53 vs. limit=10.0 2023-09-29 19:13:15,278 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=462380.0, ans=0.0 2023-09-29 19:13:19,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:13:19,394 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:13:22,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:13:24,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 19:13:25,026 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=462380.0, ans=0.1 2023-09-29 19:13:26,107 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:13:27,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 19:13:27,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 19:13:27,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:13:31,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 19:13:37,870 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 19:13:37,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 19:13:42,575 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 19:13:43,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:13:45,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:13:47,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:13:47,194 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 19:13:47,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:13:50,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:13:53,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:13:54,405 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:13:58,851 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 19:13:58,858 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 19:13:58,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:14:02,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:14:04,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 19:14:05,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:14:06,129 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=462580.0, ans=0.0 2023-09-29 19:14:08,911 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:14:12,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:14:12,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 19:14:18,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:14:18,513 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 19:14:21,818 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:14:21,988 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:14:22,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 19:14:23,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 19:14:23,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:14:26,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 19:14:27,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:14:27,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:14:29,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:14:29,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:14:30,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:14:34,557 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=462646.6666666667, ans=0.125 2023-09-29 19:14:35,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:14:37,260 INFO [train.py:1039] (2/4) Epoch 14, batch 350, loss[loss=0.1806, simple_loss=0.2472, pruned_loss=0.05704, over 23542.00 frames. ], tot_loss[loss=0.1925, simple_loss=0.2627, pruned_loss=0.06112, over 3886331.57 frames. ], batch size: 285, lr: 7.52e-03, grad_scale: 8.0 2023-09-29 19:14:37,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 19:14:39,635 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:14:46,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:14:50,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:14:50,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:14:53,755 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 19:14:55,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:14:55,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 19:14:58,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:14:58,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 19:15:00,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:15:02,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 19:15:03,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:15:07,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:15:07,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:15:09,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:15:09,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:15:09,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:15:09,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:15:10,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:15:13,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:15:13,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:15:22,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:15:22,995 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:15:23,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:15:24,506 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:15:30,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 19:15:30,562 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:15:33,858 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=462913.3333333333, ans=0.1 2023-09-29 19:15:35,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:15:35,248 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:15:35,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:15:38,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 19:15:39,128 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=462913.3333333333, ans=0.2 2023-09-29 19:15:41,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:15:43,834 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 19:15:44,148 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=462980.0, ans=0.125 2023-09-29 19:15:45,180 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 19:15:46,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:15:48,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:15:48,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 19:15:50,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:15:51,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:15:53,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:15:56,060 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=462980.0, ans=0.09899494936611666 2023-09-29 19:15:56,063 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 19:15:57,095 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.848e+02 2.063e+02 2.317e+02 4.440e+02, threshold=4.125e+02, percent-clipped=1.0 2023-09-29 19:15:57,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:15:57,268 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:15:58,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:16:00,122 INFO [train.py:1039] (2/4) Epoch 14, batch 400, loss[loss=0.1979, simple_loss=0.2617, pruned_loss=0.06702, over 22770.00 frames. ], tot_loss[loss=0.1924, simple_loss=0.2631, pruned_loss=0.06082, over 4062748.06 frames. ], batch size: 322, lr: 7.52e-03, grad_scale: 16.0 2023-09-29 19:16:02,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:16:05,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:16:05,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 19:16:05,223 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:16:06,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:16:08,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:16:08,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:16:10,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:16:13,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:16:17,049 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 19:16:18,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 19:16:18,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:16:20,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 19:16:20,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:16:23,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:16:24,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:16:24,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 19:16:26,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:16:26,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:16:26,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:16:28,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:16:31,381 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 19:16:31,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 19:16:34,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:16:35,143 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=463180.0, ans=0.04949747468305833 2023-09-29 19:16:36,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:16:37,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 19:16:39,246 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 19:16:41,073 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_positive, batch_count=463180.0, ans=0.05 2023-09-29 19:16:42,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:16:45,191 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:16:52,476 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 19:16:55,645 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 19:16:57,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 19:16:58,932 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=463246.6666666667, ans=0.0 2023-09-29 19:17:00,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:17:01,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:17:01,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 19:17:05,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:17:07,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 19:17:08,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:17:11,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:17:13,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 19:17:14,926 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 19:17:15,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 19:17:17,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 19:17:18,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:17:18,383 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=463313.3333333333, ans=0.0 2023-09-29 19:17:19,982 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=463380.0, ans=0.1 2023-09-29 19:17:20,951 INFO [train.py:1039] (2/4) Epoch 14, batch 450, loss[loss=0.1765, simple_loss=0.2625, pruned_loss=0.04525, over 24455.00 frames. ], tot_loss[loss=0.1928, simple_loss=0.2641, pruned_loss=0.06071, over 4200573.20 frames. ], batch size: 63, lr: 7.52e-03, grad_scale: 16.0 2023-09-29 19:17:21,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 19:17:25,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 19:17:25,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:17:25,327 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 19:17:26,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 19:17:26,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:17:28,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:17:28,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:17:28,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 19:17:29,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:17:30,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 19:17:31,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 19:17:33,692 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=463380.0, ans=0.0 2023-09-29 19:17:35,750 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=463380.0, ans=0.125 2023-09-29 19:17:42,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:17:42,076 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:17:42,302 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=463446.6666666667, ans=0.05 2023-09-29 19:17:45,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 19:17:46,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 19:17:48,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 19:17:50,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:17:50,299 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=463446.6666666667, ans=0.2 2023-09-29 19:17:52,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:17:57,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:17:57,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:18:00,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 19:18:00,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 19:18:01,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 19:18:01,994 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:18:03,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:18:03,649 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=463513.3333333333, ans=0.125 2023-09-29 19:18:04,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:18:07,110 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 19:18:07,123 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 19:18:07,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:18:10,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:18:10,621 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 19:18:15,128 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 19:18:15,194 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:18:16,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 19:18:18,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 19:18:21,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:18:24,193 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 19:18:24,256 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 19:18:25,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 19:18:30,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:18:32,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 19:18:33,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 19:18:34,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:18:39,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:18:40,721 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.457e+02 1.869e+02 2.102e+02 2.617e+02 3.390e+02, threshold=4.204e+02, percent-clipped=0.0 2023-09-29 19:18:40,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:18:43,120 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:18:43,168 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 19:18:44,451 INFO [train.py:1039] (2/4) Epoch 14, batch 500, loss[loss=0.1938, simple_loss=0.2596, pruned_loss=0.06399, over 23888.00 frames. ], tot_loss[loss=0.1935, simple_loss=0.2654, pruned_loss=0.06078, over 4314344.24 frames. ], batch size: 195, lr: 7.51e-03, grad_scale: 16.0 2023-09-29 19:18:45,334 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten.whitening_limit, batch_count=463713.3333333333, ans=22.5 2023-09-29 19:18:46,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:18:48,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:18:48,412 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:18:48,437 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 19:18:51,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 19:18:51,324 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:18:54,698 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=463713.3333333333, ans=0.125 2023-09-29 19:18:55,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 19:18:59,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 19:19:00,683 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 19:19:03,641 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:19:03,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:19:05,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:19:15,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:19:16,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 19:19:16,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 19:19:17,867 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.72 vs. limit=15.0 2023-09-29 19:19:18,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:19:18,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 19:19:18,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 19:19:22,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:19:23,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:19:23,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:19:25,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:19:26,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 19:19:29,642 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 19:19:32,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:19:34,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:19:34,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:19:34,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:19:36,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 19:19:39,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 19:19:42,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:19:44,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:19:47,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:19:50,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:19:54,393 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=463980.0, ans=0.125 2023-09-29 19:19:57,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:19:57,623 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=463980.0, ans=0.2 2023-09-29 19:19:59,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 19:19:59,726 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:19:59,752 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:20:04,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 19:20:05,327 INFO [train.py:1039] (2/4) Epoch 14, batch 550, loss[loss=0.218, simple_loss=0.2791, pruned_loss=0.07847, over 23440.00 frames. ], tot_loss[loss=0.1948, simple_loss=0.2669, pruned_loss=0.06138, over 4414487.66 frames. ], batch size: 285, lr: 7.51e-03, grad_scale: 16.0 2023-09-29 19:20:05,473 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 19:20:07,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:20:12,124 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=464046.6666666667, ans=0.0 2023-09-29 19:20:13,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 19:20:14,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 19:20:14,893 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:20:14,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 19:20:15,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:20:17,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:20:17,185 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:20:19,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:20:19,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:20:20,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:20:21,029 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=464113.3333333333, ans=0.125 2023-09-29 19:20:22,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:20:23,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 19:20:23,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:20:28,656 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:20:28,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:20:30,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:20:30,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:20:37,341 WARNING [train.py:1197] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 19:20:38,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 19:20:40,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:20:45,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:20:45,120 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:20:46,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 19:20:50,457 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:20:50,466 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 19:20:52,490 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:20:52,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 19:20:52,934 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=464180.0, ans=0.0 2023-09-29 19:20:55,588 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:20:57,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 19:20:57,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 19:20:57,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:20:58,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 19:21:00,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 19:21:01,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:21:01,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:21:01,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:21:01,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:21:05,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:21:07,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:21:10,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:21:10,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:21:12,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 19:21:13,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 19:21:15,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:21:15,287 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:21:16,810 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:21:18,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 19:21:18,428 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 19:21:25,256 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 1.884e+02 2.077e+02 2.403e+02 3.738e+02, threshold=4.154e+02, percent-clipped=0.0 2023-09-29 19:21:25,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 19:21:28,503 INFO [train.py:1039] (2/4) Epoch 14, batch 600, loss[loss=0.2013, simple_loss=0.2722, pruned_loss=0.06524, over 23361.00 frames. ], tot_loss[loss=0.1949, simple_loss=0.267, pruned_loss=0.0614, over 4483903.98 frames. ], batch size: 119, lr: 7.51e-03, grad_scale: 16.0 2023-09-29 19:21:30,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 19:21:31,600 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:21:31,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 19:21:31,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:21:33,483 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=464380.0, ans=0.125 2023-09-29 19:21:40,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:21:40,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 19:21:42,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 19:21:45,258 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 19:21:46,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:21:49,904 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:21:51,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 19:21:51,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:21:58,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 19:22:01,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:22:01,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:22:02,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:22:02,192 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=464513.3333333333, ans=0.125 2023-09-29 19:22:08,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:22:08,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:22:10,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:22:10,466 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=464513.3333333333, ans=0.1 2023-09-29 19:22:14,055 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys.whitening_limit, batch_count=464513.3333333333, ans=6.0 2023-09-29 19:22:16,946 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:22:21,453 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:22:21,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:22:21,470 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:22:23,257 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=464580.0, ans=0.5 2023-09-29 19:22:33,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 19:22:37,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 19:22:37,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:22:42,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 19:22:42,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:22:44,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 19:22:44,898 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:22:46,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:22:46,660 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=464646.6666666667, ans=0.0 2023-09-29 19:22:46,737 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=464646.6666666667, ans=0.1 2023-09-29 19:22:50,973 INFO [train.py:1039] (2/4) Epoch 14, batch 650, loss[loss=0.1853, simple_loss=0.243, pruned_loss=0.06383, over 23553.00 frames. ], tot_loss[loss=0.1938, simple_loss=0.2654, pruned_loss=0.06104, over 4526965.12 frames. ], batch size: 256, lr: 7.50e-03, grad_scale: 16.0 2023-09-29 19:22:51,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 19:22:53,452 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 19:22:56,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:22:56,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:22:58,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:22:59,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 19:23:00,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:23:04,377 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=13.29 vs. limit=22.5 2023-09-29 19:23:06,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:23:06,863 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:23:09,382 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.94 vs. limit=10.0 2023-09-29 19:23:10,114 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:23:13,176 WARNING [train.py:1197] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 19:23:15,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:23:15,341 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:23:20,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:23:20,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 19:23:23,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:23:23,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:23:23,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 19:23:25,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:23:26,623 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:23:29,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:23:29,589 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 19:23:29,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:23:29,624 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:23:33,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:23:34,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:23:34,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:23:34,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:23:36,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 19:23:36,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:23:38,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:23:39,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 19:23:39,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:23:41,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 19:23:41,694 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 19:23:44,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 19:23:44,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:23:44,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:23:44,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:23:44,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:23:48,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:23:54,854 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:23:54,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:23:56,465 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:24:00,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:24:00,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 19:24:02,187 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:24:04,562 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.84 vs. limit=15.0 2023-09-29 19:24:09,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 19:24:09,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:24:09,176 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:24:10,965 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.710e+02 2.056e+02 2.592e+02 3.186e+02 5.109e+02, threshold=5.184e+02, percent-clipped=6.0 2023-09-29 19:24:11,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:24:14,009 INFO [train.py:1039] (2/4) Epoch 14, batch 700, loss[loss=0.1887, simple_loss=0.2603, pruned_loss=0.05856, over 23352.00 frames. ], tot_loss[loss=0.1925, simple_loss=0.2639, pruned_loss=0.06052, over 4566956.28 frames. ], batch size: 105, lr: 7.50e-03, grad_scale: 16.0 2023-09-29 19:24:17,328 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 19:24:17,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 19:24:20,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 19:24:22,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:24:23,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:24:25,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 19:24:30,870 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:24:33,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:24:35,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:24:38,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 19:24:38,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:24:38,863 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=465113.3333333333, ans=0.125 2023-09-29 19:24:40,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:24:43,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 19:24:43,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:24:44,179 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=465113.3333333333, ans=0.09899494936611666 2023-09-29 19:24:47,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 19:24:50,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 19:24:53,865 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 19:24:53,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:24:55,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:25:00,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:25:00,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 19:25:06,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:25:06,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 19:25:06,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 19:25:10,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:25:12,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:25:15,201 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.63 vs. limit=15.0 2023-09-29 19:25:17,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:25:17,583 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=465246.6666666667, ans=0.1 2023-09-29 19:25:24,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:25:24,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 19:25:27,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 19:25:27,726 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 19:25:29,503 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=465313.3333333333, ans=0.125 2023-09-29 19:25:30,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:25:32,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:25:32,422 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:25:34,038 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:25:34,048 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 19:25:37,464 INFO [train.py:1039] (2/4) Epoch 14, batch 750, loss[loss=0.1837, simple_loss=0.265, pruned_loss=0.05118, over 24335.00 frames. ], tot_loss[loss=0.1909, simple_loss=0.2625, pruned_loss=0.05969, over 4588961.45 frames. ], batch size: 61, lr: 7.50e-03, grad_scale: 8.0 2023-09-29 19:25:39,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 19:25:39,171 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 19:25:39,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 19:25:42,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 19:25:42,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 19:25:42,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:25:44,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 19:25:45,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:25:45,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 19:25:47,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:25:49,611 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=465380.0, ans=0.125 2023-09-29 19:25:50,905 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:25:50,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 19:25:50,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:25:52,664 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:25:52,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 19:25:56,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:25:57,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:25:59,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:25:59,416 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 19:26:00,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:26:02,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:26:04,111 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:26:05,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 19:26:05,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 19:26:05,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:26:09,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 19:26:09,640 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 19:26:11,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 19:26:11,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:26:12,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 19:26:14,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:26:14,991 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=465513.3333333333, ans=0.0 2023-09-29 19:26:21,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 19:26:21,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:26:21,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 19:26:24,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:26:26,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:26:26,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 19:26:28,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:26:28,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 19:26:28,395 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=465580.0, ans=0.0 2023-09-29 19:26:29,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:26:31,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:26:32,567 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.39 vs. limit=15.0 2023-09-29 19:26:32,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 19:26:33,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:26:39,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:26:39,372 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=465580.0, ans=0.125 2023-09-29 19:26:40,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:26:41,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:26:44,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 19:26:47,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 19:26:47,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:26:49,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:26:51,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:26:52,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:26:54,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:26:54,989 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 19:26:59,101 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 2.067e+02 2.400e+02 2.919e+02 4.074e+02, threshold=4.801e+02, percent-clipped=0.0 2023-09-29 19:27:01,196 INFO [train.py:1039] (2/4) Epoch 14, batch 800, loss[loss=0.1748, simple_loss=0.2503, pruned_loss=0.04966, over 24343.00 frames. ], tot_loss[loss=0.1915, simple_loss=0.2631, pruned_loss=0.05998, over 4602680.50 frames. ], batch size: 61, lr: 7.50e-03, grad_scale: 16.0 2023-09-29 19:27:02,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:27:02,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:27:04,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:27:04,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:27:05,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:27:06,040 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:27:07,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:27:10,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:27:12,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 19:27:16,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 19:27:16,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:27:18,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:27:18,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:27:18,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:27:21,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 19:27:21,048 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:27:21,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 19:27:24,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:27:25,820 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:27:27,627 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=465780.0, ans=0.125 2023-09-29 19:27:29,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:27:29,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:27:32,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:27:32,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:27:35,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:27:37,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:27:37,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 19:27:38,848 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 19:27:40,268 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 19:27:40,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 19:27:40,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:27:43,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:27:43,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:27:46,391 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 19:27:46,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 19:27:48,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 19:27:50,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 19:27:54,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:27:57,278 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:28:00,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 19:28:00,562 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:28:03,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 19:28:10,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 19:28:14,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:28:14,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 19:28:16,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:28:17,649 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:28:17,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 19:28:19,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:28:20,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:28:20,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:28:22,318 INFO [train.py:1039] (2/4) Epoch 14, batch 850, loss[loss=0.1877, simple_loss=0.2756, pruned_loss=0.04987, over 24672.00 frames. ], tot_loss[loss=0.1924, simple_loss=0.2641, pruned_loss=0.06032, over 4629885.56 frames. ], batch size: 73, lr: 7.49e-03, grad_scale: 16.0 2023-09-29 19:28:22,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 19:28:23,941 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:28:26,146 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 19:28:26,225 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 19:28:26,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 19:28:26,381 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=466046.6666666667, ans=0.0 2023-09-29 19:28:27,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 19:28:27,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:28:30,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:28:30,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:28:30,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 19:28:37,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:28:37,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:28:39,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 19:28:41,804 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.64 vs. limit=6.0 2023-09-29 19:28:43,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 19:28:46,433 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:28:47,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 19:28:52,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 19:28:52,614 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 19:28:55,742 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 19:28:55,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:28:55,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:28:55,804 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 19:28:59,445 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:29:01,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:29:01,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 19:29:03,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:29:04,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:29:06,358 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:29:06,391 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 19:29:09,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:29:10,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 19:29:10,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 19:29:13,640 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=466246.6666666667, ans=0.1 2023-09-29 19:29:16,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:29:16,172 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:29:16,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:29:17,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:29:19,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:29:22,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:29:24,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:29:27,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 19:29:27,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:29:28,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 19:29:38,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 19:29:39,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:29:39,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 19:29:39,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:29:40,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:29:42,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 19:29:43,736 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.955e+02 2.156e+02 2.451e+02 4.149e+02, threshold=4.312e+02, percent-clipped=0.0 2023-09-29 19:29:45,189 INFO [train.py:1039] (2/4) Epoch 14, batch 900, loss[loss=0.1679, simple_loss=0.2461, pruned_loss=0.04486, over 24600.00 frames. ], tot_loss[loss=0.1922, simple_loss=0.2646, pruned_loss=0.05989, over 4659708.84 frames. ], batch size: 60, lr: 7.49e-03, grad_scale: 16.0 2023-09-29 19:29:46,948 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:29:49,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:29:50,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 19:29:53,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:29:53,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 19:29:55,978 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 19:29:57,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:29:57,528 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:29:57,597 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 19:29:59,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:30:02,802 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=466446.6666666667, ans=0.125 2023-09-29 19:30:04,257 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=466446.6666666667, ans=0.0 2023-09-29 19:30:12,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:30:12,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:30:12,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 19:30:17,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:30:22,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 19:30:25,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:30:25,489 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=466513.3333333333, ans=0.0 2023-09-29 19:30:30,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 19:30:31,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 19:30:33,404 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 19:30:34,878 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 19:30:41,282 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 19:30:41,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:30:42,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 19:30:49,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:30:49,573 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:30:49,765 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=466646.6666666667, ans=0.0 2023-09-29 19:30:53,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 19:30:53,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:30:56,360 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 19:30:57,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:30:57,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:30:59,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:30:59,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:31:04,418 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 19:31:04,471 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 19:31:06,563 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 19:31:06,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 19:31:08,193 INFO [train.py:1039] (2/4) Epoch 14, batch 950, loss[loss=0.173, simple_loss=0.2519, pruned_loss=0.04705, over 16523.00 frames. ], tot_loss[loss=0.1917, simple_loss=0.2645, pruned_loss=0.05945, over 4675580.50 frames. ], batch size: 36, lr: 7.49e-03, grad_scale: 16.0 2023-09-29 19:31:08,491 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:31:12,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 19:31:19,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:31:20,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:31:22,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:31:22,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 19:31:25,954 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 19:31:31,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:31:33,114 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:31:33,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:31:34,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:31:34,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 19:31:34,904 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 19:31:36,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:31:38,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 19:31:39,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:31:44,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:31:44,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:31:44,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:31:44,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 19:31:45,157 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=466846.6666666667, ans=0.0 2023-09-29 19:31:46,402 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 19:31:46,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:31:46,875 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=466846.6666666667, ans=0.125 2023-09-29 19:31:50,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:31:56,722 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:31:56,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:31:58,638 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 19:32:02,441 WARNING [train.py:1197] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 19:32:02,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 19:32:02,654 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=466913.3333333333, ans=0.0 2023-09-29 19:32:03,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:32:04,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:32:04,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:32:10,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 19:32:10,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:32:11,741 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:32:13,220 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:32:13,252 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 19:32:13,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:32:13,282 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:32:14,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 19:32:17,385 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=466980.0, ans=0.2 2023-09-29 19:32:18,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:32:23,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:32:28,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:32:30,246 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.480e+02 1.850e+02 2.095e+02 2.342e+02 3.294e+02, threshold=4.189e+02, percent-clipped=0.0 2023-09-29 19:32:30,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 19:32:30,410 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 19:32:31,886 INFO [train.py:1039] (2/4) Epoch 14, batch 1000, loss[loss=0.2015, simple_loss=0.2806, pruned_loss=0.06123, over 23144.00 frames. ], tot_loss[loss=0.1908, simple_loss=0.2637, pruned_loss=0.05892, over 4679527.87 frames. ], batch size: 93, lr: 7.49e-03, grad_scale: 16.0 2023-09-29 19:32:32,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:32:37,369 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 19:32:37,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:32:39,386 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=467046.6666666667, ans=0.125 2023-09-29 19:32:43,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:32:45,130 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 19:32:45,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 19:32:49,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:32:49,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:32:51,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:32:54,924 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 19:32:59,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 19:33:00,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 19:33:02,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:33:05,012 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 19:33:06,572 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 19:33:06,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 19:33:07,276 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.62 vs. limit=15.0 2023-09-29 19:33:08,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:33:10,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:33:18,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:33:18,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:33:18,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:33:19,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:33:19,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 19:33:19,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:33:21,363 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:33:21,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:33:22,441 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.60 vs. limit=15.0 2023-09-29 19:33:22,925 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 19:33:24,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 19:33:25,037 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=467246.6666666667, ans=0.125 2023-09-29 19:33:26,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 19:33:27,018 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=467246.6666666667, ans=0.125 2023-09-29 19:33:29,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 19:33:33,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:33:38,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:33:38,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:33:38,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:33:39,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:33:40,006 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=467313.3333333333, ans=0.125 2023-09-29 19:33:43,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 19:33:43,477 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:33:44,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 19:33:45,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 19:33:46,646 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:33:46,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:33:48,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:33:51,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 19:33:51,512 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:33:51,765 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=467313.3333333333, ans=0.2 2023-09-29 19:33:54,426 INFO [train.py:1039] (2/4) Epoch 14, batch 1050, loss[loss=0.1904, simple_loss=0.2716, pruned_loss=0.05454, over 24655.00 frames. ], tot_loss[loss=0.1897, simple_loss=0.2627, pruned_loss=0.05837, over 4689565.93 frames. ], batch size: 68, lr: 7.48e-03, grad_scale: 16.0 2023-09-29 19:33:56,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:33:56,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:33:59,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 19:34:01,162 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:34:04,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:34:06,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 19:34:08,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 19:34:10,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:34:10,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:34:10,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 19:34:12,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:34:12,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 19:34:13,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:34:15,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 19:34:17,013 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:34:17,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 19:34:17,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 19:34:20,091 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=467446.6666666667, ans=0.025 2023-09-29 19:34:25,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:34:25,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 19:34:27,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:34:29,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 19:34:29,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 19:34:30,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:34:32,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 19:34:33,165 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=467513.3333333333, ans=0.0 2023-09-29 19:34:34,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 19:34:36,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:34:36,371 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.min_positive, batch_count=467513.3333333333, ans=0.05 2023-09-29 19:34:38,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 19:34:39,118 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.67 vs. limit=15.0 2023-09-29 19:34:41,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 19:34:44,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:34:45,527 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:34:47,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:34:49,307 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.94 vs. limit=6.0 2023-09-29 19:34:51,700 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 19:34:53,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 19:34:53,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 19:34:54,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:34:55,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:34:56,535 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 19:35:01,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:35:03,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:35:03,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:35:03,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:35:04,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:35:08,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:35:08,165 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 19:35:09,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:35:09,760 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 19:35:09,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 19:35:11,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:35:11,514 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=467646.6666666667, ans=0.0 2023-09-29 19:35:15,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:35:17,490 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.954e+02 2.210e+02 2.556e+02 3.990e+02, threshold=4.421e+02, percent-clipped=0.0 2023-09-29 19:35:19,002 INFO [train.py:1039] (2/4) Epoch 14, batch 1100, loss[loss=0.1952, simple_loss=0.2603, pruned_loss=0.06509, over 23758.00 frames. ], tot_loss[loss=0.1893, simple_loss=0.2617, pruned_loss=0.05842, over 4686035.09 frames. ], batch size: 232, lr: 7.48e-03, grad_scale: 16.0 2023-09-29 19:35:23,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:35:26,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 19:35:27,201 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=467713.3333333333, ans=0.125 2023-09-29 19:35:28,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:35:28,610 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:35:28,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 19:35:30,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:35:31,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:35:33,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:35:37,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:35:38,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 19:35:38,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 19:35:40,028 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:35:40,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:35:44,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:35:46,886 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 19:35:52,006 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:35:54,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 19:35:54,492 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=467846.6666666667, ans=0.1 2023-09-29 19:35:55,724 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 19:35:56,386 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.48 vs. limit=6.0 2023-09-29 19:35:57,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:36:00,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:36:00,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 19:36:01,728 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:36:03,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 19:36:03,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:36:03,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:36:03,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:36:03,552 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=467846.6666666667, ans=0.125 2023-09-29 19:36:04,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:36:04,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 19:36:12,851 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:36:12,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 19:36:14,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:36:15,101 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.36 vs. limit=10.0 2023-09-29 19:36:19,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 19:36:22,749 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 19:36:22,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 19:36:24,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:36:24,599 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=467980.0, ans=0.0 2023-09-29 19:36:28,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:36:28,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:36:29,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 19:36:29,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:36:31,241 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:36:31,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 19:36:31,437 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:36:31,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 19:36:34,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:36:34,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:36:35,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 19:36:39,087 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=468046.6666666667, ans=0.2 2023-09-29 19:36:40,201 INFO [train.py:1039] (2/4) Epoch 14, batch 1150, loss[loss=0.2024, simple_loss=0.2826, pruned_loss=0.06109, over 24554.00 frames. ], tot_loss[loss=0.1905, simple_loss=0.2627, pruned_loss=0.05912, over 4686621.58 frames. ], batch size: 71, lr: 7.48e-03, grad_scale: 16.0 2023-09-29 19:36:41,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:36:45,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:36:46,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:36:48,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:36:48,297 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 19:36:48,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:36:51,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 19:36:52,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:36:52,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 19:36:58,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 19:37:02,365 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:37:06,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:37:07,012 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:37:08,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 19:37:08,511 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 19:37:08,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:37:11,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 19:37:11,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:37:13,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:37:17,680 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=468180.0, ans=0.04949747468305833 2023-09-29 19:37:23,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:37:31,441 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:37:31,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 19:37:31,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:37:33,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:37:37,501 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 19:37:38,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:37:46,573 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 19:37:50,243 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:37:51,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:37:51,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 19:37:51,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:37:55,052 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=468313.3333333333, ans=0.125 2023-09-29 19:37:56,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:37:58,556 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.87 vs. limit=15.0 2023-09-29 19:38:00,058 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=468313.3333333333, ans=0.125 2023-09-29 19:38:01,586 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.871e+02 2.319e+02 2.937e+02 5.340e+02, threshold=4.639e+02, percent-clipped=1.0 2023-09-29 19:38:01,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:38:01,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 19:38:02,000 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 19:38:03,178 INFO [train.py:1039] (2/4) Epoch 14, batch 1200, loss[loss=0.1856, simple_loss=0.2697, pruned_loss=0.0508, over 24446.00 frames. ], tot_loss[loss=0.1914, simple_loss=0.2639, pruned_loss=0.05946, over 4694293.22 frames. ], batch size: 69, lr: 7.48e-03, grad_scale: 32.0 2023-09-29 19:38:03,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:38:03,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:38:03,572 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 19:38:04,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:38:07,126 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=468380.0, ans=0.125 2023-09-29 19:38:08,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:38:10,012 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 19:38:11,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:38:13,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:38:14,986 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 19:38:16,667 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 19:38:19,912 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=468446.6666666667, ans=0.2 2023-09-29 19:38:20,049 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=468446.6666666667, ans=10.0 2023-09-29 19:38:21,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:38:22,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:38:26,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:38:27,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:38:27,934 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 19:38:28,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:38:31,332 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=468446.6666666667, ans=0.125 2023-09-29 19:38:32,676 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=468446.6666666667, ans=0.0 2023-09-29 19:38:37,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 19:38:37,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:38:37,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 19:38:39,199 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:38:44,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 19:38:47,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 19:38:47,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:38:49,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:38:49,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:38:50,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 19:38:53,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:38:53,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:38:53,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:38:55,343 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 19:38:56,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 19:38:56,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:38:56,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 19:39:00,506 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:39:00,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:39:02,351 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 19:39:03,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:39:06,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 19:39:10,660 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 19:39:14,346 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:39:17,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:39:18,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:39:21,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:39:23,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 19:39:25,443 INFO [train.py:1039] (2/4) Epoch 14, batch 1250, loss[loss=0.1773, simple_loss=0.2514, pruned_loss=0.05157, over 24330.00 frames. ], tot_loss[loss=0.1919, simple_loss=0.2647, pruned_loss=0.05957, over 4687883.02 frames. ], batch size: 61, lr: 7.47e-03, grad_scale: 16.0 2023-09-29 19:39:28,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:39:29,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:39:30,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 19:39:33,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:39:34,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 19:39:39,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 19:39:39,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:39:40,245 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=468780.0, ans=0.125 2023-09-29 19:39:41,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 19:39:41,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:39:44,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 19:39:48,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 19:39:49,342 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 19:39:49,350 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:39:50,859 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:39:50,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:39:54,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:39:56,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 19:40:01,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 19:40:02,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:40:05,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:40:06,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 19:40:06,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:40:06,481 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=468846.6666666667, ans=0.125 2023-09-29 19:40:07,594 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 19:40:07,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:40:07,642 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:40:12,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:40:14,596 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=468913.3333333333, ans=0.125 2023-09-29 19:40:15,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:40:16,533 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.71 vs. limit=15.0 2023-09-29 19:40:17,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:40:19,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 19:40:19,024 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 19:40:19,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 19:40:22,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:40:24,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 19:40:24,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:40:29,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 19:40:29,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:40:32,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 19:40:32,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 19:40:34,143 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:40:34,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 19:40:34,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:40:37,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 19:40:38,873 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:40:40,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:40:40,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:40:45,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 19:40:45,408 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=469046.6666666667, ans=10.0 2023-09-29 19:40:46,415 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.815e+02 2.067e+02 2.359e+02 2.982e+02, threshold=4.134e+02, percent-clipped=0.0 2023-09-29 19:40:46,459 INFO [train.py:1039] (2/4) Epoch 14, batch 1300, loss[loss=0.2459, simple_loss=0.3004, pruned_loss=0.09575, over 19701.00 frames. ], tot_loss[loss=0.1911, simple_loss=0.2642, pruned_loss=0.059, over 4697276.53 frames. ], batch size: 388, lr: 7.47e-03, grad_scale: 16.0 2023-09-29 19:40:48,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:40:48,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 19:40:54,722 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:40:56,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 19:40:58,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:40:58,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:41:00,086 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:41:01,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 19:41:05,513 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=469113.3333333333, ans=0.125 2023-09-29 19:41:06,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:41:08,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 19:41:09,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 19:41:10,086 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=469113.3333333333, ans=0.125 2023-09-29 19:41:14,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 19:41:18,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:41:19,060 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:41:22,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:41:24,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:41:24,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 19:41:25,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 19:41:27,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 19:41:30,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:41:31,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 19:41:34,049 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 19:41:35,460 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 19:41:37,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:41:40,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:41:40,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 19:41:40,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:41:40,850 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 19:41:43,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:41:47,190 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:41:47,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:41:50,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 19:41:50,961 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 19:41:51,273 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=469313.3333333333, ans=0.125 2023-09-29 19:41:52,565 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 19:41:54,493 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=469313.3333333333, ans=0.125 2023-09-29 19:41:57,349 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:42:00,833 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 19:42:02,362 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:42:09,276 INFO [train.py:1039] (2/4) Epoch 14, batch 1350, loss[loss=0.1769, simple_loss=0.2614, pruned_loss=0.04624, over 24475.00 frames. ], tot_loss[loss=0.1913, simple_loss=0.2641, pruned_loss=0.05923, over 4702548.18 frames. ], batch size: 66, lr: 7.47e-03, grad_scale: 8.0 2023-09-29 19:42:09,760 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=469380.0, ans=0.125 2023-09-29 19:42:11,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 19:42:14,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:42:14,983 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=469380.0, ans=10.0 2023-09-29 19:42:16,521 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=469380.0, ans=0.0 2023-09-29 19:42:17,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:42:20,801 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:42:20,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:42:22,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:42:22,712 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=469380.0, ans=0.2 2023-09-29 19:42:22,739 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=469380.0, ans=0.125 2023-09-29 19:42:24,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:42:27,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:42:29,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 19:42:30,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 19:42:30,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:42:35,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 19:42:35,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:42:36,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:42:36,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 19:42:39,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 19:42:41,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 19:42:41,592 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=469513.3333333333, ans=0.125 2023-09-29 19:42:43,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:42:43,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 19:42:43,877 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=469513.3333333333, ans=0.2 2023-09-29 19:42:49,869 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=469513.3333333333, ans=0.125 2023-09-29 19:42:55,834 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=469513.3333333333, ans=0.125 2023-09-29 19:42:56,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:42:57,362 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=469580.0, ans=0.0 2023-09-29 19:43:07,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:43:07,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:43:07,610 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 19:43:11,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:43:11,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 19:43:11,553 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=469580.0, ans=0.125 2023-09-29 19:43:12,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 19:43:12,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:43:14,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:43:17,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 19:43:19,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:43:21,449 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=469646.6666666667, ans=0.0 2023-09-29 19:43:24,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 19:43:25,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 19:43:26,185 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=469646.6666666667, ans=0.0 2023-09-29 19:43:31,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 19:43:32,804 INFO [train.py:1039] (2/4) Epoch 14, batch 1400, loss[loss=0.1788, simple_loss=0.2209, pruned_loss=0.06837, over 19129.00 frames. ], tot_loss[loss=0.1902, simple_loss=0.262, pruned_loss=0.05922, over 4676283.02 frames. ], batch size: 388, lr: 7.46e-03, grad_scale: 8.0 2023-09-29 19:43:32,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:43:34,252 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.861e+02 2.134e+02 2.363e+02 3.336e+02, threshold=4.269e+02, percent-clipped=0.0 2023-09-29 19:43:36,531 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:43:37,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:43:38,432 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=469713.3333333333, ans=0.125 2023-09-29 19:43:39,956 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=469713.3333333333, ans=0.0 2023-09-29 19:43:43,346 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 19:43:44,984 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 19:43:51,269 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=469780.0, ans=0.035 2023-09-29 19:43:53,600 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=469780.0, ans=0.125 2023-09-29 19:43:54,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:43:56,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:43:57,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:43:58,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 19:44:03,902 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:44:05,471 WARNING [train.py:1197] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 19:44:16,419 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:44:16,515 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:44:21,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 19:44:21,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:44:21,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:44:22,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:44:24,048 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:44:24,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:44:25,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:44:25,773 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:44:27,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 19:44:27,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:44:31,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:44:31,874 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=6.84 vs. limit=12.0 2023-09-29 19:44:34,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:44:41,113 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 19:44:42,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 19:44:44,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:44:46,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 19:44:48,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:44:49,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:44:52,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 19:44:53,065 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:44:53,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:44:54,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 19:44:56,253 INFO [train.py:1039] (2/4) Epoch 14, batch 1450, loss[loss=0.1885, simple_loss=0.2655, pruned_loss=0.05576, over 23697.00 frames. ], tot_loss[loss=0.1902, simple_loss=0.2623, pruned_loss=0.05905, over 4683843.94 frames. ], batch size: 149, lr: 7.46e-03, grad_scale: 8.0 2023-09-29 19:44:59,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:45:00,970 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 19:45:01,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:45:01,194 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 19:45:01,396 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=470046.6666666667, ans=0.0 2023-09-29 19:45:02,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 19:45:04,626 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=470046.6666666667, ans=0.2 2023-09-29 19:45:05,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 19:45:05,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:45:07,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:45:07,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 19:45:10,164 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:45:10,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 19:45:12,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 19:45:12,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:45:13,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:45:15,266 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:45:16,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:45:22,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:45:22,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:45:24,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:45:24,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:45:25,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:45:27,914 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:45:27,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:45:28,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:45:28,281 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=470180.0, ans=0.125 2023-09-29 19:45:28,836 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.74 vs. limit=15.0 2023-09-29 19:45:31,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 19:45:34,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:45:37,560 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 19:45:40,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:45:41,104 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.07 vs. limit=15.0 2023-09-29 19:45:41,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 19:45:43,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:45:43,795 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=470246.6666666667, ans=0.125 2023-09-29 19:45:44,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 19:45:50,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:45:51,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 19:45:51,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 19:45:54,112 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:45:55,893 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=470246.6666666667, ans=0.125 2023-09-29 19:45:57,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:45:59,387 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:46:00,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 19:46:04,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 19:46:05,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 19:46:07,453 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:46:09,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 19:46:09,330 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=470313.3333333333, ans=0.125 2023-09-29 19:46:14,096 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=470313.3333333333, ans=0.0 2023-09-29 19:46:18,268 INFO [train.py:1039] (2/4) Epoch 14, batch 1500, loss[loss=0.1992, simple_loss=0.2639, pruned_loss=0.06722, over 23847.00 frames. ], tot_loss[loss=0.1907, simple_loss=0.2629, pruned_loss=0.05932, over 4695834.88 frames. ], batch size: 195, lr: 7.46e-03, grad_scale: 8.0 2023-09-29 19:46:19,669 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.877e+02 2.089e+02 2.456e+02 3.474e+02, threshold=4.179e+02, percent-clipped=0.0 2023-09-29 19:46:21,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 19:46:21,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:46:21,402 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:46:22,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:46:23,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:46:25,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:46:26,582 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 19:46:28,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 19:46:28,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 19:46:28,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:46:30,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:46:30,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:46:32,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:46:38,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:46:39,470 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 19:46:39,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 19:46:39,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:46:41,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:46:44,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 19:46:48,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 19:46:50,347 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:46:50,933 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.53 vs. limit=10.0 2023-09-29 19:46:51,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 19:46:53,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 19:46:56,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:46:57,803 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:46:57,845 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:46:58,181 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=470513.3333333333, ans=0.125 2023-09-29 19:46:59,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 19:47:00,033 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:47:00,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:47:00,888 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.00 vs. limit=15.0 2023-09-29 19:47:01,507 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 19:47:01,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:47:08,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:47:08,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 19:47:10,017 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=470580.0, ans=0.0 2023-09-29 19:47:15,393 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:47:16,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 19:47:21,551 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 19:47:22,375 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.84 vs. limit=6.0 2023-09-29 19:47:22,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:47:22,961 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 19:47:23,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:47:24,855 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=470646.6666666667, ans=0.125 2023-09-29 19:47:25,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:47:26,046 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 19:47:26,372 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=470646.6666666667, ans=0.125 2023-09-29 19:47:27,539 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 19:47:29,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 19:47:31,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:47:32,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:47:32,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:47:34,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:47:34,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:47:36,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:47:37,686 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 19:47:38,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 19:47:39,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:47:40,988 INFO [train.py:1039] (2/4) Epoch 14, batch 1550, loss[loss=0.2053, simple_loss=0.276, pruned_loss=0.06736, over 23205.00 frames. ], tot_loss[loss=0.1909, simple_loss=0.2632, pruned_loss=0.05931, over 4714474.60 frames. ], batch size: 93, lr: 7.46e-03, grad_scale: 8.0 2023-09-29 19:47:41,111 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 19:47:42,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 19:47:44,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:47:46,424 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:47:46,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:47:46,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:47:48,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:47:48,251 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=470713.3333333333, ans=0.125 2023-09-29 19:47:48,334 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=470713.3333333333, ans=0.0 2023-09-29 19:47:50,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:47:53,209 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 19:47:53,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:47:53,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:47:54,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 19:47:58,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 19:47:58,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 19:48:00,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:48:00,810 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 19:48:02,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 19:48:02,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 19:48:02,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:48:02,596 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=470780.0, ans=0.125 2023-09-29 19:48:04,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:48:08,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:48:10,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 19:48:10,882 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 19:48:20,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:48:26,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:48:26,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:48:26,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:48:27,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 19:48:28,523 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.03 vs. limit=22.5 2023-09-29 19:48:30,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 19:48:33,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:48:35,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:48:38,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:48:38,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:48:39,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 19:48:39,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 19:48:41,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:48:41,681 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=470913.3333333333, ans=0.0 2023-09-29 19:48:42,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:48:44,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 19:48:44,365 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 19:48:47,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:48:51,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 19:48:55,113 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=470980.0, ans=0.2 2023-09-29 19:48:57,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:48:57,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:48:59,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 19:49:01,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 19:49:02,598 INFO [train.py:1039] (2/4) Epoch 14, batch 1600, loss[loss=0.1874, simple_loss=0.2657, pruned_loss=0.05456, over 23932.00 frames. ], tot_loss[loss=0.1919, simple_loss=0.2641, pruned_loss=0.05982, over 4717168.02 frames. ], batch size: 86, lr: 7.45e-03, grad_scale: 16.0 2023-09-29 19:49:02,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:49:02,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 19:49:02,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:49:02,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:49:04,177 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.865e+02 2.125e+02 2.416e+02 3.474e+02, threshold=4.250e+02, percent-clipped=0.0 2023-09-29 19:49:05,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:49:07,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 19:49:08,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 19:49:10,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 19:49:12,042 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:49:13,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 19:49:13,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:49:16,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:49:19,074 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=11.97 vs. limit=15.0 2023-09-29 19:49:22,679 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=471113.3333333333, ans=0.125 2023-09-29 19:49:24,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:49:27,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 19:49:30,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:49:32,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 19:49:32,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:49:32,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 19:49:37,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 19:49:40,303 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=9.09 vs. limit=22.5 2023-09-29 19:49:44,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:49:45,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 19:49:45,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:49:46,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:49:46,933 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:49:48,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 19:49:55,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 19:49:56,662 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:49:58,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:49:58,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:50:00,191 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:50:01,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:50:03,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:50:06,140 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 19:50:10,137 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=471313.3333333333, ans=0.125 2023-09-29 19:50:11,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:50:12,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:50:15,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 19:50:15,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:50:16,249 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=471313.3333333333, ans=0.1 2023-09-29 19:50:17,429 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 19:50:20,770 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=471313.3333333333, ans=0.125 2023-09-29 19:50:22,251 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=471380.0, ans=0.0 2023-09-29 19:50:23,425 INFO [train.py:1039] (2/4) Epoch 14, batch 1650, loss[loss=0.1966, simple_loss=0.2821, pruned_loss=0.05556, over 24314.00 frames. ], tot_loss[loss=0.1918, simple_loss=0.2644, pruned_loss=0.05964, over 4710205.46 frames. ], batch size: 74, lr: 7.45e-03, grad_scale: 16.0 2023-09-29 19:50:23,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:50:25,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:50:26,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:50:26,757 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 19:50:26,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 19:50:26,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 19:50:28,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 19:50:29,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:50:32,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:50:32,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:50:32,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 19:50:34,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:50:37,495 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 19:50:40,472 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:50:40,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:50:40,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:50:40,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:50:42,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 19:50:42,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 19:50:47,416 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 19:50:49,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 19:50:56,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 19:50:58,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:51:01,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 19:51:01,751 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=471513.3333333333, ans=0.0 2023-09-29 19:51:04,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:51:08,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:51:10,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:51:10,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:51:10,445 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=471513.3333333333, ans=0.125 2023-09-29 19:51:11,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:51:13,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:51:14,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:51:16,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:51:16,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:51:16,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:51:18,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:51:19,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 19:51:24,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:51:25,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 19:51:25,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:51:27,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 19:51:28,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 19:51:28,799 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 19:51:28,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:51:28,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:51:30,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:51:30,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:51:30,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 19:51:32,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:51:34,075 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:51:36,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:51:41,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 19:51:44,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:51:44,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:51:44,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 19:51:46,305 INFO [train.py:1039] (2/4) Epoch 14, batch 1700, loss[loss=0.1705, simple_loss=0.2462, pruned_loss=0.04742, over 24478.00 frames. ], tot_loss[loss=0.1915, simple_loss=0.2634, pruned_loss=0.05977, over 4705553.93 frames. ], batch size: 58, lr: 7.45e-03, grad_scale: 8.0 2023-09-29 19:51:46,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:51:46,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:51:46,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:51:49,263 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.869e+02 2.042e+02 2.278e+02 4.402e+02, threshold=4.084e+02, percent-clipped=1.0 2023-09-29 19:51:49,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:51:49,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:51:49,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 19:51:54,345 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 19:51:56,728 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.54 vs. limit=10.0 2023-09-29 19:52:04,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:52:06,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:52:11,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 19:52:13,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:52:13,760 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:52:14,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:52:17,006 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 19:52:18,605 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:52:18,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:52:18,834 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=471846.6666666667, ans=0.125 2023-09-29 19:52:20,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 19:52:21,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 19:52:23,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 19:52:24,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 19:52:26,704 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=471846.6666666667, ans=0.125 2023-09-29 19:52:28,218 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:52:29,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 19:52:31,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:52:39,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:52:40,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:52:40,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:52:43,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 19:52:43,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 19:52:43,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:52:43,900 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=471913.3333333333, ans=0.125 2023-09-29 19:52:47,305 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:52:47,306 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 19:52:49,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:52:49,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:52:49,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:52:49,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:52:50,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:52:50,912 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:52:53,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:52:53,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:52:54,522 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:52:57,897 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:53:00,074 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 19:53:01,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:53:03,295 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:53:03,646 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 19:53:04,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 19:53:07,976 INFO [train.py:1039] (2/4) Epoch 14, batch 1750, loss[loss=0.1642, simple_loss=0.241, pruned_loss=0.04371, over 24247.00 frames. ], tot_loss[loss=0.1898, simple_loss=0.2624, pruned_loss=0.0586, over 4720090.58 frames. ], batch size: 56, lr: 7.45e-03, grad_scale: 8.0 2023-09-29 19:53:11,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:53:14,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:53:15,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 19:53:15,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 19:53:15,769 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:53:19,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:53:19,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:53:24,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 19:53:26,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:53:27,030 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=472113.3333333333, ans=0.1 2023-09-29 19:53:28,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 19:53:28,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:53:31,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:53:34,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 19:53:36,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 19:53:38,211 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:53:38,252 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 19:53:46,095 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:53:50,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:53:50,541 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:53:50,868 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=472180.0, ans=0.0 2023-09-29 19:53:53,005 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:53:54,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:53:56,535 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:53:56,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:53:56,898 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=472246.6666666667, ans=0.125 2023-09-29 19:53:59,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:53:59,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:54:01,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 19:54:05,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:54:06,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 19:54:07,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:54:08,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:54:09,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:54:14,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 19:54:15,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 19:54:15,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:54:18,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:54:22,665 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.14 vs. limit=15.0 2023-09-29 19:54:23,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:54:25,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:54:25,607 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=472313.3333333333, ans=0.125 2023-09-29 19:54:27,353 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:54:27,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 19:54:27,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:54:29,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 19:54:29,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:54:29,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 19:54:29,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:54:31,026 INFO [train.py:1039] (2/4) Epoch 14, batch 1800, loss[loss=0.1948, simple_loss=0.2749, pruned_loss=0.05738, over 23390.00 frames. ], tot_loss[loss=0.1895, simple_loss=0.2621, pruned_loss=0.0585, over 4722050.01 frames. ], batch size: 93, lr: 7.44e-03, grad_scale: 8.0 2023-09-29 19:54:31,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:54:34,554 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 2.007e+02 2.286e+02 2.732e+02 4.452e+02, threshold=4.572e+02, percent-clipped=3.0 2023-09-29 19:54:34,663 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:54:34,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:54:36,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 19:54:41,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:54:42,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 19:54:43,288 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=472380.0, ans=0.125 2023-09-29 19:54:44,506 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:54:47,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:54:47,806 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=472446.6666666667, ans=0.2 2023-09-29 19:54:49,179 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=472446.6666666667, ans=0.2 2023-09-29 19:54:50,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:54:51,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:54:53,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:54:54,944 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:54:54,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 19:54:55,070 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:54:55,347 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=472446.6666666667, ans=0.1 2023-09-29 19:54:59,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:55:03,592 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=472513.3333333333, ans=0.1 2023-09-29 19:55:06,189 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 19:55:07,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 19:55:08,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 19:55:08,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:55:10,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:55:10,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:55:10,186 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:55:15,649 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 19:55:17,121 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:55:20,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:55:21,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 19:55:21,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 19:55:21,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:55:23,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:55:24,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 19:55:29,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 19:55:36,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:55:36,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 19:55:37,866 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:55:37,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:55:37,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:55:39,925 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 19:55:43,593 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=472646.6666666667, ans=0.0 2023-09-29 19:55:45,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 19:55:45,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:55:48,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 19:55:48,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:55:51,483 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:55:51,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 19:55:51,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:55:52,895 INFO [train.py:1039] (2/4) Epoch 14, batch 1850, loss[loss=0.1753, simple_loss=0.2509, pruned_loss=0.04987, over 24318.00 frames. ], tot_loss[loss=0.1898, simple_loss=0.2629, pruned_loss=0.0584, over 4720611.81 frames. ], batch size: 56, lr: 7.44e-03, grad_scale: 8.0 2023-09-29 19:55:53,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:55:53,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:55:54,734 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:55:56,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:55:57,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:55:59,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:56:01,794 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.04 vs. limit=15.0 2023-09-29 19:56:02,959 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=472713.3333333333, ans=0.2 2023-09-29 19:56:03,010 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=472713.3333333333, ans=0.125 2023-09-29 19:56:05,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:56:05,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 19:56:10,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 19:56:12,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 19:56:13,621 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.41 vs. limit=10.0 2023-09-29 19:56:16,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:56:18,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 19:56:18,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 19:56:29,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:56:30,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 19:56:34,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:56:34,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:56:39,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 19:56:39,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:56:41,483 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 19:56:43,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:56:45,487 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=472913.3333333333, ans=0.125 2023-09-29 19:56:46,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:56:48,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:56:51,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 19:56:53,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:56:53,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 19:56:53,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:56:55,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:56:57,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:57:00,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 19:57:01,522 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=10.49 vs. limit=10.0 2023-09-29 19:57:01,866 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:57:04,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:57:06,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:57:06,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 19:57:06,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 19:57:08,012 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 19:57:08,786 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.82 vs. limit=15.0 2023-09-29 19:57:09,545 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 19:57:11,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 19:57:11,208 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:57:11,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:57:12,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:57:12,701 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 19:57:12,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 19:57:14,119 INFO [train.py:1039] (2/4) Epoch 14, batch 1900, loss[loss=0.1737, simple_loss=0.2538, pruned_loss=0.04686, over 24489.00 frames. ], tot_loss[loss=0.1898, simple_loss=0.2632, pruned_loss=0.05822, over 4725354.46 frames. ], batch size: 66, lr: 7.44e-03, grad_scale: 8.0 2023-09-29 19:57:14,181 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:57:14,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 19:57:15,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 19:57:17,954 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.953e+02 2.438e+02 3.060e+02 4.986e+02, threshold=4.875e+02, percent-clipped=3.0 2023-09-29 19:57:18,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:57:18,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 19:57:19,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:57:19,739 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 19:57:19,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:57:21,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:57:27,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:57:30,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:57:32,997 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 19:57:33,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 19:57:34,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:57:36,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:57:36,078 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 19:57:36,131 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 19:57:36,283 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=473113.3333333333, ans=0.125 2023-09-29 19:57:41,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 19:57:43,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:57:46,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 19:57:48,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 19:57:48,678 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=473180.0, ans=0.125 2023-09-29 19:57:51,811 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=473180.0, ans=0.0 2023-09-29 19:57:58,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 19:57:58,785 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=473180.0, ans=0.2 2023-09-29 19:58:01,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 19:58:01,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:58:01,450 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 19:58:01,457 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 19:58:01,510 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 19:58:03,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 19:58:03,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:58:08,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 19:58:11,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:58:13,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:58:13,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 19:58:16,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 19:58:19,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 19:58:19,829 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=473313.3333333333, ans=0.125 2023-09-29 19:58:20,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:58:27,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:58:27,501 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:58:27,532 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:58:27,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:58:29,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 19:58:29,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 19:58:30,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:58:33,718 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:58:33,720 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 19:58:35,793 INFO [train.py:1039] (2/4) Epoch 14, batch 1950, loss[loss=0.2404, simple_loss=0.2936, pruned_loss=0.09355, over 19149.00 frames. ], tot_loss[loss=0.192, simple_loss=0.265, pruned_loss=0.05953, over 4718282.41 frames. ], batch size: 388, lr: 7.44e-03, grad_scale: 8.0 2023-09-29 19:58:35,993 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:58:35,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:58:36,067 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:58:37,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:58:42,533 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 19:58:44,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:58:45,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:58:45,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 19:58:46,527 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.23 vs. limit=15.0 2023-09-29 19:58:47,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 19:58:48,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 19:58:48,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:58:50,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:58:53,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:58:53,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:58:54,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:58:57,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:58:59,368 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 19:58:59,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 19:59:01,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:59:01,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:59:06,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:59:10,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 19:59:10,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:59:10,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 19:59:10,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 19:59:11,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 19:59:11,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:59:11,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:59:13,902 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=473513.3333333333, ans=0.125 2023-09-29 19:59:14,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:59:17,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:59:23,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 19:59:26,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:59:26,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 19:59:27,629 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 19:59:27,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:59:27,967 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=473580.0, ans=0.0 2023-09-29 19:59:28,015 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=473580.0, ans=0.07 2023-09-29 19:59:32,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:59:33,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:59:33,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 19:59:43,485 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:59:45,378 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:59:47,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:59:49,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:59:52,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:59:54,116 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:59:54,231 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 19:59:54,239 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 19:59:55,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:59:55,783 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 19:59:57,169 INFO [train.py:1039] (2/4) Epoch 14, batch 2000, loss[loss=0.2684, simple_loss=0.3213, pruned_loss=0.1078, over 19689.00 frames. ], tot_loss[loss=0.1923, simple_loss=0.2652, pruned_loss=0.05966, over 4717044.72 frames. ], batch size: 389, lr: 7.43e-03, grad_scale: 16.0 2023-09-29 19:59:58,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:00:00,435 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.544e+02 1.917e+02 2.206e+02 2.573e+02 3.762e+02, threshold=4.412e+02, percent-clipped=0.0 2023-09-29 20:00:00,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 20:00:02,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:00:02,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:00:03,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:00:06,726 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:00:11,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 20:00:11,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 20:00:16,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:00:17,687 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 20:00:18,567 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=473780.0, ans=0.125 2023-09-29 20:00:19,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 20:00:19,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:00:21,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:00:22,689 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.00 vs. limit=22.5 2023-09-29 20:00:24,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 20:00:26,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:00:26,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:00:26,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:00:28,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 20:00:29,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 20:00:31,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 20:00:31,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:00:33,111 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=473846.6666666667, ans=0.0 2023-09-29 20:00:34,220 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:00:34,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 20:00:34,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:00:35,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:00:37,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:00:38,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 20:00:40,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 20:00:40,453 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:00:40,465 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:00:45,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:00:46,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:00:46,156 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:00:47,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:00:48,643 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=473913.3333333333, ans=0.125 2023-09-29 20:00:49,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:00:51,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:00:51,388 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:00:51,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:00:53,447 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:00:56,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:00:57,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 20:01:04,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 20:01:05,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:01:08,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:01:08,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:01:08,969 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=473980.0, ans=0.05 2023-09-29 20:01:10,626 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=473980.0, ans=0.0 2023-09-29 20:01:13,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:01:14,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:01:14,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:01:15,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 20:01:15,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 20:01:18,021 INFO [train.py:1039] (2/4) Epoch 14, batch 2050, loss[loss=0.1948, simple_loss=0.2782, pruned_loss=0.05567, over 24449.00 frames. ], tot_loss[loss=0.192, simple_loss=0.2647, pruned_loss=0.05961, over 4707412.17 frames. ], batch size: 66, lr: 7.43e-03, grad_scale: 16.0 2023-09-29 20:01:18,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:01:19,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:01:25,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:01:25,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:01:27,058 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=474046.6666666667, ans=0.2 2023-09-29 20:01:30,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:01:33,738 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:01:33,939 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=474113.3333333333, ans=0.0 2023-09-29 20:01:35,202 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:01:35,311 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:01:36,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 20:01:36,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:01:37,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:01:38,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 20:01:43,733 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.68 vs. limit=15.0 2023-09-29 20:01:48,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 20:01:48,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:01:49,802 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 20:01:51,765 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.47 vs. limit=15.0 2023-09-29 20:01:53,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:01:53,624 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=474180.0, ans=0.125 2023-09-29 20:01:54,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 20:01:55,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 20:01:58,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:02:01,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:02:03,708 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 20:02:03,768 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:02:05,361 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:02:06,868 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:02:06,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 20:02:08,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:02:10,443 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 20:02:13,433 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:02:13,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:02:18,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:02:22,878 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:02:22,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 20:02:29,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:02:30,699 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:02:32,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:02:33,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 20:02:38,870 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 20:02:38,871 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:02:38,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:02:39,505 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.91 vs. limit=15.0 2023-09-29 20:02:40,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 20:02:40,629 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:02:41,977 INFO [train.py:1039] (2/4) Epoch 14, batch 2100, loss[loss=0.1925, simple_loss=0.2763, pruned_loss=0.05439, over 24339.00 frames. ], tot_loss[loss=0.1901, simple_loss=0.2623, pruned_loss=0.059, over 4703563.97 frames. ], batch size: 74, lr: 7.43e-03, grad_scale: 16.0 2023-09-29 20:02:42,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 20:02:42,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 20:02:43,708 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:02:45,129 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.952e+02 2.197e+02 2.435e+02 3.188e+02, threshold=4.394e+02, percent-clipped=0.0 2023-09-29 20:02:46,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:02:48,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:02:50,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:02:51,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:02:51,475 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 20:02:52,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:02:54,389 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 20:02:54,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 20:02:56,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:02:56,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:02:56,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 20:02:58,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 20:03:05,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 20:03:05,613 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 20:03:08,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:03:10,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:03:13,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:03:15,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 20:03:15,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:03:15,499 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 20:03:18,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 20:03:18,490 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:03:18,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 20:03:18,552 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 20:03:20,076 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 20:03:21,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 20:03:23,383 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:03:26,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 20:03:27,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 20:03:29,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:03:31,484 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:03:31,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 20:03:31,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:03:31,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:03:31,856 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=474580.0, ans=0.09899494936611666 2023-09-29 20:03:32,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:03:32,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 20:03:36,638 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 20:03:36,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 20:03:41,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:03:44,727 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:03:44,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 20:03:48,577 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=474646.6666666667, ans=0.0 2023-09-29 20:03:50,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:03:53,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:03:53,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:03:53,321 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:03:53,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 20:03:53,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 20:03:57,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:03:57,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 20:03:57,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:03:57,953 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:03:59,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 20:04:00,484 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.07 vs. limit=15.0 2023-09-29 20:04:01,260 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 20:04:01,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:04:02,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:04:02,984 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:04:04,266 INFO [train.py:1039] (2/4) Epoch 14, batch 2150, loss[loss=0.202, simple_loss=0.2758, pruned_loss=0.06405, over 23913.00 frames. ], tot_loss[loss=0.1897, simple_loss=0.2623, pruned_loss=0.05853, over 4710829.80 frames. ], batch size: 86, lr: 7.43e-03, grad_scale: 16.0 2023-09-29 20:04:04,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:04:04,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:04:07,124 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.99 vs. limit=15.0 2023-09-29 20:04:10,328 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=474713.3333333333, ans=0.0 2023-09-29 20:04:10,770 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.71 vs. limit=22.5 2023-09-29 20:04:11,861 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=474713.3333333333, ans=0.0 2023-09-29 20:04:13,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 20:04:15,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:04:15,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:04:17,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:04:18,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:04:18,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:04:22,359 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:04:23,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:04:23,807 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:04:26,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:04:28,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 20:04:32,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:04:34,375 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:04:37,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:04:37,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:04:37,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:04:37,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 20:04:38,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:04:38,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:04:38,996 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:04:40,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 20:04:42,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 20:04:42,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:04:42,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:04:44,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 20:04:46,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:04:48,807 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:04:50,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:04:51,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:04:51,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 20:04:51,732 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 20:04:53,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:04:53,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:04:55,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:04:56,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 20:04:58,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:04:59,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:04:59,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 20:05:02,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 20:05:02,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:05:02,885 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 20:05:04,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:05:04,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:05:05,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 20:05:05,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:05:05,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 20:05:05,961 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 20:05:05,962 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 20:05:06,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 20:05:07,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:05:09,165 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:05:09,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:05:10,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:05:12,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 20:05:13,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:05:15,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:05:15,363 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=474980.0, ans=0.0 2023-09-29 20:05:24,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:05:25,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 20:05:27,309 INFO [train.py:1039] (2/4) Epoch 14, batch 2200, loss[loss=0.2291, simple_loss=0.279, pruned_loss=0.08958, over 19583.00 frames. ], tot_loss[loss=0.1901, simple_loss=0.2628, pruned_loss=0.05873, over 4701368.68 frames. ], batch size: 388, lr: 7.42e-03, grad_scale: 16.0 2023-09-29 20:05:29,080 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:05:30,438 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.473e+02 1.895e+02 2.112e+02 2.594e+02 4.631e+02, threshold=4.225e+02, percent-clipped=1.0 2023-09-29 20:05:35,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:05:35,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:05:37,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:05:37,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 20:05:40,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:05:42,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:05:42,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 20:05:46,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 20:05:48,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 20:05:53,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 20:05:54,098 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=475113.3333333333, ans=0.125 2023-09-29 20:05:57,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:05:58,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:06:00,149 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:06:03,393 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:06:03,430 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 20:06:03,753 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=475180.0, ans=0.0 2023-09-29 20:06:05,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 20:06:07,536 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:06:07,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 20:06:12,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:06:13,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:06:15,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:06:16,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:06:18,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 20:06:19,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:06:23,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 20:06:26,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:06:26,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:06:26,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:06:29,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:06:29,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:06:29,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:06:29,214 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:06:30,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 20:06:32,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:06:32,421 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 20:06:35,680 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 20:06:35,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:06:38,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 20:06:40,828 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 20:06:41,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 20:06:42,439 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 20:06:43,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 20:06:44,047 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 20:06:45,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:06:47,007 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 20:06:49,982 INFO [train.py:1039] (2/4) Epoch 14, batch 2250, loss[loss=0.1887, simple_loss=0.268, pruned_loss=0.05469, over 24630.00 frames. ], tot_loss[loss=0.1904, simple_loss=0.263, pruned_loss=0.05888, over 4700521.04 frames. ], batch size: 73, lr: 7.42e-03, grad_scale: 16.0 2023-09-29 20:06:50,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:06:51,633 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 20:06:51,879 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=475380.0, ans=0.125 2023-09-29 20:06:53,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:06:54,092 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.99 vs. limit=15.0 2023-09-29 20:06:56,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:07:04,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:07:05,845 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:07:08,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:07:09,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:07:10,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:07:12,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 20:07:12,393 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:07:12,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:07:15,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 20:07:17,016 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:07:17,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:07:19,140 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:07:22,578 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=475513.3333333333, ans=0.125 2023-09-29 20:07:23,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:07:24,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 20:07:24,061 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 20:07:27,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 20:07:28,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:07:30,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:07:35,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:07:37,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:07:37,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:07:37,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:07:37,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=475513.3333333333, ans=0.125 2023-09-29 20:07:42,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:07:43,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:07:48,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:07:49,989 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 20:07:52,549 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=475580.0, ans=0.125 2023-09-29 20:07:53,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 20:07:55,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 20:07:55,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:08:02,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 20:08:04,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 20:08:04,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 20:08:04,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:08:05,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:08:08,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 20:08:12,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:08:12,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:08:13,521 INFO [train.py:1039] (2/4) Epoch 14, batch 2300, loss[loss=0.198, simple_loss=0.28, pruned_loss=0.05801, over 23962.00 frames. ], tot_loss[loss=0.1912, simple_loss=0.264, pruned_loss=0.05917, over 4713083.14 frames. ], batch size: 86, lr: 7.42e-03, grad_scale: 16.0 2023-09-29 20:08:16,570 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.612e+02 1.985e+02 2.281e+02 2.632e+02 4.053e+02, threshold=4.563e+02, percent-clipped=0.0 2023-09-29 20:08:18,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:08:19,788 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:08:21,405 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 20:08:23,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:08:25,061 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=475713.3333333333, ans=0.0 2023-09-29 20:08:30,060 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:08:30,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 20:08:30,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:08:30,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:08:30,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 20:08:33,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:08:34,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:08:36,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:08:39,504 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=475780.0, ans=0.0 2023-09-29 20:08:41,294 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 20:08:41,621 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=475780.0, ans=0.125 2023-09-29 20:08:42,346 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=475780.0, ans=0.125 2023-09-29 20:08:45,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:08:49,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:08:52,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:08:54,377 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:08:57,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:08:58,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:09:04,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:09:05,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 20:09:05,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:09:05,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 20:09:10,354 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 20:09:10,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:09:10,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:09:10,456 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:09:11,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:09:11,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 20:09:11,954 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 20:09:12,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 20:09:13,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:09:13,431 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:09:13,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 20:09:22,354 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:09:22,684 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=475980.0, ans=0.2 2023-09-29 20:09:27,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:09:27,335 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=475980.0, ans=0.0 2023-09-29 20:09:28,824 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=475980.0, ans=0.125 2023-09-29 20:09:31,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:09:31,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:09:31,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 20:09:33,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 20:09:33,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:09:34,801 INFO [train.py:1039] (2/4) Epoch 14, batch 2350, loss[loss=0.1822, simple_loss=0.2501, pruned_loss=0.05718, over 19014.00 frames. ], tot_loss[loss=0.1928, simple_loss=0.2652, pruned_loss=0.06014, over 4709039.26 frames. ], batch size: 41, lr: 7.42e-03, grad_scale: 16.0 2023-09-29 20:09:34,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 20:09:36,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 20:09:40,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:09:40,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 20:09:46,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 20:09:49,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:09:55,411 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:09:55,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:09:55,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:09:55,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:09:55,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 20:10:00,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:10:07,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 20:10:09,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:10:10,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 20:10:10,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:10:15,913 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:10:16,291 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=476180.0, ans=0.2 2023-09-29 20:10:17,353 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 20:10:17,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:10:19,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:10:20,445 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:10:20,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:10:25,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:10:27,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 20:10:28,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:10:30,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:10:30,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:10:32,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 20:10:34,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:10:35,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 20:10:37,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:10:42,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 20:10:43,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 20:10:45,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:10:45,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 20:10:45,384 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 20:10:46,783 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 20:10:48,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 20:10:51,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:10:55,260 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:10:55,500 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=476380.0, ans=0.0 2023-09-29 20:10:56,574 INFO [train.py:1039] (2/4) Epoch 14, batch 2400, loss[loss=0.2079, simple_loss=0.2882, pruned_loss=0.06382, over 24438.00 frames. ], tot_loss[loss=0.1926, simple_loss=0.2652, pruned_loss=0.06002, over 4715378.94 frames. ], batch size: 69, lr: 7.41e-03, grad_scale: 32.0 2023-09-29 20:10:59,121 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=476380.0, ans=0.125 2023-09-29 20:10:59,964 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.484e+02 1.908e+02 2.123e+02 2.498e+02 3.353e+02, threshold=4.247e+02, percent-clipped=0.0 2023-09-29 20:11:00,239 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:11:01,838 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:11:01,908 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 20:11:03,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 20:11:11,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 20:11:11,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:11:13,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 20:11:14,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:11:16,086 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:11:16,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 20:11:22,239 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:11:24,454 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 20:11:29,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 20:11:32,913 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=476513.3333333333, ans=0.1 2023-09-29 20:11:35,764 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 20:11:39,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:11:40,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:11:45,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:11:47,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 20:11:47,443 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=476580.0, ans=0.0 2023-09-29 20:11:48,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 20:11:53,787 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:11:56,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:11:57,084 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=476580.0, ans=0.0 2023-09-29 20:11:58,492 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=476580.0, ans=0.125 2023-09-29 20:12:01,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:12:01,977 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:12:01,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 20:12:02,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:12:03,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:12:03,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:12:03,591 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 20:12:05,361 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=476646.6666666667, ans=0.125 2023-09-29 20:12:09,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:12:09,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 20:12:09,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 20:12:10,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 20:12:12,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:12:12,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:12:14,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 20:12:15,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 20:12:15,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 20:12:15,452 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 20:12:15,912 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=476646.6666666667, ans=0.125 2023-09-29 20:12:16,954 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 20:12:18,292 INFO [train.py:1039] (2/4) Epoch 14, batch 2450, loss[loss=0.2029, simple_loss=0.2812, pruned_loss=0.06228, over 23919.00 frames. ], tot_loss[loss=0.1915, simple_loss=0.264, pruned_loss=0.05953, over 4705846.03 frames. ], batch size: 86, lr: 7.41e-03, grad_scale: 16.0 2023-09-29 20:12:18,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:12:19,923 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:12:19,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:12:21,464 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 20:12:23,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:12:23,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 20:12:24,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:12:24,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:12:28,056 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:12:28,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:12:29,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 20:12:36,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:12:36,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:12:37,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:12:39,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:12:39,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:12:39,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 20:12:43,688 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=476780.0, ans=0.0 2023-09-29 20:12:46,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:12:47,862 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 20:12:47,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:12:51,840 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.92 vs. limit=15.0 2023-09-29 20:12:52,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 20:12:52,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:12:52,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:12:54,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:12:54,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 20:12:55,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:13:03,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:13:05,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:13:05,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:13:05,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:13:07,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:13:07,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:13:08,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 20:13:09,591 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.25 vs. limit=15.0 2023-09-29 20:13:14,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:13:14,333 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:13:17,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:13:17,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:13:22,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:13:22,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 20:13:23,741 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:13:25,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:13:25,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 20:13:25,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:13:26,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:13:31,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:13:32,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:13:32,909 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:13:37,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 20:13:38,878 INFO [train.py:1039] (2/4) Epoch 14, batch 2500, loss[loss=0.168, simple_loss=0.2528, pruned_loss=0.04156, over 24262.00 frames. ], tot_loss[loss=0.1897, simple_loss=0.2619, pruned_loss=0.05879, over 4702554.04 frames. ], batch size: 61, lr: 7.41e-03, grad_scale: 16.0 2023-09-29 20:13:39,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:13:44,476 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.907e+02 2.163e+02 2.456e+02 3.959e+02, threshold=4.326e+02, percent-clipped=0.0 2023-09-29 20:13:48,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:13:52,630 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=477046.6666666667, ans=0.125 2023-09-29 20:13:57,200 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=477113.3333333333, ans=0.125 2023-09-29 20:13:57,348 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=477113.3333333333, ans=0.0 2023-09-29 20:13:58,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:13:58,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:14:00,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:14:00,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 20:14:04,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 20:14:05,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:14:06,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 20:14:06,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 20:14:08,040 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 20:14:08,257 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=477113.3333333333, ans=0.0 2023-09-29 20:14:09,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:14:10,122 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.02 vs. limit=6.0 2023-09-29 20:14:10,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:14:10,930 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 20:14:10,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:14:12,444 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 20:14:12,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:14:15,013 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.75 vs. limit=15.0 2023-09-29 20:14:16,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:14:18,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:14:23,624 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 20:14:25,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 20:14:27,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:14:28,812 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=477246.6666666667, ans=0.125 2023-09-29 20:14:30,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:14:33,302 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:14:36,494 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:14:39,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:14:43,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 20:14:47,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 20:14:47,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:14:47,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 20:14:50,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:14:50,601 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 20:14:52,849 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 20:14:52,850 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 20:14:52,858 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 20:14:54,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:14:58,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 20:14:58,076 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 20:14:59,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:14:59,641 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 20:15:03,145 INFO [train.py:1039] (2/4) Epoch 14, batch 2550, loss[loss=0.1834, simple_loss=0.2564, pruned_loss=0.05519, over 23405.00 frames. ], tot_loss[loss=0.1896, simple_loss=0.2624, pruned_loss=0.05845, over 4706289.92 frames. ], batch size: 134, lr: 7.41e-03, grad_scale: 16.0 2023-09-29 20:15:04,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 20:15:07,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:15:09,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:15:09,547 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:15:12,672 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:15:12,797 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 20:15:12,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 20:15:14,629 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=477380.0, ans=0.0 2023-09-29 20:15:15,999 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 20:15:17,567 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:15:19,092 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:15:22,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:15:22,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 20:15:22,281 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=477446.6666666667, ans=0.125 2023-09-29 20:15:23,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 20:15:25,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:15:25,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:15:29,202 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:15:29,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 20:15:30,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 20:15:30,721 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:15:30,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 20:15:36,515 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=477513.3333333333, ans=0.125 2023-09-29 20:15:45,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:15:48,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:15:48,945 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:15:48,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:15:50,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 20:15:56,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:15:58,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 20:15:58,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 20:16:00,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:16:00,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 20:16:00,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 20:16:03,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:16:04,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:16:08,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:16:09,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 20:16:09,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:16:10,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:16:11,081 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 20:16:12,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 20:16:13,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:16:20,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:16:22,078 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:16:24,920 INFO [train.py:1039] (2/4) Epoch 14, batch 2600, loss[loss=0.2018, simple_loss=0.2661, pruned_loss=0.06872, over 23810.00 frames. ], tot_loss[loss=0.1911, simple_loss=0.2635, pruned_loss=0.0594, over 4692024.82 frames. ], batch size: 164, lr: 7.40e-03, grad_scale: 16.0 2023-09-29 20:16:25,137 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 20:16:28,258 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 20:16:28,296 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:16:28,358 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 20:16:29,686 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.879e+02 2.138e+02 2.500e+02 3.129e+02, threshold=4.275e+02, percent-clipped=0.0 2023-09-29 20:16:29,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 20:16:29,851 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 20:16:32,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:16:32,789 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 20:16:36,011 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 20:16:37,630 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 20:16:40,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:16:44,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 20:16:45,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 20:16:47,391 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 20:16:47,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 20:16:49,811 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.03 vs. limit=15.0 2023-09-29 20:16:50,652 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 20:16:50,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 20:16:57,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:16:57,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:16:57,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:16:57,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 20:16:58,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:17:00,914 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=477846.6666666667, ans=0.125 2023-09-29 20:17:03,636 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 20:17:11,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:17:11,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:17:13,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 20:17:14,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:17:14,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:17:14,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 20:17:17,157 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=477913.3333333333, ans=0.125 2023-09-29 20:17:19,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 20:17:19,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:17:21,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:17:24,545 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 20:17:25,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:17:26,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:17:31,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:17:32,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:17:32,120 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 20:17:33,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:17:36,582 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:17:36,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:17:41,370 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=477980.0, ans=0.125 2023-09-29 20:17:44,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 20:17:46,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:17:49,002 INFO [train.py:1039] (2/4) Epoch 14, batch 2650, loss[loss=0.1889, simple_loss=0.2739, pruned_loss=0.05199, over 24642.00 frames. ], tot_loss[loss=0.192, simple_loss=0.2648, pruned_loss=0.0596, over 4703475.34 frames. ], batch size: 68, lr: 7.40e-03, grad_scale: 16.0 2023-09-29 20:17:50,560 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 20:17:54,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 20:17:55,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:17:57,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 20:17:58,739 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 20:17:58,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:17:59,054 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=478046.6666666667, ans=0.125 2023-09-29 20:18:01,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:18:03,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 20:18:04,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:18:07,920 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:18:08,325 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=478113.3333333333, ans=0.1 2023-09-29 20:18:09,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 20:18:09,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 20:18:10,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:18:11,500 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=478113.3333333333, ans=0.2 2023-09-29 20:18:12,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 20:18:12,754 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 20:18:17,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:18:17,458 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=478113.3333333333, ans=0.2 2023-09-29 20:18:18,761 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 20:18:18,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:18:20,331 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 20:18:24,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:18:24,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 20:18:24,811 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:18:24,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:18:31,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 20:18:31,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 20:18:34,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 20:18:36,364 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=478246.6666666667, ans=0.0 2023-09-29 20:18:39,100 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 20:18:39,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:18:40,653 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:18:40,709 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 20:18:40,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:18:42,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:18:43,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:18:46,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:18:47,059 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:18:47,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 20:18:48,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:18:50,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:18:50,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 20:18:50,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:18:53,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:18:53,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 20:18:58,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:18:58,681 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=478313.3333333333, ans=0.1 2023-09-29 20:19:00,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:19:00,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:19:02,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 20:19:05,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:19:07,376 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:19:07,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:19:08,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:19:10,263 INFO [train.py:1039] (2/4) Epoch 14, batch 2700, loss[loss=0.1652, simple_loss=0.2422, pruned_loss=0.04412, over 24332.00 frames. ], tot_loss[loss=0.1912, simple_loss=0.2643, pruned_loss=0.05905, over 4714608.71 frames. ], batch size: 61, lr: 7.40e-03, grad_scale: 16.0 2023-09-29 20:19:10,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 20:19:10,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:19:13,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:19:13,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 20:19:14,653 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.969e+02 2.229e+02 2.662e+02 4.082e+02, threshold=4.458e+02, percent-clipped=0.0 2023-09-29 20:19:16,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:19:18,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 20:19:19,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:19:21,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:19:21,076 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:19:22,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:19:22,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:19:23,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:19:23,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 20:19:24,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 20:19:25,447 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:19:25,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 20:19:25,872 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=478446.6666666667, ans=0.0 2023-09-29 20:19:27,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 20:19:28,631 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:19:32,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:19:32,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 20:19:33,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:19:38,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:19:38,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:19:38,583 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=478446.6666666667, ans=0.1 2023-09-29 20:19:44,445 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:19:44,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:19:44,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:19:44,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:19:44,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=478513.3333333333, ans=0.125 2023-09-29 20:19:44,755 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=478513.3333333333, ans=0.125 2023-09-29 20:19:47,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:19:49,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:19:49,223 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 20:19:49,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:19:52,376 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=478513.3333333333, ans=0.2 2023-09-29 20:19:56,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:19:56,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:19:56,680 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=478580.0, ans=0.125 2023-09-29 20:20:06,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:20:08,487 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:20:12,306 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 20:20:12,309 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:20:15,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:20:16,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:20:17,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:20:17,165 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=478646.6666666667, ans=0.125 2023-09-29 20:20:18,520 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:20:20,430 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:20:20,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:20:23,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:20:25,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:20:25,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:20:26,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 20:20:26,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:20:30,002 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:20:30,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 20:20:31,469 INFO [train.py:1039] (2/4) Epoch 14, batch 2750, loss[loss=0.1732, simple_loss=0.2482, pruned_loss=0.04911, over 22366.00 frames. ], tot_loss[loss=0.1913, simple_loss=0.264, pruned_loss=0.0593, over 4712019.63 frames. ], batch size: 49, lr: 7.40e-03, grad_scale: 16.0 2023-09-29 20:20:31,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 20:20:31,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:20:35,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:20:35,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:20:37,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:20:39,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 20:20:39,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:20:44,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:20:46,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 20:20:46,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:20:46,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:20:46,139 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 20:20:46,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:20:46,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:20:52,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 20:20:54,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:20:55,515 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:20:55,610 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:20:55,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 20:20:57,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:20:57,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:20:57,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:20:58,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:21:03,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 20:21:03,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 20:21:05,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 20:21:06,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:21:08,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 20:21:08,811 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=478846.6666666667, ans=0.2 2023-09-29 20:21:11,703 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=478846.6666666667, ans=0.1 2023-09-29 20:21:17,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:21:18,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 20:21:18,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:21:22,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:21:22,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 20:21:24,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:21:31,642 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:21:31,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:21:31,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 20:21:36,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:21:37,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 20:21:43,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 20:21:45,607 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:21:46,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 20:21:47,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:21:49,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:21:49,454 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 20:21:49,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:21:49,807 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=478980.0, ans=0.125 2023-09-29 20:21:53,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 20:21:54,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:21:54,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:21:56,023 INFO [train.py:1039] (2/4) Epoch 14, batch 2800, loss[loss=0.196, simple_loss=0.2594, pruned_loss=0.06634, over 23846.00 frames. ], tot_loss[loss=0.1901, simple_loss=0.2628, pruned_loss=0.05868, over 4716168.01 frames. ], batch size: 212, lr: 7.39e-03, grad_scale: 8.0 2023-09-29 20:21:56,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 20:21:56,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:21:57,573 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:21:59,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:22:00,642 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 20:22:00,643 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 20:22:03,504 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.993e+02 2.230e+02 2.590e+02 3.913e+02, threshold=4.460e+02, percent-clipped=0.0 2023-09-29 20:22:05,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:22:06,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 20:22:06,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:22:10,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:22:11,658 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 20:22:15,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 20:22:16,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 20:22:16,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:22:18,373 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:22:18,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:22:24,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:22:25,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:22:25,679 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 20:22:25,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:22:31,330 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=479180.0, ans=0.1 2023-09-29 20:22:34,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:22:35,675 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:22:38,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:22:38,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:22:40,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:22:45,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:22:45,062 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 20:22:46,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:22:48,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:22:48,573 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:22:51,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:22:53,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:22:57,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:23:00,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:23:00,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:23:00,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 20:23:01,945 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 20:23:02,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 20:23:02,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:23:04,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 20:23:04,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:23:04,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:23:05,613 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:23:07,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 20:23:07,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:23:08,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:23:08,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:23:10,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 20:23:12,294 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=479313.3333333333, ans=0.125 2023-09-29 20:23:15,429 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=479313.3333333333, ans=0.2 2023-09-29 20:23:16,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:23:16,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 20:23:16,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:23:18,210 INFO [train.py:1039] (2/4) Epoch 14, batch 2850, loss[loss=0.1894, simple_loss=0.2484, pruned_loss=0.06517, over 23683.00 frames. ], tot_loss[loss=0.1886, simple_loss=0.261, pruned_loss=0.05811, over 4708784.81 frames. ], batch size: 232, lr: 7.39e-03, grad_scale: 8.0 2023-09-29 20:23:19,805 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:23:23,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:23:24,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:23:24,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:23:28,746 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:23:28,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:23:30,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 20:23:32,269 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 20:23:38,350 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 20:23:38,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:23:40,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 20:23:40,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:23:44,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 20:23:46,290 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 20:23:47,910 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:23:48,724 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.95 vs. limit=15.0 2023-09-29 20:23:54,774 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 20:23:59,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:24:01,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:24:01,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:24:01,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 20:24:02,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 20:24:02,061 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:24:05,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 20:24:05,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 20:24:08,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 20:24:10,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:24:10,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:24:10,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:24:13,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:24:14,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:24:15,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:24:17,236 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:24:18,784 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:24:20,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:24:21,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:24:24,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:24:27,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:24:31,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 20:24:31,505 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 20:24:33,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 20:24:34,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:24:34,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 20:24:34,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:24:36,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:24:36,352 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:24:36,386 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:24:36,387 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 20:24:36,440 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 20:24:36,446 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:24:38,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:24:41,478 INFO [train.py:1039] (2/4) Epoch 14, batch 2900, loss[loss=0.193, simple_loss=0.2653, pruned_loss=0.06031, over 23824.00 frames. ], tot_loss[loss=0.188, simple_loss=0.2609, pruned_loss=0.05761, over 4721810.12 frames. ], batch size: 212, lr: 7.39e-03, grad_scale: 8.0 2023-09-29 20:24:45,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 20:24:45,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:24:46,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:24:46,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 20:24:50,315 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.953e+02 2.184e+02 2.656e+02 3.783e+02, threshold=4.367e+02, percent-clipped=0.0 2023-09-29 20:24:52,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:24:52,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 20:24:53,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 20:24:54,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:24:54,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 20:24:56,984 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=479780.0, ans=0.5 2023-09-29 20:24:58,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:24:59,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:25:02,708 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:25:02,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:25:06,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 20:25:06,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 20:25:07,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 20:25:09,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:25:13,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 20:25:14,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 20:25:17,575 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:25:17,579 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 20:25:17,606 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:25:21,232 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:25:21,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 20:25:22,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:25:24,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:25:27,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:25:30,946 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:25:32,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 20:25:32,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 20:25:32,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:25:37,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:25:40,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 20:25:42,161 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:25:47,440 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:25:47,911 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=479980.0, ans=0.1 2023-09-29 20:25:59,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:25:59,711 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 20:26:01,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 20:26:05,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:26:05,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 20:26:06,504 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:26:06,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 20:26:07,888 INFO [train.py:1039] (2/4) Epoch 14, batch 2950, loss[loss=0.1927, simple_loss=0.2694, pruned_loss=0.05795, over 23352.00 frames. ], tot_loss[loss=0.1896, simple_loss=0.2626, pruned_loss=0.05828, over 4722161.82 frames. ], batch size: 105, lr: 7.38e-03, grad_scale: 8.0 2023-09-29 20:26:08,379 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=480046.6666666667, ans=0.125 2023-09-29 20:26:11,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:26:12,811 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 20:26:14,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:26:14,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:26:14,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:26:16,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:26:18,158 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 20:26:18,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 20:26:21,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 20:26:21,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:26:29,355 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:26:31,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:26:33,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:26:34,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:26:37,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:26:37,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:26:39,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:26:41,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:26:41,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:26:42,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 20:26:44,895 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=480180.0, ans=0.09899494936611666 2023-09-29 20:26:47,606 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 20:26:48,914 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 20:26:50,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 20:26:52,796 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 20:26:53,175 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=480180.0, ans=0.125 2023-09-29 20:26:54,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 20:26:54,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:26:54,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:26:54,613 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 20:26:54,619 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 20:26:57,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 20:27:01,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:27:01,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:27:04,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:27:06,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:27:06,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:27:06,993 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 20:27:08,319 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:27:08,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 20:27:13,172 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:27:13,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:27:15,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 20:27:15,459 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:27:15,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 20:27:18,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:27:21,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:27:21,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:27:23,108 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:27:23,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 20:27:24,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:27:24,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:27:24,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 20:27:24,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:27:26,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:27:28,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:27:29,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:27:29,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 20:27:31,335 INFO [train.py:1039] (2/4) Epoch 14, batch 3000, loss[loss=0.2003, simple_loss=0.2609, pruned_loss=0.06982, over 23701.00 frames. ], tot_loss[loss=0.1907, simple_loss=0.2637, pruned_loss=0.05887, over 4711097.15 frames. ], batch size: 232, lr: 7.38e-03, grad_scale: 8.0 2023-09-29 20:27:31,336 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-29 20:27:46,675 INFO [train.py:1071] (2/4) Epoch 14, validation: loss=0.2839, simple_loss=0.2749, pruned_loss=0.1465, over 1125622.00 frames. 2023-09-29 20:27:46,676 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21076MB 2023-09-29 20:27:46,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:27:48,472 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:27:49,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 20:27:51,609 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 20:27:53,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 20:27:54,614 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.901e+02 2.128e+02 2.266e+02 3.715e+02, threshold=4.256e+02, percent-clipped=0.0 2023-09-29 20:27:54,829 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:27:55,152 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 20:27:56,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:27:56,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 20:27:56,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:27:58,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=480380.0, ans=0.125 2023-09-29 20:28:04,861 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 20:28:08,269 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=480446.6666666667, ans=0.1 2023-09-29 20:28:15,333 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.28 vs. limit=15.0 2023-09-29 20:28:17,415 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:28:17,645 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=480513.3333333333, ans=0.035 2023-09-29 20:28:21,065 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=480513.3333333333, ans=0.125 2023-09-29 20:28:22,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 20:28:24,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:28:25,123 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.39 vs. limit=12.0 2023-09-29 20:28:26,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 20:28:26,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:28:26,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:28:29,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:28:29,298 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 20:28:32,254 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 20:28:33,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:28:35,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 20:28:37,964 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 20:28:38,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:28:38,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:28:38,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:28:41,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 20:28:42,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:28:42,728 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:28:44,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:28:47,771 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 20:28:49,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:28:50,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:28:50,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:28:51,149 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=480646.6666666667, ans=0.125 2023-09-29 20:28:53,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:28:53,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:28:56,004 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 20:28:56,062 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 20:28:57,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:28:57,565 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 20:28:57,646 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:28:59,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 20:29:02,317 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 20:29:03,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 20:29:03,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 20:29:05,296 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 20:29:05,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 20:29:05,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:29:06,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:29:08,215 INFO [train.py:1039] (2/4) Epoch 14, batch 3050, loss[loss=0.1988, simple_loss=0.2605, pruned_loss=0.06851, over 23809.00 frames. ], tot_loss[loss=0.1919, simple_loss=0.2647, pruned_loss=0.05959, over 4706165.57 frames. ], batch size: 179, lr: 7.38e-03, grad_scale: 8.0 2023-09-29 20:29:08,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 20:29:08,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:29:08,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:29:13,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 20:29:15,164 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:29:18,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:29:18,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:29:20,437 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=480713.3333333333, ans=0.125 2023-09-29 20:29:21,864 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=480713.3333333333, ans=0.0 2023-09-29 20:29:23,157 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:29:27,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 20:29:30,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 20:29:33,066 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 20:29:33,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:29:36,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:29:37,938 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:29:37,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:29:39,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:29:39,778 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=480846.6666666667, ans=0.125 2023-09-29 20:29:44,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:29:44,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 20:29:44,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:29:46,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:29:46,447 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:29:48,001 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:29:49,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:29:49,902 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=480846.6666666667, ans=0.125 2023-09-29 20:29:52,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:29:52,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 20:29:54,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:29:54,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 20:29:57,084 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.61 vs. limit=12.0 2023-09-29 20:29:57,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:29:57,853 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:29:59,282 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:29:59,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:30:07,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:30:07,271 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:30:13,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:30:13,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:30:13,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:30:16,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:30:18,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 20:30:18,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:30:20,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 20:30:22,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:30:22,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:30:22,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 20:30:25,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:30:30,453 INFO [train.py:1039] (2/4) Epoch 14, batch 3100, loss[loss=0.2012, simple_loss=0.2805, pruned_loss=0.06098, over 23698.00 frames. ], tot_loss[loss=0.191, simple_loss=0.2639, pruned_loss=0.059, over 4713228.44 frames. ], batch size: 85, lr: 7.38e-03, grad_scale: 8.0 2023-09-29 20:30:32,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:30:33,405 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.71 vs. limit=15.0 2023-09-29 20:30:34,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:30:35,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 20:30:38,647 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.952e+02 2.226e+02 2.517e+02 3.865e+02, threshold=4.452e+02, percent-clipped=0.0 2023-09-29 20:30:38,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 20:30:41,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 20:30:42,627 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.87 vs. limit=15.0 2023-09-29 20:30:43,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 20:30:43,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:30:45,735 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:30:45,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:30:50,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 20:30:51,111 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.77 vs. limit=15.0 2023-09-29 20:30:54,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:30:59,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 20:31:01,137 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=481113.3333333333, ans=0.09899494936611666 2023-09-29 20:31:06,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 20:31:07,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:31:07,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:31:07,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:31:08,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 20:31:10,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:31:10,540 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 20:31:10,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:31:12,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:31:13,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 20:31:15,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:31:16,983 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=481180.0, ans=0.125 2023-09-29 20:31:18,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:31:20,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 20:31:20,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 20:31:21,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:31:23,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:31:24,991 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:31:25,017 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:31:26,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:31:26,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 20:31:26,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:31:28,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:31:29,069 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=481246.6666666667, ans=0.0 2023-09-29 20:31:30,907 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:31:30,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:31:30,920 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 20:31:35,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:31:37,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 20:31:39,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:31:39,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 20:31:40,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:31:42,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:31:42,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 20:31:42,677 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=481313.3333333333, ans=0.0 2023-09-29 20:31:53,119 INFO [train.py:1039] (2/4) Epoch 14, batch 3150, loss[loss=0.2035, simple_loss=0.2685, pruned_loss=0.06929, over 23829.00 frames. ], tot_loss[loss=0.1896, simple_loss=0.2621, pruned_loss=0.05854, over 4700424.79 frames. ], batch size: 179, lr: 7.37e-03, grad_scale: 8.0 2023-09-29 20:31:53,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 20:31:56,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:31:57,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:31:58,590 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:31:58,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:32:00,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 20:32:02,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:32:02,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 20:32:03,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 20:32:05,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:32:09,046 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 20:32:12,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 20:32:12,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:32:13,035 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=481446.6666666667, ans=0.125 2023-09-29 20:32:14,133 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 20:32:14,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 20:32:15,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 20:32:16,038 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=481446.6666666667, ans=0.125 2023-09-29 20:32:17,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 20:32:17,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 20:32:17,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:32:17,252 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:32:18,777 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:32:20,272 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 20:32:21,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:32:21,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:32:23,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:32:23,641 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 20:32:28,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 20:32:28,749 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:32:29,045 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=481513.3333333333, ans=0.0 2023-09-29 20:32:29,443 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.17 vs. limit=15.0 2023-09-29 20:32:31,700 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 20:32:31,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:32:33,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 20:32:38,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 20:32:38,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:32:39,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 20:32:39,042 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 20:32:40,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:32:40,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 20:32:40,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 20:32:40,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 20:32:43,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 20:32:43,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 20:32:43,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:32:46,120 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=481580.0, ans=0.125 2023-09-29 20:32:47,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:32:47,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:32:47,528 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 20:32:47,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:32:49,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 20:32:49,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:32:50,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 20:32:50,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 20:32:53,909 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:32:53,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:32:55,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 20:32:56,859 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 20:32:56,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:33:01,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:33:01,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:33:01,588 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:33:09,882 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 20:33:09,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:33:11,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 20:33:16,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:33:16,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 20:33:18,094 INFO [train.py:1039] (2/4) Epoch 14, batch 3200, loss[loss=0.2043, simple_loss=0.2763, pruned_loss=0.06616, over 23398.00 frames. ], tot_loss[loss=0.1882, simple_loss=0.261, pruned_loss=0.05775, over 4706201.71 frames. ], batch size: 93, lr: 7.37e-03, grad_scale: 16.0 2023-09-29 20:33:20,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:33:20,477 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:33:20,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 20:33:24,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:33:26,519 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 1.865e+02 2.106e+02 2.358e+02 3.213e+02, threshold=4.213e+02, percent-clipped=0.0 2023-09-29 20:33:28,310 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 20:33:31,399 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:33:41,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:33:52,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 20:33:53,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:33:56,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 20:33:56,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 20:34:01,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:34:01,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 20:34:03,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:34:07,949 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 20:34:09,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 20:34:09,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 20:34:12,080 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 20:34:15,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:34:21,847 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:34:21,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:34:23,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:34:23,499 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 20:34:23,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 20:34:30,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:34:31,710 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 20:34:33,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 20:34:34,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 20:34:36,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 20:34:37,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:34:40,980 INFO [train.py:1039] (2/4) Epoch 14, batch 3250, loss[loss=0.1838, simple_loss=0.2573, pruned_loss=0.05515, over 18003.00 frames. ], tot_loss[loss=0.189, simple_loss=0.2615, pruned_loss=0.05823, over 4691839.32 frames. ], batch size: 39, lr: 7.37e-03, grad_scale: 16.0 2023-09-29 20:34:41,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 20:34:41,125 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 20:34:41,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:34:41,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:34:42,694 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 20:34:46,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:34:48,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:34:58,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:34:58,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 20:35:00,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:35:02,044 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:35:02,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:35:02,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:35:03,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 20:35:05,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:35:06,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:35:06,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:35:06,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:35:06,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:35:08,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:35:09,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:35:11,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:35:14,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:35:14,643 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:35:16,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:35:16,160 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:35:17,551 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:35:21,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 20:35:23,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:35:23,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:35:25,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:35:27,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 20:35:33,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 20:35:41,137 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:35:42,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:35:42,577 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 20:35:42,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:35:42,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 20:35:42,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:35:44,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 20:35:44,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 20:35:45,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:35:46,429 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.69 vs. limit=12.0 2023-09-29 20:35:47,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:35:47,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:35:48,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 20:35:48,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:35:53,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:35:53,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:35:56,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 20:35:56,541 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:36:00,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:36:00,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 20:36:03,604 INFO [train.py:1039] (2/4) Epoch 14, batch 3300, loss[loss=0.2012, simple_loss=0.2646, pruned_loss=0.06894, over 23478.00 frames. ], tot_loss[loss=0.1904, simple_loss=0.2626, pruned_loss=0.05907, over 4687548.89 frames. ], batch size: 134, lr: 7.37e-03, grad_scale: 16.0 2023-09-29 20:36:03,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:36:03,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 20:36:06,056 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 20:36:07,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 20:36:07,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:36:11,794 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.874e+02 2.156e+02 2.504e+02 3.971e+02, threshold=4.311e+02, percent-clipped=0.0 2023-09-29 20:36:12,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:36:12,425 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=482380.0, ans=0.125 2023-09-29 20:36:13,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:36:13,670 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:36:16,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 20:36:16,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 20:36:18,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:36:21,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:36:25,944 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 20:36:26,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:36:26,081 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:36:27,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:36:27,729 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 20:36:29,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:36:30,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 20:36:30,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 20:36:30,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:36:30,963 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 20:36:34,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:36:34,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 20:36:35,085 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.max_abs, batch_count=482513.3333333333, ans=10.0 2023-09-29 20:36:35,607 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.10 vs. limit=6.0 2023-09-29 20:36:37,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:36:37,097 WARNING [train.py:1197] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 20:36:37,458 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=482513.3333333333, ans=0.125 2023-09-29 20:36:38,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 20:36:39,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:36:40,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:36:43,572 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 20:36:45,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 20:36:45,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:36:46,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 20:36:49,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:36:52,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 20:36:54,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:36:57,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:36:57,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:36:57,315 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:36:57,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 20:37:00,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:37:00,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:37:01,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:37:02,069 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=482580.0, ans=0.0 2023-09-29 20:37:03,257 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 20:37:03,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 20:37:05,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 20:37:07,569 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:37:07,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:37:09,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:37:09,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:37:11,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 20:37:11,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:37:12,765 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 20:37:14,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:37:16,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 20:37:19,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 20:37:21,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:37:21,543 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:37:24,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 20:37:24,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:37:24,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:37:24,982 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=482713.3333333333, ans=0.1 2023-09-29 20:37:25,271 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.06 vs. limit=6.0 2023-09-29 20:37:26,184 INFO [train.py:1039] (2/4) Epoch 14, batch 3350, loss[loss=0.2203, simple_loss=0.2823, pruned_loss=0.07916, over 23758.00 frames. ], tot_loss[loss=0.1905, simple_loss=0.2629, pruned_loss=0.05903, over 4688564.07 frames. ], batch size: 195, lr: 7.36e-03, grad_scale: 16.0 2023-09-29 20:37:27,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:37:27,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:37:30,110 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=8.81 vs. limit=15.0 2023-09-29 20:37:30,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:37:33,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:37:35,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:37:38,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:37:38,798 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=482713.3333333333, ans=0.125 2023-09-29 20:37:40,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:37:42,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:37:42,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:37:44,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 20:37:45,582 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 20:37:47,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:37:49,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 20:37:49,277 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 20:37:49,404 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 20:37:50,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:37:50,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:37:53,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 20:37:53,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:37:53,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:37:56,104 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:37:58,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:37:58,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:38:00,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:38:01,162 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.58 vs. limit=15.0 2023-09-29 20:38:03,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:38:06,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:38:06,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:38:07,672 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.22 vs. limit=15.0 2023-09-29 20:38:08,260 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=482846.6666666667, ans=0.2 2023-09-29 20:38:11,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:38:12,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:38:14,324 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:38:16,289 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:38:17,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:38:19,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 20:38:19,467 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 20:38:21,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 20:38:21,561 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:38:23,129 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 20:38:23,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:38:24,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:38:32,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:38:33,789 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 20:38:33,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 20:38:35,442 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:38:35,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:38:41,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:38:43,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 20:38:43,915 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.32 vs. limit=22.5 2023-09-29 20:38:44,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 20:38:44,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:38:46,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:38:46,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 20:38:48,226 INFO [train.py:1039] (2/4) Epoch 14, batch 3400, loss[loss=0.1854, simple_loss=0.2743, pruned_loss=0.04831, over 24664.00 frames. ], tot_loss[loss=0.1917, simple_loss=0.2641, pruned_loss=0.05968, over 4693030.94 frames. ], batch size: 73, lr: 7.36e-03, grad_scale: 16.0 2023-09-29 20:38:48,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:38:48,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 20:38:49,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:38:49,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:38:50,176 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=483046.6666666667, ans=0.125 2023-09-29 20:38:51,825 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 20:38:53,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:38:53,379 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 20:38:56,128 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.936e+02 2.106e+02 2.472e+02 5.174e+02, threshold=4.212e+02, percent-clipped=2.0 2023-09-29 20:38:58,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 20:38:58,494 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 20:38:58,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:39:02,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:39:02,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 20:39:03,697 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:39:05,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 20:39:12,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:39:14,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 20:39:18,923 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:39:21,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:39:23,905 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:39:25,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 20:39:30,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:39:35,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 20:39:40,744 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:39:42,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:39:42,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 20:39:42,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:39:43,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:39:44,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:39:45,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:39:46,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:39:49,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 20:39:49,939 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:39:53,350 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=483313.3333333333, ans=0.0 2023-09-29 20:39:54,637 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:39:58,280 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 20:40:04,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 20:40:09,465 INFO [train.py:1039] (2/4) Epoch 14, batch 3450, loss[loss=0.2097, simple_loss=0.287, pruned_loss=0.06614, over 23815.00 frames. ], tot_loss[loss=0.1914, simple_loss=0.2644, pruned_loss=0.05916, over 4707068.17 frames. ], batch size: 85, lr: 7.36e-03, grad_scale: 16.0 2023-09-29 20:40:11,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 20:40:14,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 20:40:14,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:40:15,338 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=483380.0, ans=0.1 2023-09-29 20:40:16,456 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:40:16,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 20:40:17,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:40:21,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 20:40:24,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:40:25,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:40:27,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:40:27,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:40:28,027 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.26 vs. limit=15.0 2023-09-29 20:40:30,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:40:37,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 20:40:44,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 20:40:44,395 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 20:40:44,470 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:40:46,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:40:51,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 20:40:51,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 20:40:53,347 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=483513.3333333333, ans=0.07 2023-09-29 20:40:56,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:40:56,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:40:56,468 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=483513.3333333333, ans=0.125 2023-09-29 20:40:57,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 20:40:59,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:41:01,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 20:41:01,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:41:03,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:41:03,864 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.13 vs. limit=15.0 2023-09-29 20:41:08,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:41:11,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 20:41:14,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:41:17,076 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=483646.6666666667, ans=0.0 2023-09-29 20:41:18,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:41:21,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:41:23,523 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:41:28,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:41:28,157 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:41:29,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:41:29,671 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:41:32,570 INFO [train.py:1039] (2/4) Epoch 14, batch 3500, loss[loss=0.1664, simple_loss=0.2409, pruned_loss=0.04589, over 21151.00 frames. ], tot_loss[loss=0.1894, simple_loss=0.2627, pruned_loss=0.05811, over 4710859.59 frames. ], batch size: 46, lr: 7.36e-03, grad_scale: 16.0 2023-09-29 20:41:32,932 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=483713.3333333333, ans=0.125 2023-09-29 20:41:34,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:41:38,099 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:41:38,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 20:41:39,915 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=483713.3333333333, ans=0.125 2023-09-29 20:41:41,684 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.652e+02 2.028e+02 2.360e+02 2.884e+02 5.509e+02, threshold=4.720e+02, percent-clipped=5.0 2023-09-29 20:41:41,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 20:41:42,672 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.79 vs. limit=6.0 2023-09-29 20:41:44,804 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 20:41:46,482 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=483713.3333333333, ans=0.0 2023-09-29 20:41:49,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:41:49,141 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 20:41:54,379 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:41:54,524 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:41:56,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:41:56,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:41:56,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 20:41:58,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:41:58,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:41:59,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 20:42:02,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:42:02,874 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 20:42:04,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:42:04,897 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 20:42:07,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:42:09,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 20:42:09,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:42:13,006 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:42:16,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:42:18,220 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:42:19,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:42:19,822 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:42:21,454 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 20:42:23,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 20:42:23,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 20:42:24,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:42:26,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:42:26,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:42:28,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 20:42:30,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 20:42:30,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:42:34,383 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=483913.3333333333, ans=0.125 2023-09-29 20:42:35,633 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:42:35,765 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=483913.3333333333, ans=0.125 2023-09-29 20:42:38,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 20:42:38,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 20:42:38,367 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:42:40,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:42:40,209 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:42:40,485 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=483980.0, ans=0.1 2023-09-29 20:42:41,801 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:42:46,017 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 20:42:46,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:42:47,664 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:42:49,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 20:42:52,040 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 20:42:55,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:42:55,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:42:55,250 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:42:55,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:42:55,470 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=484046.6666666667, ans=0.0 2023-09-29 20:42:56,645 INFO [train.py:1039] (2/4) Epoch 14, batch 3550, loss[loss=0.1862, simple_loss=0.2723, pruned_loss=0.05005, over 24432.00 frames. ], tot_loss[loss=0.1882, simple_loss=0.2618, pruned_loss=0.05736, over 4719080.07 frames. ], batch size: 69, lr: 7.35e-03, grad_scale: 16.0 2023-09-29 20:42:59,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:43:06,533 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=484046.6666666667, ans=0.125 2023-09-29 20:43:11,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:43:12,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 20:43:14,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:43:16,078 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:43:16,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:43:18,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:43:18,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 20:43:22,093 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:43:23,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:43:23,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:43:23,654 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 20:43:25,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 20:43:25,401 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=484113.3333333333, ans=0.125 2023-09-29 20:43:32,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:43:32,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:43:34,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:43:34,258 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:43:36,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:43:36,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 20:43:36,234 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:43:37,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:43:39,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 20:43:43,965 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.80 vs. limit=15.0 2023-09-29 20:43:46,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:43:46,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:43:47,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:43:49,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 20:43:51,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:43:51,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 20:43:52,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:43:54,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:43:54,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:43:55,031 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=484246.6666666667, ans=0.1 2023-09-29 20:43:57,803 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 20:43:59,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:44:04,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:44:04,311 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 20:44:06,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:44:10,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:44:11,177 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=484313.3333333333, ans=10.0 2023-09-29 20:44:12,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 20:44:19,143 INFO [train.py:1039] (2/4) Epoch 14, batch 3600, loss[loss=0.173, simple_loss=0.2414, pruned_loss=0.05235, over 24414.00 frames. ], tot_loss[loss=0.1885, simple_loss=0.2616, pruned_loss=0.05765, over 4721187.42 frames. ], batch size: 58, lr: 7.35e-03, grad_scale: 32.0 2023-09-29 20:44:19,336 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 20:44:19,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:44:20,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:44:21,259 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=484380.0, ans=0.0 2023-09-29 20:44:22,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:44:22,654 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=484380.0, ans=0.125 2023-09-29 20:44:23,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:44:24,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:44:27,542 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.460e+02 1.842e+02 2.079e+02 2.493e+02 3.972e+02, threshold=4.157e+02, percent-clipped=0.0 2023-09-29 20:44:28,065 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=484380.0, ans=0.125 2023-09-29 20:44:31,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:44:32,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:44:34,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:44:35,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:44:37,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:44:37,238 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 20:44:40,593 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 20:44:40,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:44:44,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:44:48,825 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:44:48,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 20:44:49,042 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:44:49,070 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 20:44:50,479 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:44:53,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:44:55,668 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:44:57,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:44:59,976 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:45:01,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:45:01,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 20:45:08,039 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=484580.0, ans=0.2 2023-09-29 20:45:09,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:45:10,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 20:45:10,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 20:45:15,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:45:20,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:45:23,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:45:29,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 20:45:30,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:45:30,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 20:45:32,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 20:45:32,167 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 20:45:35,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:45:35,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:45:35,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 20:45:37,118 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:45:37,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:45:37,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:45:37,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 20:45:38,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 20:45:42,496 INFO [train.py:1039] (2/4) Epoch 14, batch 3650, loss[loss=0.1604, simple_loss=0.2346, pruned_loss=0.04316, over 24345.00 frames. ], tot_loss[loss=0.1892, simple_loss=0.2627, pruned_loss=0.05788, over 4716216.35 frames. ], batch size: 56, lr: 7.35e-03, grad_scale: 32.0 2023-09-29 20:45:42,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:45:44,036 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 20:45:49,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 20:45:50,771 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:45:54,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 20:45:55,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 20:46:00,867 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:46:00,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:46:02,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 20:46:06,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 20:46:06,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:46:07,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 20:46:07,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 20:46:09,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:46:09,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 20:46:10,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 20:46:11,508 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.80 vs. limit=10.0 2023-09-29 20:46:12,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:46:12,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:46:13,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:46:15,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 20:46:17,576 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 20:46:17,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:46:17,982 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=484846.6666666667, ans=0.0 2023-09-29 20:46:19,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 20:46:20,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:46:20,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:46:23,214 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=484846.6666666667, ans=0.0 2023-09-29 20:46:28,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:46:28,951 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=484846.6666666667, ans=0.125 2023-09-29 20:46:30,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:46:30,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 20:46:31,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:46:33,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:46:33,682 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=484913.3333333333, ans=0.125 2023-09-29 20:46:35,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:46:41,056 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:46:42,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:46:42,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:46:42,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 20:46:44,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:46:45,737 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:46:46,092 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=484913.3333333333, ans=0.1 2023-09-29 20:46:49,143 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=484980.0, ans=0.125 2023-09-29 20:46:52,121 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.19 vs. limit=22.5 2023-09-29 20:46:52,657 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 20:46:54,439 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:46:54,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:46:56,011 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 20:46:56,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:46:57,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:46:59,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:47:01,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 20:47:01,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:47:04,082 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 20:47:05,618 INFO [train.py:1039] (2/4) Epoch 14, batch 3700, loss[loss=0.1946, simple_loss=0.266, pruned_loss=0.06159, over 23412.00 frames. ], tot_loss[loss=0.1896, simple_loss=0.2635, pruned_loss=0.05788, over 4724314.71 frames. ], batch size: 106, lr: 7.35e-03, grad_scale: 32.0 2023-09-29 20:47:05,779 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:47:07,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:47:09,004 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:47:09,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 20:47:09,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:47:11,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 20:47:12,405 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 20:47:13,914 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 1.934e+02 2.131e+02 2.298e+02 2.848e+02, threshold=4.262e+02, percent-clipped=0.0 2023-09-29 20:47:14,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 20:47:17,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:47:18,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:47:18,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:47:18,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:47:19,592 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.82 vs. limit=15.0 2023-09-29 20:47:20,404 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 20:47:22,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:47:24,139 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 20:47:29,234 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.24 vs. limit=12.0 2023-09-29 20:47:33,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:47:33,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 20:47:35,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 20:47:35,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 20:47:35,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:47:37,274 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=485180.0, ans=0.0 2023-09-29 20:47:40,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:47:41,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 20:47:42,357 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:47:43,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:47:44,179 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=485180.0, ans=0.125 2023-09-29 20:47:47,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:47:49,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 20:47:50,817 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=485180.0, ans=0.0 2023-09-29 20:47:52,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 20:47:53,808 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:47:55,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 20:47:55,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:47:56,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 20:48:00,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:48:02,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:48:03,764 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=485246.6666666667, ans=0.125 2023-09-29 20:48:05,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:48:07,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 20:48:07,659 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=485246.6666666667, ans=0.1 2023-09-29 20:48:08,018 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.16 vs. limit=6.0 2023-09-29 20:48:08,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:48:08,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 20:48:10,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:48:10,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:48:13,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:48:14,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 20:48:16,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 20:48:18,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:48:18,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:48:18,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:48:20,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:48:23,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:48:24,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:48:26,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:48:27,890 INFO [train.py:1039] (2/4) Epoch 14, batch 3750, loss[loss=0.1935, simple_loss=0.2567, pruned_loss=0.06518, over 23835.00 frames. ], tot_loss[loss=0.191, simple_loss=0.2644, pruned_loss=0.05885, over 4721142.48 frames. ], batch size: 195, lr: 7.34e-03, grad_scale: 32.0 2023-09-29 20:48:29,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 20:48:31,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 20:48:34,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 20:48:34,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 20:48:35,303 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=485380.0, ans=0.125 2023-09-29 20:48:36,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:48:37,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:48:39,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:48:39,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:48:43,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:48:46,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:48:47,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 20:48:50,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:48:54,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:48:56,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 20:48:56,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:48:58,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:48:58,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:49:02,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 20:49:05,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 20:49:06,125 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=485513.3333333333, ans=0.2 2023-09-29 20:49:06,170 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=485513.3333333333, ans=0.1 2023-09-29 20:49:07,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:49:09,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:49:09,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:49:15,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:49:16,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 20:49:21,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 20:49:23,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:49:28,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:49:29,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:49:33,381 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 20:49:37,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 20:49:38,259 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=485646.6666666667, ans=0.125 2023-09-29 20:49:39,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 20:49:41,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 20:49:43,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:49:44,113 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=5.77 vs. limit=15.0 2023-09-29 20:49:46,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 20:49:51,124 INFO [train.py:1039] (2/4) Epoch 14, batch 3800, loss[loss=0.1724, simple_loss=0.2461, pruned_loss=0.04938, over 24293.00 frames. ], tot_loss[loss=0.1907, simple_loss=0.2645, pruned_loss=0.05846, over 4727242.32 frames. ], batch size: 61, lr: 7.34e-03, grad_scale: 32.0 2023-09-29 20:49:54,784 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:49:57,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:49:59,284 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.914e+02 2.283e+02 2.549e+02 3.824e+02, threshold=4.565e+02, percent-clipped=0.0 2023-09-29 20:49:59,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 20:49:59,524 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 20:49:59,763 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=485713.3333333333, ans=0.0 2023-09-29 20:49:59,914 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=485713.3333333333, ans=0.1 2023-09-29 20:50:00,700 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.72 vs. limit=5.0 2023-09-29 20:50:01,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:50:04,648 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:50:04,765 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 20:50:08,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 20:50:08,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:50:09,573 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:50:12,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:50:12,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:50:12,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:50:12,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 20:50:18,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 20:50:19,674 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:50:22,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:50:25,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:50:25,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 20:50:29,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 20:50:29,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:50:32,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:50:32,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:50:37,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 20:50:37,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 20:50:39,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:50:42,814 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=485913.3333333333, ans=0.125 2023-09-29 20:50:47,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:50:54,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:50:55,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 20:50:57,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 20:50:57,542 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:51:00,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:51:00,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:51:02,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 20:51:05,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 20:51:05,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 20:51:05,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:51:07,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:51:13,730 INFO [train.py:1039] (2/4) Epoch 14, batch 3850, loss[loss=0.1759, simple_loss=0.2264, pruned_loss=0.06269, over 22740.00 frames. ], tot_loss[loss=0.1892, simple_loss=0.2624, pruned_loss=0.05793, over 4730668.61 frames. ], batch size: 322, lr: 7.34e-03, grad_scale: 16.0 2023-09-29 20:51:15,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:51:16,043 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 20:51:20,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:51:20,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 20:51:22,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 20:51:23,829 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:51:26,245 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 20:51:30,744 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:51:31,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 20:51:34,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 20:51:40,908 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:51:42,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:51:44,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:51:46,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:51:50,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:51:50,598 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:51:50,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:51:50,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:51:52,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:51:54,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:51:55,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:51:55,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:51:57,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 20:51:59,320 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 20:51:59,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:51:59,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:52:02,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:52:02,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:52:02,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 20:52:05,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 20:52:07,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:52:09,014 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 20:52:12,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 20:52:18,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:52:19,416 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.76 vs. limit=22.5 2023-09-29 20:52:20,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:52:23,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:52:23,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 20:52:27,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 20:52:29,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:52:29,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:52:32,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 20:52:33,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:52:34,489 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:52:34,681 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=486313.3333333333, ans=0.125 2023-09-29 20:52:35,960 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:52:35,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:52:35,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 20:52:37,420 INFO [train.py:1039] (2/4) Epoch 14, batch 3900, loss[loss=0.197, simple_loss=0.2641, pruned_loss=0.06494, over 23240.00 frames. ], tot_loss[loss=0.1876, simple_loss=0.2606, pruned_loss=0.05732, over 4726507.20 frames. ], batch size: 105, lr: 7.34e-03, grad_scale: 16.0 2023-09-29 20:52:37,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:52:37,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 20:52:39,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:52:39,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:52:39,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:52:40,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:52:41,117 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:52:42,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:52:42,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:52:42,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:52:44,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 20:52:44,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:52:47,046 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.643e+02 1.944e+02 2.146e+02 2.547e+02 3.892e+02, threshold=4.292e+02, percent-clipped=0.0 2023-09-29 20:52:48,736 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:52:48,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 20:52:48,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:52:53,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:52:53,649 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=486446.6666666667, ans=0.125 2023-09-29 20:52:54,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 20:52:56,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:52:58,049 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 20:52:59,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 20:52:59,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:53:03,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 20:53:03,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:53:03,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 20:53:05,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 20:53:08,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:53:10,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:53:10,422 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 20:53:10,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:53:15,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:53:18,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:53:19,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:53:19,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:53:21,192 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:53:24,584 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=486513.3333333333, ans=0.2 2023-09-29 20:53:28,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:53:28,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:53:34,929 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=486580.0, ans=0.125 2023-09-29 20:53:37,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 20:53:38,538 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.03 vs. limit=12.0 2023-09-29 20:53:40,175 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:53:49,927 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:53:51,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:53:53,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 20:53:54,506 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 20:53:54,527 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:53:56,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 20:53:56,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:53:57,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 20:54:01,140 INFO [train.py:1039] (2/4) Epoch 14, batch 3950, loss[loss=0.1933, simple_loss=0.2564, pruned_loss=0.06506, over 18184.00 frames. ], tot_loss[loss=0.1879, simple_loss=0.2605, pruned_loss=0.05767, over 4700708.56 frames. ], batch size: 39, lr: 7.33e-03, grad_scale: 16.0 2023-09-29 20:54:05,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:54:08,071 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 20:54:08,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:54:08,463 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=486713.3333333333, ans=0.125 2023-09-29 20:54:09,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:54:11,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:54:18,275 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 20:54:18,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 20:54:19,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 20:54:19,877 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 20:54:19,927 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:54:21,717 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=486780.0, ans=0.125 2023-09-29 20:54:22,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:54:22,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 20:54:22,969 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:54:25,983 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 20:54:27,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:54:29,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 20:54:29,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 20:54:29,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 20:54:29,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:54:29,918 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=6.79 vs. limit=12.0 2023-09-29 20:54:32,318 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=12.79 vs. limit=15.0 2023-09-29 20:54:41,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:54:43,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:54:49,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 20:54:55,137 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 20:54:55,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 20:54:56,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:54:58,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:55:05,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:55:07,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 20:55:07,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:55:09,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:55:09,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 20:55:14,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:55:16,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:55:19,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 20:55:22,061 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=486980.0, ans=0.1 2023-09-29 20:55:25,241 INFO [train.py:1039] (2/4) Epoch 14, batch 4000, loss[loss=0.1876, simple_loss=0.2677, pruned_loss=0.05377, over 23942.00 frames. ], tot_loss[loss=0.188, simple_loss=0.2609, pruned_loss=0.05753, over 4707269.68 frames. ], batch size: 86, lr: 7.33e-03, grad_scale: 32.0 2023-09-29 20:55:29,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:55:31,644 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=487046.6666666667, ans=0.0 2023-09-29 20:55:34,256 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.507e+02 1.864e+02 2.068e+02 2.328e+02 3.219e+02, threshold=4.135e+02, percent-clipped=0.0 2023-09-29 20:55:39,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:55:42,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:55:42,992 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=487113.3333333333, ans=0.0 2023-09-29 20:55:44,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:55:44,191 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:55:44,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 20:55:46,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 20:55:46,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 20:55:46,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 20:55:46,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 20:55:49,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:55:52,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:55:52,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:55:52,388 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:55:53,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:55:53,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 20:55:57,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:55:57,447 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 20:55:59,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:55:59,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:56:02,583 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 20:56:04,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 20:56:04,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:56:06,122 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=487180.0, ans=0.0 2023-09-29 20:56:10,543 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 20:56:10,608 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:56:11,127 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.15 vs. limit=12.0 2023-09-29 20:56:13,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:56:13,653 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 20:56:15,193 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 20:56:16,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 20:56:16,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:56:16,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:56:18,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:56:21,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:56:21,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 20:56:21,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:56:22,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 20:56:22,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:56:24,185 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 20:56:28,841 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=13.67 vs. limit=15.0 2023-09-29 20:56:30,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 20:56:32,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 20:56:34,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:56:36,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:56:36,457 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=487313.3333333333, ans=0.125 2023-09-29 20:56:37,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:56:39,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:56:43,995 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:56:46,881 INFO [train.py:1039] (2/4) Epoch 14, batch 4050, loss[loss=0.1781, simple_loss=0.2469, pruned_loss=0.05463, over 24441.00 frames. ], tot_loss[loss=0.1889, simple_loss=0.2616, pruned_loss=0.05806, over 4707669.80 frames. ], batch size: 58, lr: 7.33e-03, grad_scale: 32.0 2023-09-29 20:56:46,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 20:56:47,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 20:56:48,787 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=487380.0, ans=0.0 2023-09-29 20:56:49,948 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 20:56:51,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:56:52,036 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:56:54,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 20:56:54,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:56:58,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:56:59,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=487380.0, ans=0.125 2023-09-29 20:57:01,007 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:57:02,360 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 20:57:03,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:57:03,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:57:09,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:57:10,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 20:57:12,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 20:57:13,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 20:57:14,023 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 20:57:15,856 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=487446.6666666667, ans=0.0 2023-09-29 20:57:17,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:57:17,424 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=487446.6666666667, ans=0.125 2023-09-29 20:57:17,463 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=487446.6666666667, ans=0.125 2023-09-29 20:57:23,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 20:57:23,994 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=487513.3333333333, ans=0.2 2023-09-29 20:57:27,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:57:30,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:57:33,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:57:33,819 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:57:33,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:57:37,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:57:40,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 20:57:40,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 20:57:41,651 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten.whitening_limit, batch_count=487580.0, ans=15.0 2023-09-29 20:57:42,422 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:57:46,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 20:57:50,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:57:59,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 20:57:59,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:58:01,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 20:58:02,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 20:58:02,886 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 20:58:02,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:58:04,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:58:06,050 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:58:06,076 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:58:10,376 INFO [train.py:1039] (2/4) Epoch 14, batch 4100, loss[loss=0.1923, simple_loss=0.2558, pruned_loss=0.0644, over 23802.00 frames. ], tot_loss[loss=0.191, simple_loss=0.2631, pruned_loss=0.0594, over 4704907.99 frames. ], batch size: 212, lr: 7.33e-03, grad_scale: 32.0 2023-09-29 20:58:14,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 20:58:15,663 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 20:58:18,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 20:58:20,528 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 1.945e+02 2.209e+02 2.502e+02 4.292e+02, threshold=4.417e+02, percent-clipped=1.0 2023-09-29 20:58:20,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 20:58:20,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:58:20,782 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:58:20,836 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:58:22,251 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 20:58:22,373 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 20:58:22,788 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=487713.3333333333, ans=0.125 2023-09-29 20:58:24,059 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:58:25,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:58:25,565 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:58:27,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:58:32,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 20:58:33,647 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:58:33,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:58:33,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 20:58:35,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:58:35,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:58:35,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:58:37,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:58:37,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 20:58:38,884 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:58:40,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 20:58:42,002 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:58:45,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:58:45,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 20:58:45,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:58:47,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:58:47,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:58:50,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 20:58:52,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 20:58:54,324 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:58:55,870 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 20:58:55,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:58:57,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:58:59,202 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=487913.3333333333, ans=0.1 2023-09-29 20:59:00,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:59:02,327 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=487913.3333333333, ans=0.1 2023-09-29 20:59:05,775 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:59:10,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:59:10,336 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:59:20,645 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.48 vs. limit=12.0 2023-09-29 20:59:21,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:59:21,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:59:22,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=487980.0, ans=0.2 2023-09-29 20:59:25,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:59:28,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:59:33,281 INFO [train.py:1039] (2/4) Epoch 14, batch 4150, loss[loss=0.1904, simple_loss=0.2718, pruned_loss=0.05452, over 23916.00 frames. ], tot_loss[loss=0.1915, simple_loss=0.2637, pruned_loss=0.05961, over 4708453.11 frames. ], batch size: 80, lr: 7.32e-03, grad_scale: 32.0 2023-09-29 20:59:33,474 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:59:34,947 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 20:59:35,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:59:35,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:59:37,016 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=488046.6666666667, ans=0.125 2023-09-29 20:59:38,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 20:59:38,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:59:40,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 20:59:41,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 20:59:41,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 20:59:43,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:59:48,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:59:48,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:59:53,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:59:54,934 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:59:56,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 20:59:57,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 20:59:57,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:59:59,886 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 21:00:04,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:00:09,707 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 21:00:09,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 21:00:12,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 21:00:12,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:00:13,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 21:00:13,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:00:14,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:00:17,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:00:17,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:00:20,012 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=488180.0, ans=0.2 2023-09-29 21:00:23,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 21:00:26,197 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 21:00:29,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:00:29,340 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 21:00:30,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 21:00:32,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 21:00:34,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:00:37,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:00:37,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:00:39,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 21:00:39,068 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:00:39,071 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 21:00:39,356 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=488313.3333333333, ans=0.0 2023-09-29 21:00:41,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 21:00:44,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 21:00:44,500 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=488313.3333333333, ans=0.2 2023-09-29 21:00:44,578 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 21:00:45,662 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:00:45,668 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:00:45,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 21:00:45,828 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 21:00:45,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:00:47,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 21:00:48,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:00:50,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:00:50,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 21:00:50,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 21:00:55,489 INFO [train.py:1039] (2/4) Epoch 14, batch 4200, loss[loss=0.1841, simple_loss=0.2679, pruned_loss=0.05014, over 24639.00 frames. ], tot_loss[loss=0.1905, simple_loss=0.2628, pruned_loss=0.05916, over 4705659.56 frames. ], batch size: 73, lr: 7.32e-03, grad_scale: 32.0 2023-09-29 21:00:55,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:00:59,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 21:01:00,934 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 21:01:03,937 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:01:05,340 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.706e+02 2.014e+02 2.292e+02 2.781e+02 4.764e+02, threshold=4.585e+02, percent-clipped=1.0 2023-09-29 21:01:05,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:01:05,563 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:01:05,565 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:01:09,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 21:01:09,474 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=488380.0, ans=0.1 2023-09-29 21:01:11,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 21:01:12,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:01:14,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:01:18,029 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=488446.6666666667, ans=0.07 2023-09-29 21:01:19,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:01:22,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 21:01:22,381 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:01:23,855 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:01:23,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 21:01:23,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:01:25,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:01:25,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:01:25,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 21:01:27,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:01:28,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 21:01:28,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:01:31,825 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=11.33 vs. limit=15.0 2023-09-29 21:01:33,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 21:01:33,412 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=488513.3333333333, ans=0.0 2023-09-29 21:01:34,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:01:37,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:01:39,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:01:42,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:01:42,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 21:01:42,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:01:43,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:01:44,906 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=488580.0, ans=0.125 2023-09-29 21:01:48,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:01:48,680 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=488580.0, ans=0.2 2023-09-29 21:01:51,402 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:01:56,115 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=488580.0, ans=0.07 2023-09-29 21:01:57,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 21:01:59,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 21:02:02,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:02:09,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 21:02:09,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:02:09,774 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=488646.6666666667, ans=0.125 2023-09-29 21:02:10,226 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.76 vs. limit=6.0 2023-09-29 21:02:11,712 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=9.34 vs. limit=15.0 2023-09-29 21:02:12,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 21:02:12,791 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=488646.6666666667, ans=0.125 2023-09-29 21:02:12,823 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=488646.6666666667, ans=0.125 2023-09-29 21:02:17,727 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 21:02:19,035 INFO [train.py:1039] (2/4) Epoch 14, batch 4250, loss[loss=0.1979, simple_loss=0.2817, pruned_loss=0.05708, over 24659.00 frames. ], tot_loss[loss=0.1896, simple_loss=0.2615, pruned_loss=0.05885, over 4707218.27 frames. ], batch size: 73, lr: 7.32e-03, grad_scale: 32.0 2023-09-29 21:02:22,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:02:22,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 21:02:23,258 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=488713.3333333333, ans=0.2 2023-09-29 21:02:23,707 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=17.75 vs. limit=22.5 2023-09-29 21:02:24,822 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=488713.3333333333, ans=0.2 2023-09-29 21:02:25,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:02:30,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 21:02:31,177 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.44 vs. limit=22.5 2023-09-29 21:02:32,154 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 21:02:32,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:02:34,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:02:38,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:02:44,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:02:45,455 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:02:46,984 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:02:46,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:02:50,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:02:51,593 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:02:53,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:02:55,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:02:57,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:02:58,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 21:03:03,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 21:03:03,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:03:03,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:03:03,598 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:03:04,278 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.04 vs. limit=12.0 2023-09-29 21:03:06,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:03:06,651 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:03:06,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:03:08,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 21:03:11,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 21:03:15,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:03:17,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:03:18,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 21:03:18,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:03:18,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 21:03:20,918 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 21:03:22,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:03:23,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:03:23,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:03:24,201 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=488980.0, ans=0.125 2023-09-29 21:03:24,241 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=488980.0, ans=0.0 2023-09-29 21:03:25,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 21:03:27,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 21:03:27,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 21:03:32,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:03:35,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:03:36,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:03:38,361 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.94 vs. limit=15.0 2023-09-29 21:03:39,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:03:40,510 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:03:42,507 INFO [train.py:1039] (2/4) Epoch 14, batch 4300, loss[loss=0.1712, simple_loss=0.2587, pruned_loss=0.04183, over 24306.00 frames. ], tot_loss[loss=0.1889, simple_loss=0.2609, pruned_loss=0.05844, over 4705507.22 frames. ], batch size: 74, lr: 7.32e-03, grad_scale: 16.0 2023-09-29 21:03:42,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:03:44,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:03:44,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 21:03:45,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:03:50,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:03:50,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:03:53,187 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.403e+02 1.977e+02 2.365e+02 3.006e+02 5.319e+02, threshold=4.729e+02, percent-clipped=1.0 2023-09-29 21:03:57,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:03:58,979 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=489113.3333333333, ans=0.09899494936611666 2023-09-29 21:04:04,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:04:04,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 21:04:06,119 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:04:09,066 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:04:09,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:04:09,123 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 21:04:10,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 21:04:12,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 21:04:15,889 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 21:04:15,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:04:17,866 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 21:04:20,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 21:04:22,277 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:04:22,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:04:23,956 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:04:25,472 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:04:25,728 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=489180.0, ans=0.125 2023-09-29 21:04:27,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:04:28,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:04:28,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 21:04:28,725 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 21:04:32,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:04:34,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:04:34,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 21:04:34,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:04:36,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:04:36,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 21:04:36,193 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 21:04:38,437 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 21:04:38,599 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=489246.6666666667, ans=0.0 2023-09-29 21:04:40,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:04:40,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 21:04:41,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 21:04:46,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:04:47,727 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 21:04:47,847 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:04:49,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:04:49,419 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:04:51,073 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 21:04:52,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 21:04:52,624 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:04:53,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:04:53,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:04:54,830 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:04:57,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:04:58,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:04:59,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:05:01,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:05:05,049 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=489380.0, ans=0.125 2023-09-29 21:05:06,038 INFO [train.py:1039] (2/4) Epoch 14, batch 4350, loss[loss=0.2071, simple_loss=0.2749, pruned_loss=0.06968, over 23475.00 frames. ], tot_loss[loss=0.1891, simple_loss=0.2615, pruned_loss=0.05833, over 4703751.59 frames. ], batch size: 134, lr: 7.31e-03, grad_scale: 16.0 2023-09-29 21:05:06,413 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=489380.0, ans=0.1 2023-09-29 21:05:07,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 21:05:07,707 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 21:05:13,085 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:05:16,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:05:19,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:05:19,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:05:25,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:05:26,987 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:05:27,314 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=489446.6666666667, ans=0.125 2023-09-29 21:05:30,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:05:30,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:05:34,076 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.09 vs. limit=12.0 2023-09-29 21:05:35,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:05:36,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:05:38,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:05:44,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 21:05:44,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:05:46,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:05:50,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:05:53,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 21:05:56,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:05:56,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 21:05:56,509 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=489580.0, ans=0.1 2023-09-29 21:06:01,663 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 21:06:03,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:06:04,648 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 21:06:04,782 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 21:06:06,288 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 21:06:06,297 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:06:06,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:06:07,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:06:07,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:06:09,895 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:06:09,968 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:06:12,207 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.27 vs. limit=15.0 2023-09-29 21:06:13,087 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 21:06:13,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:06:13,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:06:13,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:06:14,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 21:06:16,067 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 21:06:16,074 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 21:06:16,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 21:06:18,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:06:20,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:06:20,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:06:20,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:06:23,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 21:06:26,126 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 21:06:26,137 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:06:27,562 INFO [train.py:1039] (2/4) Epoch 14, batch 4400, loss[loss=0.2114, simple_loss=0.2736, pruned_loss=0.07465, over 23697.00 frames. ], tot_loss[loss=0.1906, simple_loss=0.2629, pruned_loss=0.05913, over 4702201.12 frames. ], batch size: 120, lr: 7.31e-03, grad_scale: 32.0 2023-09-29 21:06:29,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:06:29,270 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:06:32,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:06:35,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 21:06:35,855 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 21:06:37,287 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 21:06:37,329 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 21:06:37,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 21:06:37,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:06:38,942 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.905e+02 2.228e+02 2.642e+02 4.473e+02, threshold=4.456e+02, percent-clipped=0.0 2023-09-29 21:06:40,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 21:06:42,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:06:43,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:06:43,801 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 21:06:47,462 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:06:47,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 21:06:47,547 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 21:06:49,815 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.90 vs. limit=22.5 2023-09-29 21:06:50,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 21:06:52,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 21:06:52,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 21:06:52,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:06:54,851 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:06:54,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:06:56,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:06:59,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 21:06:59,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 21:06:59,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:07:02,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:07:02,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:07:04,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:07:05,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:07:05,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 21:07:07,004 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 21:07:10,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:07:15,866 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=489913.3333333333, ans=0.1 2023-09-29 21:07:16,960 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:07:19,843 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 21:07:20,141 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=489913.3333333333, ans=0.125 2023-09-29 21:07:23,462 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:07:26,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:07:28,823 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:07:30,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 21:07:30,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:07:30,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:07:30,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:07:31,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 21:07:37,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 21:07:39,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 21:07:40,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 21:07:40,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:07:40,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 21:07:42,278 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:07:45,931 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:07:47,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 21:07:49,037 INFO [train.py:1039] (2/4) Epoch 14, batch 4450, loss[loss=0.1677, simple_loss=0.2452, pruned_loss=0.04514, over 24608.00 frames. ], tot_loss[loss=0.1906, simple_loss=0.2634, pruned_loss=0.05891, over 4703250.87 frames. ], batch size: 60, lr: 7.31e-03, grad_scale: 32.0 2023-09-29 21:07:49,581 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=490046.6666666667, ans=0.2 2023-09-29 21:07:50,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:07:53,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:07:55,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 21:07:55,932 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.74 vs. limit=6.0 2023-09-29 21:08:01,994 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:08:02,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:08:05,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:08:07,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:08:10,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:08:12,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:08:13,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 21:08:13,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:08:13,637 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:08:13,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:08:13,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:08:16,814 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 21:08:23,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:08:23,568 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:08:25,138 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:08:26,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:08:27,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:08:31,800 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=490180.0, ans=0.0 2023-09-29 21:08:33,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 21:08:35,473 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 21:08:35,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 21:08:35,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:08:38,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:08:40,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 21:08:41,009 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=490246.6666666667, ans=0.125 2023-09-29 21:08:44,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 21:08:49,419 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:08:49,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 21:08:49,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:08:49,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:08:49,562 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:08:49,573 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:08:52,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:08:56,145 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 21:08:57,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 21:08:59,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 21:08:59,384 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:09:02,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:09:02,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:09:02,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 21:09:06,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 21:09:07,726 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=490313.3333333333, ans=0.1 2023-09-29 21:09:09,308 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=490380.0, ans=0.035 2023-09-29 21:09:10,410 INFO [train.py:1039] (2/4) Epoch 14, batch 4500, loss[loss=0.1905, simple_loss=0.244, pruned_loss=0.06848, over 22720.00 frames. ], tot_loss[loss=0.1907, simple_loss=0.2634, pruned_loss=0.05897, over 4721710.88 frames. ], batch size: 322, lr: 7.31e-03, grad_scale: 16.0 2023-09-29 21:09:10,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 21:09:10,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:09:17,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:09:17,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 21:09:17,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 21:09:19,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:09:23,948 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 1.876e+02 2.126e+02 2.360e+02 4.104e+02, threshold=4.251e+02, percent-clipped=0.0 2023-09-29 21:09:24,238 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:09:25,624 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:09:25,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 21:09:27,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:09:27,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:09:27,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:09:41,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:09:42,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:09:45,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:09:45,665 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:09:47,739 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:09:50,961 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.34 vs. limit=12.0 2023-09-29 21:09:53,048 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 21:09:59,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:10:00,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 21:10:05,067 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 21:10:06,162 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:10:06,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 21:10:07,643 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:10:07,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:10:11,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:10:11,372 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:10:14,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:10:14,377 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 21:10:14,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 21:10:14,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:10:19,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:10:19,567 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=490646.6666666667, ans=0.1 2023-09-29 21:10:20,699 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:10:22,646 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:10:24,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:10:26,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:10:27,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 21:10:29,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 21:10:29,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 21:10:32,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 21:10:34,055 INFO [train.py:1039] (2/4) Epoch 14, batch 4550, loss[loss=0.1966, simple_loss=0.2766, pruned_loss=0.05832, over 23485.00 frames. ], tot_loss[loss=0.1899, simple_loss=0.2627, pruned_loss=0.05855, over 4717319.24 frames. ], batch size: 93, lr: 7.31e-03, grad_scale: 16.0 2023-09-29 21:10:36,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 21:10:36,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:10:39,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:10:41,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:10:45,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:10:49,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:10:52,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:10:52,802 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:10:52,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:10:52,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:10:52,923 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=490780.0, ans=0.125 2023-09-29 21:10:55,805 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:10:57,966 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:11:00,459 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=490780.0, ans=0.125 2023-09-29 21:11:01,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:11:02,035 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=490780.0, ans=0.125 2023-09-29 21:11:03,291 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 21:11:04,769 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 21:11:06,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:11:07,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 21:11:08,289 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=490846.6666666667, ans=0.125 2023-09-29 21:11:09,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 21:11:11,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:11:13,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 21:11:15,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 21:11:18,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:11:18,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:11:19,974 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 21:11:21,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 21:11:25,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:11:27,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:11:27,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:11:29,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:11:31,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 21:11:32,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 21:11:32,811 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:11:32,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 21:11:36,651 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 21:11:36,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:11:38,229 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:11:38,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:11:39,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:11:39,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:11:42,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 21:11:42,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 21:11:44,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:11:44,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 21:11:46,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 21:11:46,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:11:46,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 21:11:51,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:11:51,481 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:11:54,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:11:54,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:11:54,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 21:11:57,358 INFO [train.py:1039] (2/4) Epoch 14, batch 4600, loss[loss=0.1869, simple_loss=0.2649, pruned_loss=0.05441, over 24346.00 frames. ], tot_loss[loss=0.1889, simple_loss=0.2617, pruned_loss=0.05806, over 4709632.96 frames. ], batch size: 77, lr: 7.30e-03, grad_scale: 8.0 2023-09-29 21:11:57,438 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:11:57,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 21:11:57,849 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=491046.6666666667, ans=0.125 2023-09-29 21:12:02,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:12:03,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:12:04,337 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=6.87 vs. limit=12.0 2023-09-29 21:12:07,273 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:12:07,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:12:08,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:12:08,964 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 21:12:11,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:12:12,507 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.411e+02 1.889e+02 2.188e+02 2.520e+02 3.712e+02, threshold=4.377e+02, percent-clipped=0.0 2023-09-29 21:12:14,936 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.27 vs. limit=15.0 2023-09-29 21:12:15,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:12:15,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:12:17,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:12:27,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 21:12:27,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:12:30,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:12:32,907 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=491180.0, ans=0.0 2023-09-29 21:12:34,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:12:34,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:12:38,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 21:12:38,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 21:12:38,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:12:44,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:12:44,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:12:46,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:12:47,195 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.32 vs. limit=22.5 2023-09-29 21:12:50,770 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 21:12:52,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 21:12:57,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:12:58,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:13:01,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:13:01,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 21:13:01,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:13:02,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 21:13:02,687 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:13:02,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:13:05,737 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:13:05,851 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:13:07,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:13:07,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 21:13:07,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 21:13:09,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 21:13:09,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:13:09,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:13:11,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:13:11,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:13:20,988 INFO [train.py:1039] (2/4) Epoch 14, batch 4650, loss[loss=0.1977, simple_loss=0.2692, pruned_loss=0.06312, over 24325.00 frames. ], tot_loss[loss=0.1878, simple_loss=0.2605, pruned_loss=0.05755, over 4704159.23 frames. ], batch size: 61, lr: 7.30e-03, grad_scale: 8.0 2023-09-29 21:13:24,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:13:27,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:13:28,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:13:28,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:13:28,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:13:30,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:13:30,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:13:34,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 21:13:39,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:13:40,939 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 21:13:42,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:13:42,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 21:13:42,558 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:13:44,158 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 21:13:44,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 21:13:44,218 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:13:44,310 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:13:48,090 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 21:13:49,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:13:49,592 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 21:13:53,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:13:56,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 21:13:59,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:14:00,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:14:00,752 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 21:14:02,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:14:06,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:14:09,259 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:14:14,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:14:17,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:14:17,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:14:19,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:14:20,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 21:14:22,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 21:14:23,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 21:14:23,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 21:14:24,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:14:30,628 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.58 vs. limit=15.0 2023-09-29 21:14:31,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 21:14:31,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:14:31,566 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 21:14:32,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:14:34,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:14:34,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:14:36,052 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:14:37,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:14:37,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:14:39,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:14:43,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:14:44,414 INFO [train.py:1039] (2/4) Epoch 14, batch 4700, loss[loss=0.2041, simple_loss=0.2758, pruned_loss=0.06616, over 23285.00 frames. ], tot_loss[loss=0.1885, simple_loss=0.2614, pruned_loss=0.05778, over 4704709.14 frames. ], batch size: 93, lr: 7.30e-03, grad_scale: 8.0 2023-09-29 21:14:44,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:14:44,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 21:14:44,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 21:14:46,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 21:14:48,257 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 21:14:56,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:14:58,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:14:59,753 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.446e+02 1.978e+02 2.336e+02 2.752e+02 4.215e+02, threshold=4.671e+02, percent-clipped=0.0 2023-09-29 21:14:59,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:15:00,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:15:02,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 21:15:08,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 21:15:08,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 21:15:09,810 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:15:11,340 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:15:11,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:15:15,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:15:21,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 21:15:23,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 21:15:25,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:15:31,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 21:15:32,732 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:15:36,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:15:38,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 21:15:40,522 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:15:43,632 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:15:45,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 21:15:46,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:15:46,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:15:51,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:15:51,895 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:15:51,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 21:15:54,064 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 21:15:55,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:15:55,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:15:55,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:15:55,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 21:15:59,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:16:02,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 21:16:05,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:16:08,498 INFO [train.py:1039] (2/4) Epoch 14, batch 4750, loss[loss=0.1819, simple_loss=0.2693, pruned_loss=0.04727, over 24322.00 frames. ], tot_loss[loss=0.1894, simple_loss=0.2623, pruned_loss=0.05825, over 4703292.61 frames. ], batch size: 74, lr: 7.30e-03, grad_scale: 8.0 2023-09-29 21:16:08,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:16:13,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:16:13,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:16:15,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 21:16:15,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:16:18,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 21:16:20,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:16:21,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:16:22,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:16:27,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 21:16:32,497 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:16:35,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 21:16:35,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:16:38,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:16:38,689 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:16:40,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:16:42,293 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 21:16:42,298 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 21:16:48,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 21:16:50,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:16:52,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:16:55,397 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=492180.0, ans=0.07 2023-09-29 21:16:56,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:16:56,473 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 21:16:56,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:16:58,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 21:17:01,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:17:03,601 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=492246.6666666667, ans=0.125 2023-09-29 21:17:04,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 21:17:04,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 21:17:04,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:17:04,877 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:17:04,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:17:06,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 21:17:08,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 21:17:10,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 21:17:11,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:17:16,858 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:17:16,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 21:17:18,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:17:19,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:17:21,319 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:17:23,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:17:23,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 21:17:23,806 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 21:17:26,578 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:17:26,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 21:17:28,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 21:17:29,566 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 21:17:30,934 INFO [train.py:1039] (2/4) Epoch 14, batch 4800, loss[loss=0.1931, simple_loss=0.2671, pruned_loss=0.05957, over 23161.00 frames. ], tot_loss[loss=0.1908, simple_loss=0.2635, pruned_loss=0.05899, over 4713254.28 frames. ], batch size: 93, lr: 7.29e-03, grad_scale: 16.0 2023-09-29 21:17:33,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 21:17:34,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:17:36,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 21:17:40,663 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:17:42,105 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:17:42,387 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=492380.0, ans=0.2 2023-09-29 21:17:45,639 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 1.984e+02 2.307e+02 2.840e+02 4.511e+02, threshold=4.614e+02, percent-clipped=0.0 2023-09-29 21:17:47,279 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:17:48,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:17:50,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:17:50,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 21:17:50,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:17:51,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:17:53,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:17:58,496 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:18:00,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:18:00,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:18:02,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:18:02,262 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 21:18:02,283 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:18:03,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:18:07,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:18:10,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:18:11,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:18:11,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 21:18:13,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 21:18:13,786 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=492513.3333333333, ans=0.0 2023-09-29 21:18:14,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:18:16,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 21:18:16,650 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 21:18:18,150 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:18:19,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:18:19,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:18:19,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:18:19,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 21:18:21,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:18:21,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:18:26,645 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:18:30,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:18:31,850 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:18:37,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 21:18:38,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:18:38,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:18:38,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:18:38,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:18:43,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:18:44,433 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=492646.6666666667, ans=0.1 2023-09-29 21:18:45,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:18:45,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:18:46,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:18:47,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:18:47,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:18:48,988 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=492646.6666666667, ans=0.2 2023-09-29 21:18:51,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:18:51,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:18:51,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:18:53,060 INFO [train.py:1039] (2/4) Epoch 14, batch 4850, loss[loss=0.1888, simple_loss=0.2791, pruned_loss=0.04924, over 24608.00 frames. ], tot_loss[loss=0.1915, simple_loss=0.2648, pruned_loss=0.05907, over 4713643.57 frames. ], batch size: 68, lr: 7.29e-03, grad_scale: 16.0 2023-09-29 21:18:53,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 21:18:54,154 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.26 vs. limit=15.0 2023-09-29 21:18:56,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 21:18:56,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:18:56,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:18:56,567 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:18:56,569 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:19:00,367 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:19:07,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 21:19:09,843 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=15.02 vs. limit=15.0 2023-09-29 21:19:10,687 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:19:13,821 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:19:15,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 21:19:15,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:19:19,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:19:20,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:19:20,890 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=492780.0, ans=0.125 2023-09-29 21:19:22,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:19:22,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 21:19:26,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:19:28,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:19:28,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 21:19:29,048 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=9.29 vs. limit=22.5 2023-09-29 21:19:30,105 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:19:30,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 21:19:32,509 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=492846.6666666667, ans=0.125 2023-09-29 21:19:33,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:19:33,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:19:38,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:19:38,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 21:19:40,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 21:19:40,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 21:19:47,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:19:48,672 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 21:19:50,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:19:50,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:19:53,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 21:19:54,188 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_positive, batch_count=492913.3333333333, ans=0.05 2023-09-29 21:19:55,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 21:19:55,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:19:55,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 21:19:55,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:19:57,029 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:19:57,246 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=492913.3333333333, ans=0.125 2023-09-29 21:19:57,619 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.33 vs. limit=10.0 2023-09-29 21:19:58,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 21:20:07,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:20:13,628 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:20:13,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:20:17,050 INFO [train.py:1039] (2/4) Epoch 14, batch 4900, loss[loss=0.1791, simple_loss=0.2337, pruned_loss=0.06228, over 23420.00 frames. ], tot_loss[loss=0.1911, simple_loss=0.2641, pruned_loss=0.05906, over 4712186.60 frames. ], batch size: 285, lr: 7.29e-03, grad_scale: 16.0 2023-09-29 21:20:18,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 21:20:18,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:20:23,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:20:23,858 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=493046.6666666667, ans=0.0 2023-09-29 21:20:25,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:20:25,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 21:20:26,440 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.06 vs. limit=15.0 2023-09-29 21:20:30,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 21:20:31,618 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.872e+02 2.087e+02 2.309e+02 3.318e+02, threshold=4.174e+02, percent-clipped=0.0 2023-09-29 21:20:33,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 21:20:37,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 21:20:38,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 21:20:40,197 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 21:20:40,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:20:40,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:20:40,307 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:20:40,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:20:40,672 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=493113.3333333333, ans=0.0 2023-09-29 21:20:41,864 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 21:20:48,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 21:20:48,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 21:20:48,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:20:50,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 21:20:52,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:20:53,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:20:55,151 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:20:55,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 21:20:56,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 21:20:58,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:20:58,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 21:20:58,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 21:21:04,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 21:21:06,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 21:21:07,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:21:07,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:21:08,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:21:08,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 21:21:09,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:21:09,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 21:21:11,314 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:21:12,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 21:21:14,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:21:19,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 21:21:21,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:21:21,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 21:21:23,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 21:21:28,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:21:31,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:21:32,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 21:21:33,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 21:21:33,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:21:36,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:21:39,479 INFO [train.py:1039] (2/4) Epoch 14, batch 4950, loss[loss=0.1524, simple_loss=0.2276, pruned_loss=0.03855, over 24454.00 frames. ], tot_loss[loss=0.1894, simple_loss=0.2625, pruned_loss=0.05809, over 4715219.09 frames. ], batch size: 58, lr: 7.29e-03, grad_scale: 16.0 2023-09-29 21:21:39,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:21:39,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 21:21:39,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:21:39,738 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 21:21:42,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 21:21:44,403 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:21:45,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 21:21:49,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 21:21:49,487 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 21:21:49,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 21:21:49,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 21:21:49,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:21:49,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:21:51,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 21:21:51,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:21:54,170 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:21:55,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:21:55,779 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:21:57,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:22:00,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:22:00,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:22:05,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 21:22:10,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:22:10,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:22:12,352 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:22:13,736 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:22:14,158 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=493513.3333333333, ans=0.2 2023-09-29 21:22:15,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:22:16,806 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 21:22:16,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 21:22:19,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:22:22,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:22:22,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:22:23,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:22:23,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:22:25,154 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 21:22:26,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:22:29,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 21:22:32,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:22:34,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:22:34,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:22:36,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 21:22:36,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:22:38,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 21:22:41,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:22:43,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:22:43,110 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:22:45,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:22:45,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:22:46,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:22:48,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:22:48,598 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=493646.6666666667, ans=0.07 2023-09-29 21:22:49,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:22:49,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:22:51,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 21:22:53,187 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=493646.6666666667, ans=0.0 2023-09-29 21:22:54,552 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:23:00,055 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=493713.3333333333, ans=0.0 2023-09-29 21:23:01,161 INFO [train.py:1039] (2/4) Epoch 14, batch 5000, loss[loss=0.1733, simple_loss=0.246, pruned_loss=0.05027, over 20412.00 frames. ], tot_loss[loss=0.1882, simple_loss=0.2612, pruned_loss=0.05759, over 4710993.04 frames. ], batch size: 44, lr: 7.28e-03, grad_scale: 16.0 2023-09-29 21:23:01,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 21:23:01,388 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 21:23:05,328 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.71 vs. limit=22.5 2023-09-29 21:23:06,248 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:23:06,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:23:09,082 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 21:23:09,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 21:23:11,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:23:14,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 21:23:14,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:23:14,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 21:23:16,185 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.615e+02 1.874e+02 2.097e+02 2.409e+02 3.545e+02, threshold=4.194e+02, percent-clipped=0.0 2023-09-29 21:23:16,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 21:23:16,478 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:23:17,929 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:23:19,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 21:23:19,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:23:19,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:23:21,888 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.94 vs. limit=10.0 2023-09-29 21:23:22,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 21:23:22,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 21:23:22,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:23:22,904 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=493780.0, ans=0.0 2023-09-29 21:23:23,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 21:23:23,981 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 21:23:24,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:23:25,512 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 21:23:25,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 21:23:25,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 21:23:28,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 21:23:28,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:23:30,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:23:30,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 21:23:30,842 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:23:33,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:23:34,729 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=8.66 vs. limit=15.0 2023-09-29 21:23:35,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:23:35,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 21:23:35,882 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 21:23:35,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:23:38,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:23:40,780 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 21:23:46,398 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:23:46,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:23:46,567 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:23:46,714 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=493846.6666666667, ans=0.125 2023-09-29 21:23:51,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 21:23:51,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:23:51,324 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:23:51,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:23:53,083 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=493913.3333333333, ans=0.0 2023-09-29 21:23:54,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 21:23:54,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:23:57,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:23:57,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:24:04,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 21:24:06,521 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=493980.0, ans=0.125 2023-09-29 21:24:09,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:24:17,445 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=493980.0, ans=0.125 2023-09-29 21:24:18,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:24:21,593 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:24:21,604 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:24:22,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:24:22,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 21:24:22,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:24:22,884 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:24:23,828 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.30 vs. limit=15.0 2023-09-29 21:24:24,190 INFO [train.py:1039] (2/4) Epoch 14, batch 5050, loss[loss=0.1655, simple_loss=0.2424, pruned_loss=0.04434, over 24514.00 frames. ], tot_loss[loss=0.1882, simple_loss=0.2611, pruned_loss=0.05769, over 4709477.91 frames. ], batch size: 66, lr: 7.28e-03, grad_scale: 16.0 2023-09-29 21:24:28,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:24:28,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 21:24:30,372 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=494046.6666666667, ans=0.2 2023-09-29 21:24:31,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:24:34,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:24:34,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:24:36,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 21:24:37,288 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=494046.6666666667, ans=0.0 2023-09-29 21:24:38,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:24:38,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:24:40,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 21:24:42,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:24:42,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 21:24:44,485 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=494113.3333333333, ans=0.2 2023-09-29 21:24:51,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 21:24:51,772 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 21:24:53,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:24:53,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 21:24:55,408 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:24:56,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:24:58,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:24:59,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:24:59,023 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 21:24:59,269 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.max_abs, batch_count=494180.0, ans=10.0 2023-09-29 21:25:00,452 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 21:25:01,452 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.28 vs. limit=10.0 2023-09-29 21:25:01,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:25:03,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:25:06,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:25:08,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 21:25:09,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:25:13,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 21:25:13,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 21:25:13,548 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=494246.6666666667, ans=0.125 2023-09-29 21:25:14,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:25:14,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:25:16,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:25:18,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:25:20,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:25:21,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:25:21,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:25:21,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:25:23,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 21:25:24,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:25:26,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:25:31,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:25:31,405 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 21:25:31,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 21:25:33,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:25:33,607 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:25:33,647 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 21:25:36,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:25:36,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 21:25:36,696 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:25:41,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:25:41,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:25:41,630 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=494313.3333333333, ans=0.125 2023-09-29 21:25:42,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 21:25:43,256 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=494313.3333333333, ans=0.125 2023-09-29 21:25:44,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 21:25:44,771 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=494380.0, ans=0.125 2023-09-29 21:25:46,463 INFO [train.py:1039] (2/4) Epoch 14, batch 5100, loss[loss=0.1728, simple_loss=0.2487, pruned_loss=0.04848, over 24526.00 frames. ], tot_loss[loss=0.1884, simple_loss=0.2617, pruned_loss=0.0575, over 4722590.72 frames. ], batch size: 66, lr: 7.28e-03, grad_scale: 16.0 2023-09-29 21:25:48,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:25:48,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:25:48,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:25:51,279 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 21:25:53,526 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=494380.0, ans=10.0 2023-09-29 21:25:54,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:25:55,350 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=494380.0, ans=0.2 2023-09-29 21:25:57,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 21:25:58,216 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=494380.0, ans=0.0 2023-09-29 21:25:59,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 21:25:59,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:26:00,834 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.544e+02 1.801e+02 1.996e+02 2.340e+02 4.098e+02, threshold=3.991e+02, percent-clipped=0.0 2023-09-29 21:26:01,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:26:01,294 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=494446.6666666667, ans=0.125 2023-09-29 21:26:04,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:26:04,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 21:26:04,200 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 21:26:10,076 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=494446.6666666667, ans=0.125 2023-09-29 21:26:11,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:26:11,262 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:26:14,695 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=494446.6666666667, ans=0.125 2023-09-29 21:26:15,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:26:18,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 21:26:18,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:26:21,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:26:21,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 21:26:25,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:26:25,602 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:26:25,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 21:26:28,550 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 21:26:28,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:26:28,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 21:26:28,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 21:26:29,041 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=494513.3333333333, ans=0.125 2023-09-29 21:26:32,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:26:40,252 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:26:42,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 21:26:42,502 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 21:26:42,518 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 21:26:45,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 21:26:45,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:26:47,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 21:26:51,635 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 21:26:53,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 21:26:54,052 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=494646.6666666667, ans=0.125 2023-09-29 21:26:55,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:26:55,702 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=494646.6666666667, ans=0.0 2023-09-29 21:26:58,586 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 21:26:58,809 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 21:27:00,801 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 21:27:05,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:27:05,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:27:05,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:27:05,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:27:05,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 21:27:07,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:27:08,850 INFO [train.py:1039] (2/4) Epoch 14, batch 5150, loss[loss=0.2112, simple_loss=0.2706, pruned_loss=0.07592, over 23397.00 frames. ], tot_loss[loss=0.1904, simple_loss=0.2635, pruned_loss=0.05871, over 4721092.21 frames. ], batch size: 285, lr: 7.28e-03, grad_scale: 16.0 2023-09-29 21:27:08,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 21:27:08,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 21:27:10,332 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 21:27:10,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:27:10,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 21:27:11,905 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:27:12,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 21:27:15,558 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:27:15,717 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:27:20,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 21:27:21,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 21:27:21,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:27:23,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 21:27:23,976 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=494780.0, ans=0.07 2023-09-29 21:27:25,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 21:27:25,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:27:25,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:27:26,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:27:26,739 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:27:26,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 21:27:30,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:27:30,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:27:32,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 21:27:34,283 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 21:27:35,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:27:42,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 21:27:43,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 21:27:45,527 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=494846.6666666667, ans=0.125 2023-09-29 21:27:48,928 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:27:53,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:27:55,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:27:57,424 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=494913.3333333333, ans=0.125 2023-09-29 21:28:00,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:28:00,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:28:05,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 21:28:10,297 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:28:10,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=494913.3333333333, ans=0.1 2023-09-29 21:28:11,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 21:28:11,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:28:14,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:28:16,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:28:18,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 21:28:21,289 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=494980.0, ans=0.0 2023-09-29 21:28:21,343 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=494980.0, ans=0.0 2023-09-29 21:28:22,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:28:24,740 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 21:28:28,168 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:28:28,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:28:29,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 21:28:29,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 21:28:29,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:28:29,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:28:31,107 INFO [train.py:1039] (2/4) Epoch 14, batch 5200, loss[loss=0.1691, simple_loss=0.2531, pruned_loss=0.04258, over 24669.00 frames. ], tot_loss[loss=0.1908, simple_loss=0.2644, pruned_loss=0.0586, over 4721304.01 frames. ], batch size: 68, lr: 7.27e-03, grad_scale: 32.0 2023-09-29 21:28:31,691 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=495046.6666666667, ans=0.1 2023-09-29 21:28:32,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:28:33,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:28:36,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:28:36,415 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=495046.6666666667, ans=0.1 2023-09-29 21:28:40,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 21:28:41,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:28:43,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:28:45,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:28:45,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:28:45,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:28:48,046 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.902e+02 2.088e+02 2.453e+02 3.691e+02, threshold=4.175e+02, percent-clipped=0.0 2023-09-29 21:28:48,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 21:28:49,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 21:28:51,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:28:54,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 21:28:57,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 21:28:59,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:29:01,372 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 21:29:01,448 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 21:29:04,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 21:29:04,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:29:04,513 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 21:29:04,523 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:29:07,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:29:07,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:29:09,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 21:29:10,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:29:12,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:29:16,159 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 21:29:16,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 21:29:16,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 21:29:21,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 21:29:22,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 21:29:27,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:29:27,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:29:29,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 21:29:31,372 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:29:31,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 21:29:31,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:29:31,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 21:29:35,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:29:39,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:29:43,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:29:43,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:29:43,984 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:29:47,240 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=495313.3333333333, ans=0.1 2023-09-29 21:29:47,342 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=495313.3333333333, ans=0.125 2023-09-29 21:29:50,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:29:52,177 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 21:29:52,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:29:52,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:29:53,788 INFO [train.py:1039] (2/4) Epoch 14, batch 5250, loss[loss=0.1847, simple_loss=0.2713, pruned_loss=0.04902, over 24431.00 frames. ], tot_loss[loss=0.1905, simple_loss=0.2634, pruned_loss=0.05878, over 4710139.28 frames. ], batch size: 69, lr: 7.27e-03, grad_scale: 16.0 2023-09-29 21:29:54,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:29:54,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 21:29:54,310 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=495380.0, ans=0.125 2023-09-29 21:29:57,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:29:59,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:30:00,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:30:01,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:30:01,182 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=495380.0, ans=0.0 2023-09-29 21:30:02,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:30:06,442 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=495380.0, ans=0.125 2023-09-29 21:30:07,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:30:10,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:30:12,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:30:15,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 21:30:16,069 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=495446.6666666667, ans=0.1 2023-09-29 21:30:17,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 21:30:17,454 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:30:17,588 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:30:38,350 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=495513.3333333333, ans=0.125 2023-09-29 21:30:38,460 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=495513.3333333333, ans=0.125 2023-09-29 21:30:49,356 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=495580.0, ans=0.1 2023-09-29 21:31:08,507 INFO [train.py:1039] (2/4) Epoch 14, batch 5300, loss[loss=0.1631, simple_loss=0.2093, pruned_loss=0.0585, over 19316.00 frames. ], tot_loss[loss=0.1885, simple_loss=0.2607, pruned_loss=0.05817, over 4690474.53 frames. ], batch size: 388, lr: 7.27e-03, grad_scale: 16.0 2023-09-29 21:31:09,567 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=8.76 vs. limit=15.0 2023-09-29 21:31:22,581 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 1.904e+02 2.089e+02 2.457e+02 4.761e+02, threshold=4.177e+02, percent-clipped=1.0 2023-09-29 21:31:25,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:31:25,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 21:31:25,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 21:31:25,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:31:26,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:31:26,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:31:26,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:31:26,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:31:26,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:31:26,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:31:26,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 21:31:27,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:31:27,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 21:31:27,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 21:31:27,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 21:31:27,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 21:31:27,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 21:31:27,697 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 21:31:28,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:31:28,735 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:31:28,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:31:28,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:31:29,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:31:29,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:31:29,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:31:29,670 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:31:29,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:31:29,850 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:31:29,857 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:31:29,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:31:29,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:31:30,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 21:31:30,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:31:31,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:31:31,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 21:31:31,453 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 21:31:32,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:31:32,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:31:32,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 21:31:32,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 21:31:32,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 21:31:33,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:31:33,314 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:31:33,461 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 21:31:33,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 21:31:33,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 21:31:33,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:31:33,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 21:31:33,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 21:31:34,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 21:31:34,329 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 21:31:43,606 INFO [train.py:1039] (2/4) Epoch 15, batch 0, loss[loss=0.174, simple_loss=0.2568, pruned_loss=0.04562, over 24495.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2568, pruned_loss=0.04562, over 24495.00 frames. ], batch size: 66, lr: 7.02e-03, grad_scale: 32.0 2023-09-29 21:31:43,607 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-29 21:31:58,974 INFO [train.py:1071] (2/4) Epoch 15, validation: loss=0.2846, simple_loss=0.2783, pruned_loss=0.1455, over 1125622.00 frames. 2023-09-29 21:31:58,975 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21076MB 2023-09-29 21:32:02,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 21:32:06,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:32:07,820 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:32:11,321 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:32:11,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:32:11,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:32:11,755 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=495800.0, ans=0.2 2023-09-29 21:32:12,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 21:32:14,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 21:32:15,386 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.99 vs. limit=12.0 2023-09-29 21:32:17,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:32:18,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:32:21,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:32:23,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:32:23,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:32:23,528 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:32:25,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 21:32:27,391 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:32:34,026 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.81 vs. limit=15.0 2023-09-29 21:32:35,645 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:32:35,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:32:38,011 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=495933.3333333333, ans=0.125 2023-09-29 21:32:38,223 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.93 vs. limit=22.5 2023-09-29 21:32:38,644 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.41 vs. limit=15.0 2023-09-29 21:32:39,206 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 21:32:41,037 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=495933.3333333333, ans=0.1 2023-09-29 21:32:44,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:32:44,167 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:32:45,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:32:50,563 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:32:53,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:32:55,663 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 21:32:58,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 21:32:59,461 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=496000.0, ans=0.1 2023-09-29 21:33:02,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 21:33:03,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:33:03,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:33:04,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:33:05,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:33:06,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 21:33:11,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:33:11,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:33:16,947 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:33:17,413 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=496066.6666666667, ans=0.125 2023-09-29 21:33:18,791 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 21:33:21,688 INFO [train.py:1039] (2/4) Epoch 15, batch 50, loss[loss=0.1973, simple_loss=0.2789, pruned_loss=0.05783, over 24326.00 frames. ], tot_loss[loss=0.1946, simple_loss=0.2666, pruned_loss=0.06133, over 1059934.27 frames. ], batch size: 74, lr: 7.02e-03, grad_scale: 32.0 2023-09-29 21:33:21,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:33:24,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:33:26,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:33:26,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 21:33:27,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:33:27,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:33:29,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:33:29,681 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=496133.3333333333, ans=0.125 2023-09-29 21:33:32,256 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:33:33,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:33:37,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 21:33:37,721 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:33:42,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 21:33:44,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 21:33:46,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 21:33:48,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:33:49,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:33:49,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:33:50,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:33:52,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 21:33:52,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 21:33:52,920 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:33:59,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:34:00,924 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:34:00,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 21:34:02,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 21:34:04,101 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:34:05,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 21:34:05,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 21:34:07,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:34:10,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 21:34:17,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:34:19,645 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:34:19,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:34:21,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:34:21,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 21:34:25,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 21:34:25,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 21:34:28,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:34:28,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 21:34:28,598 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=496400.0, ans=0.1 2023-09-29 21:34:31,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:34:31,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:34:32,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 21:34:32,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 21:34:34,376 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 21:34:35,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:34:35,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:34:37,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 21:34:37,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 21:34:37,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:34:37,809 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:34:39,136 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.715e+02 2.074e+02 2.565e+02 3.305e+02 5.603e+02, threshold=5.131e+02, percent-clipped=8.0 2023-09-29 21:34:40,182 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.55 vs. limit=15.0 2023-09-29 21:34:40,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 21:34:40,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:34:43,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:34:44,222 INFO [train.py:1039] (2/4) Epoch 15, batch 100, loss[loss=0.2157, simple_loss=0.2767, pruned_loss=0.07738, over 22927.00 frames. ], tot_loss[loss=0.1929, simple_loss=0.265, pruned_loss=0.06041, over 1870424.71 frames. ], batch size: 322, lr: 7.02e-03, grad_scale: 32.0 2023-09-29 21:34:44,773 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=496466.6666666667, ans=0.125 2023-09-29 21:34:45,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:34:46,652 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.25 vs. limit=15.0 2023-09-29 21:34:50,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:34:53,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 21:34:53,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:34:56,701 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=496466.6666666667, ans=0.0 2023-09-29 21:34:57,856 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:34:57,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:34:57,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:34:57,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:34:57,975 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:34:59,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 21:35:03,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 21:35:03,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:35:03,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:35:03,330 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:35:07,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 21:35:10,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:35:11,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:35:12,667 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 21:35:12,907 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=496533.3333333333, ans=0.0 2023-09-29 21:35:14,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 21:35:18,026 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 21:35:19,442 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 21:35:20,991 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:35:20,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:35:22,858 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=496600.0, ans=0.1 2023-09-29 21:35:25,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 21:35:27,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:35:27,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:35:33,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:35:34,020 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 21:35:36,480 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=496666.6666666667, ans=0.1 2023-09-29 21:35:37,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 21:35:42,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:35:44,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:35:45,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:35:48,893 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:35:52,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:35:52,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:35:55,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:35:57,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:35:57,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:35:57,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:35:58,678 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:36:00,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 21:36:00,174 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 21:36:00,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:36:00,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:36:02,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:36:02,383 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:36:02,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 21:36:02,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 21:36:03,976 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 21:36:03,985 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:36:05,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:36:06,711 INFO [train.py:1039] (2/4) Epoch 15, batch 150, loss[loss=0.2014, simple_loss=0.2683, pruned_loss=0.06724, over 23392.00 frames. ], tot_loss[loss=0.1938, simple_loss=0.2655, pruned_loss=0.06104, over 2497204.03 frames. ], batch size: 106, lr: 7.01e-03, grad_scale: 32.0 2023-09-29 21:36:06,900 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:36:08,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:36:08,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:36:10,747 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=9.49 vs. limit=15.0 2023-09-29 21:36:11,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:36:13,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:36:13,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:36:15,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:36:18,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:36:18,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:36:21,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:36:22,218 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.96 vs. limit=15.0 2023-09-29 21:36:23,084 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:36:29,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 21:36:29,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 21:36:29,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 21:36:32,616 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:36:32,626 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 21:36:32,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:36:34,247 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:36:34,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:36:34,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:36:34,399 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:36:36,089 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-29 21:36:39,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:36:41,650 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=496933.3333333333, ans=0.2 2023-09-29 21:36:43,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:36:46,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:36:48,324 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 21:36:52,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:36:52,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:36:52,890 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:36:54,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:36:54,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:36:56,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 21:36:57,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:36:57,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 21:37:00,625 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=497000.0, ans=0.0 2023-09-29 21:37:01,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:37:04,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:37:04,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:37:04,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:37:08,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:37:08,672 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=497000.0, ans=0.0 2023-09-29 21:37:09,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 21:37:12,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:37:14,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:37:15,783 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:37:19,310 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 21:37:19,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 21:37:19,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:37:19,410 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 21:37:23,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:37:26,181 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.515e+02 1.806e+02 2.055e+02 2.590e+02 4.271e+02, threshold=4.110e+02, percent-clipped=0.0 2023-09-29 21:37:26,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:37:26,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:37:29,050 INFO [train.py:1039] (2/4) Epoch 15, batch 200, loss[loss=0.1933, simple_loss=0.2613, pruned_loss=0.06265, over 23475.00 frames. ], tot_loss[loss=0.1933, simple_loss=0.2659, pruned_loss=0.06039, over 2998651.80 frames. ], batch size: 134, lr: 7.01e-03, grad_scale: 16.0 2023-09-29 21:37:29,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 21:37:30,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:37:30,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:37:34,552 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 21:37:36,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 21:37:36,786 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.21 vs. limit=15.0 2023-09-29 21:37:39,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:37:39,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:37:44,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:37:44,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:37:45,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:38:08,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:38:10,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:38:10,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:38:10,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:38:12,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 21:38:12,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 21:38:14,111 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=497266.6666666667, ans=0.125 2023-09-29 21:38:14,163 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=497266.6666666667, ans=0.125 2023-09-29 21:38:15,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:38:16,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:38:18,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:38:20,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:38:21,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 21:38:21,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 21:38:21,698 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:38:25,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:38:30,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:38:32,097 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=497333.3333333333, ans=0.0 2023-09-29 21:38:38,989 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:38:39,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:38:47,301 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:38:48,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 21:38:50,379 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:38:50,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 21:38:50,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:38:51,883 INFO [train.py:1039] (2/4) Epoch 15, batch 250, loss[loss=0.1718, simple_loss=0.2536, pruned_loss=0.04502, over 24325.00 frames. ], tot_loss[loss=0.1918, simple_loss=0.2642, pruned_loss=0.05968, over 3384867.10 frames. ], batch size: 61, lr: 7.01e-03, grad_scale: 16.0 2023-09-29 21:38:51,972 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:38:53,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 21:38:54,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:38:54,926 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 21:38:56,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:38:58,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:39:02,045 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:39:02,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:39:03,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:39:03,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:39:05,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:39:05,885 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=497466.6666666667, ans=0.1 2023-09-29 21:39:09,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:39:20,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:39:24,036 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:39:25,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:39:31,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 21:39:31,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 21:39:33,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:39:35,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:39:35,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 21:39:35,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:39:35,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:39:38,640 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:39:41,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 21:39:43,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:39:44,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:39:45,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 21:39:45,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:39:45,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:39:47,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:39:47,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 21:39:48,929 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:39:50,432 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:39:50,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:39:55,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=497666.6666666667, ans=0.125 2023-09-29 21:39:56,236 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:40:00,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:40:02,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:40:06,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:40:08,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:40:12,378 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.473e+02 1.826e+02 2.078e+02 2.374e+02 4.039e+02, threshold=4.156e+02, percent-clipped=0.0 2023-09-29 21:40:12,583 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 21:40:14,647 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:40:14,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:40:16,024 INFO [train.py:1039] (2/4) Epoch 15, batch 300, loss[loss=0.1752, simple_loss=0.2612, pruned_loss=0.0446, over 24544.00 frames. ], tot_loss[loss=0.1899, simple_loss=0.2623, pruned_loss=0.05873, over 3668606.62 frames. ], batch size: 71, lr: 7.01e-03, grad_scale: 16.0 2023-09-29 21:40:16,319 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=497800.0, ans=0.125 2023-09-29 21:40:17,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 21:40:17,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 21:40:19,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:40:19,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 21:40:24,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:40:25,516 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:40:31,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:40:31,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 21:40:31,269 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:40:32,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 21:40:34,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 21:40:34,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:40:37,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:40:41,994 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:40:44,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 21:40:44,275 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=497866.6666666667, ans=0.07 2023-09-29 21:40:48,565 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 21:40:48,628 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:40:52,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:40:55,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:40:55,213 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 21:40:55,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:40:55,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:40:58,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:40:58,495 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:41:04,036 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.97 vs. limit=15.0 2023-09-29 21:41:04,305 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.20 vs. limit=12.0 2023-09-29 21:41:04,975 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 21:41:04,982 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 21:41:05,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:41:08,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:41:10,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 21:41:11,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:41:15,424 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:41:17,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:41:17,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 21:41:19,286 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.42 vs. limit=22.5 2023-09-29 21:41:21,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:41:21,575 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 21:41:23,721 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:41:25,281 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 21:41:26,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 21:41:26,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 21:41:28,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:41:29,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 21:41:32,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:41:32,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:41:34,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:41:34,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:41:35,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:41:37,674 INFO [train.py:1039] (2/4) Epoch 15, batch 350, loss[loss=0.2051, simple_loss=0.2831, pruned_loss=0.06349, over 24055.00 frames. ], tot_loss[loss=0.1888, simple_loss=0.2618, pruned_loss=0.0579, over 3902954.72 frames. ], batch size: 80, lr: 7.00e-03, grad_scale: 16.0 2023-09-29 21:41:40,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:41:40,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 21:41:41,764 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=498133.3333333333, ans=0.2 2023-09-29 21:41:44,540 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:41:49,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:41:52,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:41:54,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:41:57,212 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 21:41:59,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:41:59,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 21:42:01,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:42:02,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 21:42:02,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:42:06,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 21:42:08,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:42:08,904 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=498266.6666666667, ans=0.125 2023-09-29 21:42:10,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:42:10,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:42:12,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:42:13,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:42:13,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:42:13,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:42:13,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:42:16,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:42:17,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:42:24,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:42:24,306 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 21:42:25,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:42:27,267 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:42:31,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 21:42:31,949 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:42:36,236 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=498333.3333333333, ans=0.125 2023-09-29 21:42:37,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:42:37,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:42:37,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:42:40,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 21:42:40,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:42:42,049 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 21:42:42,464 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=498400.0, ans=0.125 2023-09-29 21:42:43,646 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 21:42:43,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:42:45,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:42:45,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 21:42:48,310 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=498400.0, ans=0.125 2023-09-29 21:42:49,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:42:49,626 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=498400.0, ans=0.125 2023-09-29 21:42:50,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:42:52,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:42:54,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:42:54,376 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:42:56,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:42:57,504 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.447e+02 1.856e+02 2.198e+02 2.696e+02 4.798e+02, threshold=4.395e+02, percent-clipped=2.0 2023-09-29 21:42:59,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:43:00,683 INFO [train.py:1039] (2/4) Epoch 15, batch 400, loss[loss=0.2005, simple_loss=0.2839, pruned_loss=0.05855, over 24327.00 frames. ], tot_loss[loss=0.1879, simple_loss=0.2608, pruned_loss=0.05754, over 4074079.16 frames. ], batch size: 74, lr: 7.00e-03, grad_scale: 32.0 2023-09-29 21:43:00,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 21:43:02,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 21:43:02,388 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:43:03,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:43:05,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:43:07,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:43:10,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:43:12,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:43:13,671 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 21:43:13,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 21:43:13,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:43:15,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 21:43:15,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:43:21,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:43:21,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:43:21,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 21:43:21,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:43:22,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:43:22,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:43:22,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:43:27,353 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 21:43:27,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 21:43:28,470 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.14 vs. limit=15.0 2023-09-29 21:43:32,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:43:33,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:43:35,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 21:43:35,409 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=498600.0, ans=0.1 2023-09-29 21:43:36,505 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 21:43:39,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:43:42,152 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:43:45,660 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=498600.0, ans=0.04949747468305833 2023-09-29 21:43:48,550 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 21:43:52,595 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 21:43:54,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 21:43:58,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:44:01,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 21:44:02,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 21:44:04,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:44:04,950 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.59 vs. limit=15.0 2023-09-29 21:44:07,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 21:44:08,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:44:13,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:44:13,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 21:44:15,073 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 21:44:15,245 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=498733.3333333333, ans=0.125 2023-09-29 21:44:16,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 21:44:18,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:44:18,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:44:20,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 21:44:21,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 21:44:23,338 INFO [train.py:1039] (2/4) Epoch 15, batch 450, loss[loss=0.1875, simple_loss=0.2665, pruned_loss=0.05424, over 23310.00 frames. ], tot_loss[loss=0.1887, simple_loss=0.2617, pruned_loss=0.05782, over 4219876.29 frames. ], batch size: 93, lr: 7.00e-03, grad_scale: 32.0 2023-09-29 21:44:23,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:44:23,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 21:44:25,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 21:44:25,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:44:26,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:44:27,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:44:28,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 21:44:30,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:44:32,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 21:44:33,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:44:43,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:44:45,033 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:44:45,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 21:44:46,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 21:44:47,392 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.39 vs. limit=12.0 2023-09-29 21:44:50,064 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=498866.6666666667, ans=0.125 2023-09-29 21:44:53,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 21:44:54,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:44:57,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:45:00,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:45:00,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:45:02,541 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_positive, batch_count=498933.3333333333, ans=0.05 2023-09-29 21:45:03,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 21:45:05,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 21:45:07,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 21:45:07,852 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:45:09,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:45:09,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:45:11,113 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 21:45:11,127 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 21:45:11,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:45:11,400 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=499000.0, ans=0.0 2023-09-29 21:45:13,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:45:13,672 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 21:45:17,373 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.92 vs. limit=15.0 2023-09-29 21:45:18,208 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 21:45:18,275 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:45:19,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 21:45:19,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 21:45:22,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:45:24,825 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 21:45:24,869 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 21:45:26,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 21:45:29,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:45:31,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 21:45:31,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 21:45:32,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:45:39,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:45:39,644 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=499066.6666666667, ans=0.125 2023-09-29 21:45:40,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:45:42,927 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:45:42,962 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 21:45:44,924 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.892e+02 2.180e+02 2.454e+02 3.588e+02, threshold=4.360e+02, percent-clipped=0.0 2023-09-29 21:45:46,452 INFO [train.py:1039] (2/4) Epoch 15, batch 500, loss[loss=0.1821, simple_loss=0.2627, pruned_loss=0.05077, over 24661.00 frames. ], tot_loss[loss=0.1889, simple_loss=0.2623, pruned_loss=0.05771, over 4337158.29 frames. ], batch size: 68, lr: 7.00e-03, grad_scale: 16.0 2023-09-29 21:45:48,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:45:49,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:45:51,081 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:45:51,096 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 21:45:52,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 21:45:52,632 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:45:54,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 21:45:59,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 21:46:01,159 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:46:02,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:46:02,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:46:04,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:46:16,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:46:18,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 21:46:18,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 21:46:20,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:46:20,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 21:46:20,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:46:24,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:46:25,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:46:25,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:46:25,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:46:27,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 21:46:30,142 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 21:46:31,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:46:33,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:46:34,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:46:34,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:46:37,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 21:46:38,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 21:46:41,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:46:41,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:46:46,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:46:50,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:46:56,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:46:57,297 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=499400.0, ans=0.05 2023-09-29 21:47:01,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 21:47:01,313 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:47:01,331 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:47:04,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 21:47:04,562 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 21:47:06,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:47:07,615 INFO [train.py:1039] (2/4) Epoch 15, batch 550, loss[loss=0.1932, simple_loss=0.2774, pruned_loss=0.05451, over 24439.00 frames. ], tot_loss[loss=0.1888, simple_loss=0.2626, pruned_loss=0.05752, over 4410551.72 frames. ], batch size: 69, lr: 6.99e-03, grad_scale: 16.0 2023-09-29 21:47:08,076 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=499466.6666666667, ans=0.0 2023-09-29 21:47:09,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 21:47:10,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 21:47:12,980 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:47:13,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 21:47:14,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:47:14,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:47:16,015 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:47:16,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:47:16,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:47:17,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:47:19,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:47:22,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 21:47:22,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:47:28,108 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:47:28,272 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=499533.3333333333, ans=0.125 2023-09-29 21:47:30,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:47:33,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:47:33,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:47:33,431 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=499533.3333333333, ans=0.0 2023-09-29 21:47:37,844 WARNING [train.py:1197] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 21:47:39,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 21:47:39,902 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=499600.0, ans=0.125 2023-09-29 21:47:41,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:47:42,138 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.30 vs. limit=22.5 2023-09-29 21:47:44,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:47:44,551 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 21:47:46,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 21:47:51,006 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:47:51,014 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 21:47:52,463 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:47:52,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 21:47:52,790 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=499600.0, ans=0.1 2023-09-29 21:47:55,783 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 21:47:57,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 21:47:57,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 21:47:59,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:47:59,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 21:48:02,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 21:48:04,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:48:04,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:48:06,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:48:06,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:48:08,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:48:09,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 21:48:12,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:48:13,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:48:14,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 21:48:16,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:48:18,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:48:19,083 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:48:19,167 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:48:21,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 21:48:21,298 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 21:48:25,040 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.09 vs. limit=12.0 2023-09-29 21:48:27,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 21:48:28,880 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 1.896e+02 2.048e+02 2.393e+02 3.212e+02, threshold=4.096e+02, percent-clipped=0.0 2023-09-29 21:48:30,459 INFO [train.py:1039] (2/4) Epoch 15, batch 600, loss[loss=0.1715, simple_loss=0.2566, pruned_loss=0.04321, over 24649.00 frames. ], tot_loss[loss=0.1886, simple_loss=0.2621, pruned_loss=0.05752, over 4473148.08 frames. ], batch size: 68, lr: 6.99e-03, grad_scale: 16.0 2023-09-29 21:48:31,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 21:48:33,461 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:48:33,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 21:48:33,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:48:38,536 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=499800.0, ans=0.125 2023-09-29 21:48:41,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:48:41,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 21:48:42,986 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 21:48:45,107 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.50 vs. limit=6.0 2023-09-29 21:48:45,852 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 21:48:47,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:48:49,095 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:48:52,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 21:48:52,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:48:58,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 21:49:01,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:49:01,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:49:03,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:49:09,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:49:09,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:49:09,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:49:12,865 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=499933.3333333333, ans=0.0 2023-09-29 21:49:17,083 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:49:22,899 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:49:22,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:49:22,918 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:49:30,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 21:49:36,998 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=500066.6666666667, ans=0.2 2023-09-29 21:49:38,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 21:49:38,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:49:40,113 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=500066.6666666667, ans=0.0 2023-09-29 21:49:43,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 21:49:45,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:49:45,919 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=500066.6666666667, ans=0.2 2023-09-29 21:49:49,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 21:49:49,210 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:49:50,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 21:49:51,037 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=500066.6666666667, ans=0.0 2023-09-29 21:49:53,642 INFO [train.py:1039] (2/4) Epoch 15, batch 650, loss[loss=0.1738, simple_loss=0.2572, pruned_loss=0.04526, over 24521.00 frames. ], tot_loss[loss=0.1874, simple_loss=0.2612, pruned_loss=0.05683, over 4526831.37 frames. ], batch size: 66, lr: 6.99e-03, grad_scale: 8.0 2023-09-29 21:49:53,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 21:49:55,539 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:49:57,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:49:59,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:50:00,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:50:04,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 21:50:05,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:50:10,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:50:10,425 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:50:15,007 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:50:19,253 WARNING [train.py:1197] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 21:50:20,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:50:20,876 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:50:25,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:50:25,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 21:50:28,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:50:30,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:50:32,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 21:50:32,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:50:33,741 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:50:35,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:50:35,501 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 21:50:35,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:50:35,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:50:40,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:50:40,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:50:40,409 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=500266.6666666667, ans=0.125 2023-09-29 21:50:41,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:50:43,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 21:50:44,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 21:50:44,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:50:44,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:50:46,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 21:50:46,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:50:48,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 21:50:48,824 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=500333.3333333333, ans=0.025 2023-09-29 21:50:50,111 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 21:50:52,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 21:50:52,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:50:52,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:50:52,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:50:53,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:50:55,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:50:59,999 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:51:00,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:51:01,728 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:51:01,963 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=500400.0, ans=0.0 2023-09-29 21:51:05,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:51:05,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 21:51:05,415 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:51:05,828 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=500400.0, ans=0.2 2023-09-29 21:51:12,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 21:51:12,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:51:14,452 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:51:14,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:51:15,839 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.967e+02 2.230e+02 2.701e+02 4.378e+02, threshold=4.460e+02, percent-clipped=5.0 2023-09-29 21:51:15,882 INFO [train.py:1039] (2/4) Epoch 15, batch 700, loss[loss=0.1823, simple_loss=0.2487, pruned_loss=0.05789, over 23364.00 frames. ], tot_loss[loss=0.1863, simple_loss=0.2593, pruned_loss=0.05666, over 4551930.31 frames. ], batch size: 285, lr: 6.99e-03, grad_scale: 8.0 2023-09-29 21:51:20,579 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 21:51:21,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 21:51:22,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 21:51:24,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:51:25,054 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=500466.6666666667, ans=0.1 2023-09-29 21:51:28,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:51:29,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 21:51:32,972 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:51:36,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:51:36,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:51:39,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 21:51:39,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:51:41,557 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=500533.3333333333, ans=0.1 2023-09-29 21:51:42,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:51:45,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 21:51:45,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:51:47,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 21:51:49,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 21:51:53,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 21:51:54,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:51:54,811 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=500600.0, ans=0.0 2023-09-29 21:51:56,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 21:52:03,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:52:03,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 21:52:07,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:52:09,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:52:09,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 21:52:14,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:52:15,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:52:16,261 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=500666.6666666667, ans=0.2 2023-09-29 21:52:18,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:52:19,257 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=500666.6666666667, ans=0.2 2023-09-29 21:52:25,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:52:25,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 21:52:27,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 21:52:27,320 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=500733.3333333333, ans=0.125 2023-09-29 21:52:28,400 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 21:52:30,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:52:32,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:52:32,995 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:52:34,655 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:52:34,675 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 21:52:39,407 INFO [train.py:1039] (2/4) Epoch 15, batch 750, loss[loss=0.1897, simple_loss=0.2553, pruned_loss=0.06204, over 23809.00 frames. ], tot_loss[loss=0.1868, simple_loss=0.2596, pruned_loss=0.057, over 4580960.36 frames. ], batch size: 212, lr: 6.99e-03, grad_scale: 8.0 2023-09-29 21:52:39,777 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=500800.0, ans=0.125 2023-09-29 21:52:40,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 21:52:41,013 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 21:52:41,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 21:52:42,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 21:52:42,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 21:52:44,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:52:45,073 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=500800.0, ans=0.125 2023-09-29 21:52:46,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 21:52:46,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:52:47,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 21:52:48,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:52:48,177 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=500800.0, ans=0.0 2023-09-29 21:52:49,563 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:52:50,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 21:52:51,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:52:53,009 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=500800.0, ans=0.125 2023-09-29 21:52:54,142 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:52:54,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:52:57,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:52:58,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:52:58,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:53:00,964 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 21:53:01,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:53:03,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:53:04,707 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:53:04,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 21:53:05,502 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.68 vs. limit=15.0 2023-09-29 21:53:06,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 21:53:07,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:53:10,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 21:53:10,094 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 21:53:11,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 21:53:11,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:53:11,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 21:53:14,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:53:21,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 21:53:22,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:53:22,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 21:53:24,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:53:25,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:53:26,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 21:53:27,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:53:27,960 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=501000.0, ans=0.0 2023-09-29 21:53:29,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 21:53:29,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:53:31,122 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=501000.0, ans=0.0 2023-09-29 21:53:32,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:53:32,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 21:53:34,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:53:40,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:53:41,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:53:42,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:53:44,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:53:49,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 21:53:49,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:53:49,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:53:54,442 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:53:54,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:53:57,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:53:57,629 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 21:54:02,040 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.875e+02 2.035e+02 2.283e+02 3.726e+02, threshold=4.071e+02, percent-clipped=0.0 2023-09-29 21:54:02,083 INFO [train.py:1039] (2/4) Epoch 15, batch 800, loss[loss=0.2043, simple_loss=0.2714, pruned_loss=0.06857, over 23669.00 frames. ], tot_loss[loss=0.1876, simple_loss=0.2606, pruned_loss=0.05734, over 4614676.04 frames. ], batch size: 232, lr: 6.98e-03, grad_scale: 16.0 2023-09-29 21:54:03,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:54:03,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:54:07,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:54:07,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:54:09,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:54:09,192 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:54:11,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:54:15,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:54:15,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:54:17,533 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=501200.0, ans=0.125 2023-09-29 21:54:18,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 21:54:20,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:54:21,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:54:21,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:54:21,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:54:23,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 21:54:23,954 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:54:24,840 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.02 vs. limit=15.0 2023-09-29 21:54:25,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 21:54:28,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:54:32,247 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:54:35,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:54:35,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:54:38,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:54:38,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:54:42,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:54:43,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 21:54:43,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 21:54:47,237 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 21:54:47,273 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 21:54:47,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:54:47,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:54:48,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:54:48,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:54:50,926 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.06 vs. limit=15.0 2023-09-29 21:54:55,131 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 21:54:55,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 21:54:58,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:54:59,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:55:04,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:55:09,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:55:11,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 21:55:11,376 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:55:14,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 21:55:21,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:55:24,939 INFO [train.py:1039] (2/4) Epoch 15, batch 850, loss[loss=0.205, simple_loss=0.2912, pruned_loss=0.05944, over 24329.00 frames. ], tot_loss[loss=0.1883, simple_loss=0.2616, pruned_loss=0.0575, over 4643748.98 frames. ], batch size: 74, lr: 6.98e-03, grad_scale: 16.0 2023-09-29 21:55:24,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:55:25,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 21:55:25,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:55:26,693 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:55:26,886 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=501466.6666666667, ans=0.125 2023-09-29 21:55:28,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 21:55:28,690 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.13 vs. limit=15.0 2023-09-29 21:55:29,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:55:31,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:55:32,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:55:34,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 21:55:35,714 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:55:37,249 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 21:55:37,328 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 21:55:39,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 21:55:40,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:55:41,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:55:42,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:55:42,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:55:44,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:55:50,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:55:50,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:55:51,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 21:55:55,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 21:55:58,897 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:56:01,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 21:56:05,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 21:56:07,119 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 21:56:08,830 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 21:56:08,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:56:08,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:56:08,883 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 21:56:11,897 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:56:12,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:56:13,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 21:56:16,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:56:17,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:56:18,588 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:56:18,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 21:56:20,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:56:21,509 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.28 vs. limit=12.0 2023-09-29 21:56:22,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 21:56:24,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 21:56:27,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:56:28,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:56:28,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:56:28,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:56:30,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:56:30,562 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=501733.3333333333, ans=0.0 2023-09-29 21:56:34,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:56:36,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:56:37,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 21:56:39,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:56:39,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 21:56:45,645 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=501800.0, ans=0.1 2023-09-29 21:56:46,692 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.806e+02 2.003e+02 2.266e+02 2.717e+02, threshold=4.007e+02, percent-clipped=0.0 2023-09-29 21:56:46,735 INFO [train.py:1039] (2/4) Epoch 15, batch 900, loss[loss=0.211, simple_loss=0.2754, pruned_loss=0.07331, over 23908.00 frames. ], tot_loss[loss=0.1895, simple_loss=0.2626, pruned_loss=0.05817, over 4658733.30 frames. ], batch size: 180, lr: 6.98e-03, grad_scale: 16.0 2023-09-29 21:56:48,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 21:56:50,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:56:50,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 21:56:51,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:56:51,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:56:53,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 21:56:59,102 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=501800.0, ans=0.125 2023-09-29 21:57:00,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:57:03,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:57:03,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 21:57:07,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 21:57:07,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 21:57:09,157 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 21:57:09,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:57:09,313 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:57:10,769 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 21:57:10,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:57:11,187 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=501866.6666666667, ans=0.0 2023-09-29 21:57:22,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:57:22,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:57:22,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 21:57:25,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:57:28,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 21:57:31,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:57:36,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 21:57:36,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 21:57:38,377 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 21:57:38,489 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 21:57:45,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 21:57:46,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:57:46,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 21:57:53,005 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:57:53,022 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:57:54,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 21:57:54,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:57:56,291 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 21:57:58,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 21:57:58,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:58:01,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:58:01,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:58:06,773 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 21:58:06,847 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 21:58:08,464 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 21:58:08,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 21:58:09,344 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.83 vs. limit=22.5 2023-09-29 21:58:09,934 INFO [train.py:1039] (2/4) Epoch 15, batch 950, loss[loss=0.1686, simple_loss=0.2415, pruned_loss=0.04782, over 20992.00 frames. ], tot_loss[loss=0.191, simple_loss=0.264, pruned_loss=0.05895, over 4659173.94 frames. ], batch size: 46, lr: 6.98e-03, grad_scale: 16.0 2023-09-29 21:58:12,920 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:58:15,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 21:58:21,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:58:23,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:58:23,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:58:25,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 21:58:28,121 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 21:58:30,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:58:30,414 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:58:30,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:58:30,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:58:31,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 21:58:33,547 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 21:58:35,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:58:36,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 21:58:36,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:58:43,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:58:43,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:58:43,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:58:45,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 21:58:49,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 21:58:50,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:58:52,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 21:58:58,312 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:58:58,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:59:01,448 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 21:59:02,910 WARNING [train.py:1197] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 21:59:02,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:59:04,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:59:04,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:59:04,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:59:08,459 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=502333.3333333333, ans=0.125 2023-09-29 21:59:09,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 21:59:11,213 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=502333.3333333333, ans=0.125 2023-09-29 21:59:12,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:59:16,144 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:59:16,246 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:59:16,273 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 21:59:16,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:59:16,299 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:59:18,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 21:59:21,608 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=502400.0, ans=0.0 2023-09-29 21:59:23,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:59:24,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:59:28,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:59:29,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 21:59:29,820 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 21:59:32,781 INFO [train.py:1039] (2/4) Epoch 15, batch 1000, loss[loss=0.2048, simple_loss=0.2682, pruned_loss=0.07064, over 23169.00 frames. ], tot_loss[loss=0.1898, simple_loss=0.2631, pruned_loss=0.05826, over 4687002.12 frames. ], batch size: 119, lr: 6.97e-03, grad_scale: 8.0 2023-09-29 21:59:33,760 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.74 vs. limit=15.0 2023-09-29 21:59:34,241 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.874e+02 2.213e+02 2.619e+02 3.676e+02, threshold=4.426e+02, percent-clipped=0.0 2023-09-29 21:59:34,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:59:37,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 21:59:38,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:59:45,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:59:47,153 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 21:59:47,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 21:59:53,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:59:53,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:59:54,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:59:58,301 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 22:00:00,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 22:00:01,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 22:00:01,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:00:04,751 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 22:00:06,395 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 22:00:06,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 22:00:07,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:00:09,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:00:17,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:00:17,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:00:17,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:00:20,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:00:20,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 22:00:20,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:00:20,254 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:00:21,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:00:21,910 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 22:00:24,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=502666.6666666667, ans=0.1 2023-09-29 22:00:25,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 22:00:27,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 22:00:30,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 22:00:32,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:00:39,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:00:39,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:00:39,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:00:40,108 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=502733.3333333333, ans=0.0 2023-09-29 22:00:42,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:00:42,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 22:00:44,391 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:00:45,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 22:00:45,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 22:00:47,462 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:00:47,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:00:51,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:00:54,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 22:00:56,345 INFO [train.py:1039] (2/4) Epoch 15, batch 1050, loss[loss=0.2025, simple_loss=0.2644, pruned_loss=0.07025, over 23799.00 frames. ], tot_loss[loss=0.1883, simple_loss=0.2605, pruned_loss=0.05806, over 4684210.73 frames. ], batch size: 195, lr: 6.97e-03, grad_scale: 8.0 2023-09-29 22:00:56,456 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:01:00,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:01:01,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:01:03,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 22:01:04,843 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:01:07,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:01:08,601 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.01 vs. limit=15.0 2023-09-29 22:01:10,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 22:01:12,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:01:14,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:01:15,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:01:15,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 22:01:15,748 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=502866.6666666667, ans=0.125 2023-09-29 22:01:17,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:01:17,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 22:01:17,336 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=502866.6666666667, ans=0.0 2023-09-29 22:01:18,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:01:18,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 22:01:20,558 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:01:20,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 22:01:20,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 22:01:27,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:01:29,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:01:29,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:01:33,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 22:01:34,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 22:01:34,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:01:36,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 22:01:38,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 22:01:38,723 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.39 vs. limit=6.0 2023-09-29 22:01:39,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:01:44,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 22:01:47,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 22:01:47,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:01:48,589 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:01:51,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:01:55,051 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 22:01:56,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 22:01:56,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 22:01:56,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:01:56,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:01:58,456 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=503000.0, ans=0.125 2023-09-29 22:02:00,156 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 22:02:05,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:02:07,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:02:07,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:02:08,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:02:08,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:02:13,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:02:13,383 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 22:02:14,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:02:14,994 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 22:02:15,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 22:02:16,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:02:17,878 INFO [train.py:1039] (2/4) Epoch 15, batch 1100, loss[loss=0.2149, simple_loss=0.2732, pruned_loss=0.07828, over 23740.00 frames. ], tot_loss[loss=0.1879, simple_loss=0.2601, pruned_loss=0.05787, over 4692372.98 frames. ], batch size: 164, lr: 6.97e-03, grad_scale: 8.0 2023-09-29 22:02:19,328 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.797e+02 2.092e+02 2.502e+02 4.130e+02, threshold=4.184e+02, percent-clipped=0.0 2023-09-29 22:02:19,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:02:25,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:02:30,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 22:02:31,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:02:31,666 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:02:33,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 22:02:35,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:02:37,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 22:02:39,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:02:43,230 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 22:02:44,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:02:44,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 22:02:47,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 22:02:47,659 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:02:47,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:02:47,849 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=503200.0, ans=0.0 2023-09-29 22:02:50,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:02:52,592 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 22:02:57,221 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:03:00,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 22:03:01,742 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 22:03:01,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:03:04,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:03:04,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 22:03:06,410 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:03:08,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 22:03:08,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:03:10,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:03:10,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:03:10,226 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:03:10,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 22:03:17,158 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:03:17,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 22:03:20,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 22:03:22,047 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=503333.3333333333, ans=0.1 2023-09-29 22:03:24,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 22:03:25,335 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=503400.0, ans=0.125 2023-09-29 22:03:26,507 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 22:03:26,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 22:03:28,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:03:32,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:03:32,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:03:33,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 22:03:34,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:03:35,507 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:03:37,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 22:03:37,071 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:03:37,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 22:03:38,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:03:38,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:03:40,048 INFO [train.py:1039] (2/4) Epoch 15, batch 1150, loss[loss=0.178, simple_loss=0.2529, pruned_loss=0.05152, over 24325.00 frames. ], tot_loss[loss=0.1876, simple_loss=0.2603, pruned_loss=0.05746, over 4709009.70 frames. ], batch size: 61, lr: 6.97e-03, grad_scale: 8.0 2023-09-29 22:03:40,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:03:43,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:03:46,288 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=503466.6666666667, ans=0.0 2023-09-29 22:03:48,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:03:50,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:03:50,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:03:50,632 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.46 vs. limit=12.0 2023-09-29 22:03:51,592 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 22:03:51,836 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=503466.6666666667, ans=0.2 2023-09-29 22:03:52,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:03:53,314 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=503466.6666666667, ans=0.125 2023-09-29 22:03:53,719 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.78 vs. limit=22.5 2023-09-29 22:03:56,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 22:03:57,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:03:57,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 22:04:05,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 22:04:06,916 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:04:09,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:04:10,061 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:04:10,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 22:04:10,171 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:04:10,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:04:13,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 22:04:14,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:04:16,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:04:16,759 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=503600.0, ans=0.125 2023-09-29 22:04:29,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:04:36,654 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:04:36,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 22:04:36,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:04:36,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:04:43,183 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 22:04:44,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:04:52,392 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 22:04:58,997 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:04:59,799 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:05:00,990 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 22:05:01,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:05:02,871 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=503800.0, ans=0.0 2023-09-29 22:05:03,924 INFO [train.py:1039] (2/4) Epoch 15, batch 1200, loss[loss=0.1923, simple_loss=0.2719, pruned_loss=0.05639, over 23473.00 frames. ], tot_loss[loss=0.188, simple_loss=0.2608, pruned_loss=0.0576, over 4700736.56 frames. ], batch size: 93, lr: 6.96e-03, grad_scale: 16.0 2023-09-29 22:05:05,385 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.402e+02 1.796e+02 2.044e+02 2.374e+02 3.909e+02, threshold=4.087e+02, percent-clipped=0.0 2023-09-29 22:05:05,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:05:07,389 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=503800.0, ans=0.2 2023-09-29 22:05:08,819 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=503800.0, ans=0.0 2023-09-29 22:05:11,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:05:11,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:05:14,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:05:14,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:05:14,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:05:16,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:05:18,065 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 22:05:19,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:05:19,933 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=503866.6666666667, ans=0.1 2023-09-29 22:05:21,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:05:21,482 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=503866.6666666667, ans=0.2 2023-09-29 22:05:22,693 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 22:05:24,336 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 22:05:27,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:05:30,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:05:33,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:05:36,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:05:37,002 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 22:05:38,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:05:44,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 22:05:44,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:05:46,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 22:05:46,208 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:05:49,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 22:05:55,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 22:05:55,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:05:57,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:05:58,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:05:59,847 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.46 vs. limit=15.0 2023-09-29 22:06:00,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 22:06:01,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:06:01,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 22:06:03,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:06:03,254 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 22:06:05,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 22:06:05,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:06:05,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 22:06:06,982 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:06:06,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:06:13,940 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 22:06:14,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:06:14,486 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=504066.6666666667, ans=0.1 2023-09-29 22:06:17,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 22:06:21,953 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 22:06:23,595 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:06:24,964 INFO [train.py:1039] (2/4) Epoch 15, batch 1250, loss[loss=0.1709, simple_loss=0.2512, pruned_loss=0.04532, over 24298.00 frames. ], tot_loss[loss=0.1888, simple_loss=0.2619, pruned_loss=0.05789, over 4704588.40 frames. ], batch size: 61, lr: 6.96e-03, grad_scale: 16.0 2023-09-29 22:06:26,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:06:29,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:06:29,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:06:34,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 22:06:35,863 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=504133.3333333333, ans=0.125 2023-09-29 22:06:37,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:06:39,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:06:40,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 22:06:42,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:06:42,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 22:06:48,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 22:06:49,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:06:51,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:06:51,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:06:52,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 22:06:53,176 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=504200.0, ans=0.125 2023-09-29 22:06:57,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 22:06:57,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 22:06:57,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:06:59,172 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:06:59,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:07:02,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:07:03,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 22:07:08,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 22:07:08,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:07:11,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:07:11,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 22:07:13,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:07:13,243 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 22:07:13,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:07:13,281 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:07:19,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:07:22,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:07:23,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:07:25,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 22:07:25,422 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 22:07:25,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 22:07:28,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:07:30,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 22:07:30,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:07:33,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 22:07:33,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:07:34,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 22:07:34,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 22:07:35,030 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=504400.0, ans=0.2 2023-09-29 22:07:36,199 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:07:36,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 22:07:37,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:07:37,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 22:07:40,962 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:07:42,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:07:42,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 22:07:46,060 INFO [train.py:1039] (2/4) Epoch 15, batch 1300, loss[loss=0.1856, simple_loss=0.243, pruned_loss=0.06412, over 22800.00 frames. ], tot_loss[loss=0.1885, simple_loss=0.2618, pruned_loss=0.05759, over 4715493.52 frames. ], batch size: 322, lr: 6.96e-03, grad_scale: 16.0 2023-09-29 22:07:46,263 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 22:07:48,109 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.942e+02 2.297e+02 2.853e+02 4.160e+02, threshold=4.593e+02, percent-clipped=1.0 2023-09-29 22:07:50,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:07:50,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 22:07:55,532 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:07:55,799 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 22:07:58,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 22:07:59,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:08:00,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:08:00,153 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:08:01,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 22:08:05,434 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.40 vs. limit=15.0 2023-09-29 22:08:07,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:08:09,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 22:08:10,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 22:08:12,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 22:08:17,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:08:17,322 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=504600.0, ans=0.0 2023-09-29 22:08:19,157 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:08:20,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:08:22,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:08:22,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 22:08:24,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 22:08:24,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 22:08:24,804 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=504600.0, ans=0.2 2023-09-29 22:08:26,039 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=504600.0, ans=0.125 2023-09-29 22:08:30,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:08:30,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 22:08:31,988 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 22:08:32,095 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 22:08:33,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:08:38,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:08:38,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 22:08:38,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:08:38,222 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 22:08:39,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:08:42,970 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:08:42,974 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:08:47,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 22:08:49,709 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 22:08:51,795 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 22:08:56,348 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:08:58,653 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 22:09:01,586 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:09:07,788 INFO [train.py:1039] (2/4) Epoch 15, batch 1350, loss[loss=0.201, simple_loss=0.2653, pruned_loss=0.06832, over 23827.00 frames. ], tot_loss[loss=0.1887, simple_loss=0.2618, pruned_loss=0.05783, over 4712735.74 frames. ], batch size: 164, lr: 6.96e-03, grad_scale: 16.0 2023-09-29 22:09:07,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 22:09:09,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:09:11,862 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.80 vs. limit=15.0 2023-09-29 22:09:12,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:09:14,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=504800.0, ans=0.125 2023-09-29 22:09:15,670 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:09:17,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:09:18,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:09:18,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 22:09:22,190 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=504866.6666666667, ans=0.125 2023-09-29 22:09:26,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 22:09:26,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 22:09:27,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 22:09:29,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:09:33,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 22:09:33,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:09:35,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:09:35,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 22:09:36,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 22:09:39,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 22:09:41,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:09:41,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 22:09:46,214 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=504933.3333333333, ans=0.1 2023-09-29 22:09:53,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:10:02,940 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.18 vs. limit=10.0 2023-09-29 22:10:03,013 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.09 vs. limit=12.0 2023-09-29 22:10:03,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:10:03,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:10:03,975 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 22:10:08,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:10:10,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 22:10:10,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 22:10:11,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:10:14,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:10:16,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 22:10:17,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:10:19,791 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=505066.6666666667, ans=0.125 2023-09-29 22:10:22,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 22:10:24,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 22:10:29,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 22:10:29,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:10:31,214 INFO [train.py:1039] (2/4) Epoch 15, batch 1400, loss[loss=0.1637, simple_loss=0.2279, pruned_loss=0.04978, over 23465.00 frames. ], tot_loss[loss=0.188, simple_loss=0.261, pruned_loss=0.05753, over 4713955.10 frames. ], batch size: 285, lr: 6.96e-03, grad_scale: 16.0 2023-09-29 22:10:33,172 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.906e+02 2.114e+02 2.329e+02 4.269e+02, threshold=4.227e+02, percent-clipped=0.0 2023-09-29 22:10:34,865 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:10:35,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:10:43,882 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 22:10:44,125 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=505133.3333333333, ans=0.125 2023-09-29 22:10:45,480 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 22:10:53,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:10:55,472 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=505200.0, ans=0.0 2023-09-29 22:10:56,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:10:59,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:10:59,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 22:11:02,729 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:11:04,311 WARNING [train.py:1197] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 22:11:14,278 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:11:15,721 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:11:20,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 22:11:22,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:11:23,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:11:25,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:11:25,162 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:11:27,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:11:27,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:11:28,066 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:11:28,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 22:11:29,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:11:33,222 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=505333.3333333333, ans=0.0 2023-09-29 22:11:34,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:11:37,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:11:43,214 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 22:11:44,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 22:11:44,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:11:47,247 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=505400.0, ans=0.1 2023-09-29 22:11:50,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 22:11:51,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:11:53,635 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:11:54,999 INFO [train.py:1039] (2/4) Epoch 15, batch 1450, loss[loss=0.1876, simple_loss=0.2596, pruned_loss=0.05777, over 24489.00 frames. ], tot_loss[loss=0.1874, simple_loss=0.26, pruned_loss=0.05744, over 4697352.19 frames. ], batch size: 63, lr: 6.95e-03, grad_scale: 16.0 2023-09-29 22:11:56,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:12:01,073 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:12:01,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:12:01,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 22:12:05,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:12:07,377 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 22:12:07,663 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 22:12:08,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:12:08,977 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 22:12:10,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 22:12:10,787 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=505533.3333333333, ans=0.125 2023-09-29 22:12:11,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 22:12:12,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:12:13,025 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.36 vs. limit=15.0 2023-09-29 22:12:13,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:12:13,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 22:12:13,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:12:15,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 22:12:16,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 22:12:16,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:12:18,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:12:20,954 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:12:23,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:12:26,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:12:27,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:12:29,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:12:29,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:12:30,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:12:30,989 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:12:31,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:12:31,537 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.24 vs. limit=15.0 2023-09-29 22:12:32,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:12:36,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 22:12:39,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:12:44,367 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 22:12:45,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:12:46,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:12:47,621 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:12:49,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 22:12:51,481 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.95 vs. limit=22.5 2023-09-29 22:12:53,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:12:55,078 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=505666.6666666667, ans=0.125 2023-09-29 22:12:56,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 22:12:59,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 22:13:00,694 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:13:03,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:13:05,107 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:13:06,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 22:13:08,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 22:13:08,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 22:13:09,976 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:13:10,363 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=505733.3333333333, ans=0.125 2023-09-29 22:13:11,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 22:13:16,231 INFO [train.py:1039] (2/4) Epoch 15, batch 1500, loss[loss=0.2045, simple_loss=0.2674, pruned_loss=0.07081, over 23837.00 frames. ], tot_loss[loss=0.1878, simple_loss=0.2609, pruned_loss=0.05735, over 4707258.48 frames. ], batch size: 179, lr: 6.95e-03, grad_scale: 16.0 2023-09-29 22:13:17,606 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.853e+02 2.114e+02 2.421e+02 4.526e+02, threshold=4.227e+02, percent-clipped=1.0 2023-09-29 22:13:21,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 22:13:22,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:13:22,473 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:13:22,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:13:24,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:13:26,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:13:27,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 22:13:29,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:13:29,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 22:13:29,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:13:31,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:13:31,496 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=505866.6666666667, ans=0.09899494936611666 2023-09-29 22:13:32,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:13:34,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:13:38,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:13:38,985 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 22:13:39,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 22:13:40,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:13:40,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:13:43,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 22:13:48,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 22:13:49,715 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:13:51,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 22:13:54,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 22:13:57,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:13:57,138 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:13:57,159 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:13:58,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 22:13:58,732 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:13:58,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:14:00,798 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 22:14:02,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:14:06,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:14:06,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 22:14:07,024 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=506000.0, ans=0.2 2023-09-29 22:14:12,805 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 22:14:14,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 22:14:20,419 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 22:14:20,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:14:20,492 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 22:14:20,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:14:22,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:14:23,640 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 22:14:25,117 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 22:14:29,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 22:14:31,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:14:34,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:14:34,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:14:34,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:14:34,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:14:35,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:14:36,450 INFO [train.py:1039] (2/4) Epoch 15, batch 1550, loss[loss=0.1523, simple_loss=0.2276, pruned_loss=0.03846, over 24323.00 frames. ], tot_loss[loss=0.1881, simple_loss=0.2616, pruned_loss=0.05733, over 4708632.20 frames. ], batch size: 56, lr: 6.95e-03, grad_scale: 16.0 2023-09-29 22:14:36,718 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 22:14:38,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 22:14:38,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:14:39,761 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 22:14:39,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 22:14:43,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:14:45,050 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:14:46,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:14:46,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:14:48,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:14:48,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:14:52,819 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 22:14:54,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:14:54,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:14:54,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 22:14:57,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 22:14:57,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 22:14:58,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:14:59,036 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 22:15:01,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 22:15:01,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 22:15:02,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:15:03,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:15:03,912 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=506200.0, ans=0.1 2023-09-29 22:15:05,300 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=506200.0, ans=0.2 2023-09-29 22:15:08,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:15:11,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 22:15:11,320 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 22:15:19,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:15:22,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:15:24,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 22:15:24,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:15:25,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 22:15:30,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 22:15:31,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:15:34,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:15:35,263 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=506333.3333333333, ans=0.125 2023-09-29 22:15:36,769 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=506333.3333333333, ans=0.2 2023-09-29 22:15:37,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:15:39,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:15:39,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 22:15:39,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 22:15:40,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:15:40,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:15:42,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 22:15:42,421 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 22:15:47,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:15:52,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 22:15:57,913 INFO [train.py:1039] (2/4) Epoch 15, batch 1600, loss[loss=0.1657, simple_loss=0.243, pruned_loss=0.04417, over 24674.00 frames. ], tot_loss[loss=0.188, simple_loss=0.2618, pruned_loss=0.05708, over 4725306.35 frames. ], batch size: 65, lr: 6.95e-03, grad_scale: 32.0 2023-09-29 22:15:59,415 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.908e+02 2.211e+02 2.589e+02 3.896e+02, threshold=4.422e+02, percent-clipped=0.0 2023-09-29 22:15:59,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:15:59,967 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 22:16:01,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:16:01,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 22:16:02,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 22:16:02,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:16:02,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:16:02,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:16:04,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:16:06,257 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=506466.6666666667, ans=0.125 2023-09-29 22:16:07,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:16:09,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 22:16:10,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 22:16:12,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 22:16:15,162 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:16:15,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 22:16:16,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:16:19,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:16:23,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:16:25,527 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=506533.3333333333, ans=0.125 2023-09-29 22:16:28,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 22:16:28,512 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 22:16:31,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:16:33,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 22:16:34,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:16:34,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 22:16:37,144 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.21 vs. limit=15.0 2023-09-29 22:16:39,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 22:16:45,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:16:45,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 22:16:50,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:16:50,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:16:50,961 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:16:52,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 22:16:57,497 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=506666.6666666667, ans=0.0 2023-09-29 22:16:58,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 22:17:00,148 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:17:00,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:17:00,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:17:01,728 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:17:05,348 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:17:06,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:17:06,994 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:17:14,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:17:14,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:17:17,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 22:17:17,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:17:17,706 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 22:17:23,518 INFO [train.py:1039] (2/4) Epoch 15, batch 1650, loss[loss=0.1737, simple_loss=0.2637, pruned_loss=0.04185, over 24429.00 frames. ], tot_loss[loss=0.1889, simple_loss=0.2626, pruned_loss=0.05758, over 4720883.76 frames. ], batch size: 69, lr: 6.94e-03, grad_scale: 32.0 2023-09-29 22:17:24,081 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=506800.0, ans=0.125 2023-09-29 22:17:25,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:17:25,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:17:27,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:17:27,265 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 22:17:27,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 22:17:27,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 22:17:27,658 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=506800.0, ans=0.125 2023-09-29 22:17:28,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 22:17:29,293 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=506800.0, ans=0.125 2023-09-29 22:17:32,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:17:33,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:17:34,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:17:34,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 22:17:37,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:17:39,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 22:17:44,463 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:17:44,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:17:44,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:17:44,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:17:45,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 22:17:45,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 22:17:53,453 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 22:17:53,666 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=506866.6666666667, ans=0.125 2023-09-29 22:17:55,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 22:18:03,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 22:18:03,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:18:07,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 22:18:10,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:18:12,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:18:13,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:18:14,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:18:14,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:18:14,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:18:17,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:18:19,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:18:19,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:18:19,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:18:20,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:18:21,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 22:18:24,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:18:26,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 22:18:27,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:18:27,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 22:18:27,432 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=507000.0, ans=0.05 2023-09-29 22:18:28,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 22:18:28,694 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 22:18:28,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:18:29,046 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=507066.6666666667, ans=0.0 2023-09-29 22:18:30,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:18:30,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:18:30,980 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.63 vs. limit=6.0 2023-09-29 22:18:31,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:18:31,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 22:18:35,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:18:38,594 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:18:38,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:18:41,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 22:18:42,511 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=507066.6666666667, ans=0.125 2023-09-29 22:18:47,117 INFO [train.py:1039] (2/4) Epoch 15, batch 1700, loss[loss=0.19, simple_loss=0.2546, pruned_loss=0.06265, over 23746.00 frames. ], tot_loss[loss=0.1884, simple_loss=0.2622, pruned_loss=0.05729, over 4727381.88 frames. ], batch size: 164, lr: 6.94e-03, grad_scale: 32.0 2023-09-29 22:18:48,608 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.631e+02 1.972e+02 2.169e+02 2.497e+02 4.927e+02, threshold=4.339e+02, percent-clipped=2.0 2023-09-29 22:18:48,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:18:48,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:18:48,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 22:18:50,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:18:50,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:18:51,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:18:54,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:18:54,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:18:54,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 22:18:57,714 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 22:19:02,537 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=507200.0, ans=0.125 2023-09-29 22:19:05,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:19:06,150 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.52 vs. limit=15.0 2023-09-29 22:19:07,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:19:14,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 22:19:14,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:19:14,630 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:19:14,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:19:19,623 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 22:19:21,282 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:19:21,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:19:22,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:19:24,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 22:19:26,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 22:19:26,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 22:19:27,622 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:19:29,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 22:19:30,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:19:38,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:19:38,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:19:40,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:19:41,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 22:19:41,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 22:19:41,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:19:45,850 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:19:45,851 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 22:19:45,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:19:45,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:19:47,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:19:47,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:19:50,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:19:50,107 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:19:51,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:19:51,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:19:51,775 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:19:54,931 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=507400.0, ans=0.0 2023-09-29 22:19:56,352 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:19:57,896 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 22:20:00,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:20:00,959 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:20:01,253 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=507400.0, ans=0.0 2023-09-29 22:20:02,705 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=507400.0, ans=0.0 2023-09-29 22:20:04,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 22:20:08,729 INFO [train.py:1039] (2/4) Epoch 15, batch 1750, loss[loss=0.1712, simple_loss=0.2406, pruned_loss=0.05087, over 23738.00 frames. ], tot_loss[loss=0.1875, simple_loss=0.2612, pruned_loss=0.05693, over 4723144.99 frames. ], batch size: 232, lr: 6.94e-03, grad_scale: 32.0 2023-09-29 22:20:08,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:20:11,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:20:11,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 22:20:14,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 22:20:14,102 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:20:17,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:20:18,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:20:23,162 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=507466.6666666667, ans=0.0 2023-09-29 22:20:24,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 22:20:24,809 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=507533.3333333333, ans=0.125 2023-09-29 22:20:26,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:20:28,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 22:20:28,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:20:29,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:20:34,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 22:20:35,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 22:20:37,401 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:20:37,439 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 22:20:46,521 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:20:50,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:20:50,302 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:20:53,560 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:20:53,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:20:55,798 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:20:57,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:21:00,712 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:21:02,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:21:02,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 22:21:03,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:21:07,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 22:21:07,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:21:10,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:21:11,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:21:16,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 22:21:16,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 22:21:16,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:21:19,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:21:19,484 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_positive, batch_count=507733.3333333333, ans=0.05 2023-09-29 22:21:22,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:21:25,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:21:27,344 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:21:29,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 22:21:29,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:21:30,900 INFO [train.py:1039] (2/4) Epoch 15, batch 1800, loss[loss=0.1856, simple_loss=0.2555, pruned_loss=0.05786, over 23623.00 frames. ], tot_loss[loss=0.1872, simple_loss=0.2608, pruned_loss=0.0568, over 4733842.07 frames. ], batch size: 149, lr: 6.94e-03, grad_scale: 16.0 2023-09-29 22:21:30,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 22:21:30,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:21:31,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:21:31,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:21:31,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 22:21:34,570 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.599e+02 1.864e+02 2.038e+02 2.350e+02 3.855e+02, threshold=4.075e+02, percent-clipped=0.0 2023-09-29 22:21:34,875 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:21:36,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:21:37,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 22:21:38,298 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=507800.0, ans=0.0 2023-09-29 22:21:40,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:21:41,490 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer_na.min_abs, batch_count=507800.0, ans=0.02 2023-09-29 22:21:42,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 22:21:45,506 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:21:48,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:21:51,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:21:51,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:21:53,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:21:54,678 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:21:54,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 22:21:56,674 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:21:59,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:22:03,639 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 22:22:07,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 22:22:07,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 22:22:07,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:22:08,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:22:08,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:22:10,989 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:22:11,255 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=507933.3333333333, ans=0.2 2023-09-29 22:22:17,334 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 22:22:17,486 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:22:19,428 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=508000.0, ans=0.1 2023-09-29 22:22:20,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:22:20,863 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=508000.0, ans=0.125 2023-09-29 22:22:22,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 22:22:23,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 22:22:23,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:22:25,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:22:27,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 22:22:33,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 22:22:38,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:22:39,783 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=508066.6666666667, ans=0.2 2023-09-29 22:22:40,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 22:22:42,290 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:22:42,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:22:42,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:22:43,857 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 22:22:45,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:22:45,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:22:49,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 22:22:49,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:22:50,976 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:22:51,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 22:22:51,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:22:51,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:22:52,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:22:54,102 INFO [train.py:1039] (2/4) Epoch 15, batch 1850, loss[loss=0.2111, simple_loss=0.2757, pruned_loss=0.07324, over 23597.00 frames. ], tot_loss[loss=0.1871, simple_loss=0.2606, pruned_loss=0.05682, over 4738626.04 frames. ], batch size: 256, lr: 6.94e-03, grad_scale: 16.0 2023-09-29 22:22:55,666 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:22:55,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:22:58,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:23:00,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:23:08,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:23:08,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 22:23:15,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 22:23:18,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 22:23:20,544 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=508200.0, ans=0.07 2023-09-29 22:23:23,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:23:23,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 22:23:23,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 22:23:30,511 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=508266.6666666667, ans=0.125 2023-09-29 22:23:32,064 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=508266.6666666667, ans=0.1 2023-09-29 22:23:33,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:23:33,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 22:23:35,206 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=508266.6666666667, ans=0.125 2023-09-29 22:23:36,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:23:37,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:23:41,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 22:23:41,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:23:41,236 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 22:23:43,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:23:45,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:23:47,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:23:47,196 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=508333.3333333333, ans=0.125 2023-09-29 22:23:51,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 22:23:51,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:23:52,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 22:23:52,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:23:52,474 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=508333.3333333333, ans=0.125 2023-09-29 22:23:53,723 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:23:55,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:23:56,158 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.53 vs. limit=12.0 2023-09-29 22:23:58,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 22:24:00,341 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:24:03,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:24:05,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:24:05,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 22:24:05,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 22:24:06,713 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 22:24:08,175 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 22:24:09,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 22:24:09,756 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:24:09,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:24:09,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:24:09,929 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 22:24:09,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 22:24:11,408 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:24:11,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 22:24:15,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 22:24:17,056 INFO [train.py:1039] (2/4) Epoch 15, batch 1900, loss[loss=0.1813, simple_loss=0.2671, pruned_loss=0.04775, over 24467.00 frames. ], tot_loss[loss=0.1882, simple_loss=0.2615, pruned_loss=0.05744, over 4736494.58 frames. ], batch size: 66, lr: 6.93e-03, grad_scale: 16.0 2023-09-29 22:24:17,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:24:17,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 22:24:18,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:24:18,863 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 22:24:18,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:24:20,830 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 2.037e+02 2.291e+02 2.918e+02 4.608e+02, threshold=4.583e+02, percent-clipped=3.0 2023-09-29 22:24:20,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:24:23,266 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.59 vs. limit=22.5 2023-09-29 22:24:25,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:24:28,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:24:29,942 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 22:24:30,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 22:24:33,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:24:33,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:24:34,993 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 22:24:35,051 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 22:24:39,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 22:24:41,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:24:44,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 22:24:45,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 22:24:45,941 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=508533.3333333333, ans=0.0 2023-09-29 22:24:46,425 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.34 vs. limit=22.5 2023-09-29 22:24:47,502 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_positive, batch_count=508600.0, ans=0.05 2023-09-29 22:24:47,584 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=508600.0, ans=0.1 2023-09-29 22:24:59,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 22:25:01,803 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.28 vs. limit=15.0 2023-09-29 22:25:02,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 22:25:02,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:25:02,771 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 22:25:02,778 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 22:25:03,089 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=508600.0, ans=0.125 2023-09-29 22:25:04,238 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 22:25:04,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 22:25:04,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:25:08,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 22:25:09,285 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=508666.6666666667, ans=0.1 2023-09-29 22:25:12,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:25:16,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:25:16,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 22:25:17,184 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=508666.6666666667, ans=0.1 2023-09-29 22:25:18,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 22:25:21,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 22:25:21,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 22:25:28,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:25:28,515 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:25:28,535 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:25:30,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:25:32,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 22:25:32,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 22:25:33,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:25:34,875 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.94 vs. limit=15.0 2023-09-29 22:25:35,576 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=508733.3333333333, ans=0.125 2023-09-29 22:25:36,749 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:25:36,751 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:25:38,208 INFO [train.py:1039] (2/4) Epoch 15, batch 1950, loss[loss=0.2325, simple_loss=0.2919, pruned_loss=0.08656, over 22688.00 frames. ], tot_loss[loss=0.1897, simple_loss=0.2626, pruned_loss=0.05843, over 4730158.62 frames. ], batch size: 322, lr: 6.93e-03, grad_scale: 16.0 2023-09-29 22:25:39,791 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:25:39,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:25:41,253 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 22:25:41,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:25:44,575 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:25:47,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:25:47,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:25:47,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 22:25:49,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 22:25:51,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 22:25:51,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:25:53,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:25:54,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:25:56,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:25:56,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:25:58,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:25:59,966 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 22:26:01,772 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:26:01,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 22:26:01,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:26:01,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:26:06,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:26:07,003 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=508866.6666666667, ans=0.035 2023-09-29 22:26:08,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:26:08,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:26:08,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 22:26:08,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 22:26:09,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 22:26:09,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:26:10,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:26:13,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:26:16,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:26:22,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 22:26:25,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:26:25,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 22:26:25,930 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 22:26:25,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:26:32,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:26:33,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:26:35,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 22:26:42,577 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:26:44,053 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:26:46,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:26:49,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:26:51,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:26:52,979 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:26:53,073 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 22:26:53,081 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 22:26:53,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:26:53,725 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.42 vs. limit=22.5 2023-09-29 22:26:56,151 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 22:26:58,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:26:59,705 INFO [train.py:1039] (2/4) Epoch 15, batch 2000, loss[loss=0.1804, simple_loss=0.271, pruned_loss=0.0449, over 24653.00 frames. ], tot_loss[loss=0.1898, simple_loss=0.263, pruned_loss=0.05827, over 4731050.39 frames. ], batch size: 73, lr: 6.93e-03, grad_scale: 32.0 2023-09-29 22:27:01,614 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=509133.3333333333, ans=0.125 2023-09-29 22:27:02,736 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.604e+02 1.867e+02 2.115e+02 2.554e+02 3.825e+02, threshold=4.229e+02, percent-clipped=0.0 2023-09-29 22:27:02,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 22:27:05,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:27:05,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:27:06,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:27:09,693 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:27:13,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 22:27:13,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 22:27:18,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:27:21,039 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 22:27:21,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 22:27:21,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:27:24,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:27:25,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 22:27:27,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:27:27,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:27:27,767 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=509200.0, ans=0.0 2023-09-29 22:27:27,873 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=1.603e-02 2023-09-29 22:27:28,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:27:29,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 22:27:29,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 22:27:31,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 22:27:31,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:27:35,768 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:27:37,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 22:27:37,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:27:39,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:27:39,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:27:40,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 22:27:44,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 22:27:44,023 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:27:44,048 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:27:50,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:27:51,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:27:51,694 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:27:53,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:27:53,915 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.86 vs. limit=10.0 2023-09-29 22:27:54,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:27:54,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:27:56,236 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:27:56,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:27:57,760 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:28:00,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:28:02,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 22:28:07,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 22:28:09,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:28:12,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:28:12,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:28:16,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:28:18,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:28:18,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:28:19,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 22:28:19,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 22:28:23,138 INFO [train.py:1039] (2/4) Epoch 15, batch 2050, loss[loss=0.1895, simple_loss=0.2567, pruned_loss=0.0611, over 23887.00 frames. ], tot_loss[loss=0.1897, simple_loss=0.2623, pruned_loss=0.05856, over 4718354.28 frames. ], batch size: 195, lr: 6.93e-03, grad_scale: 32.0 2023-09-29 22:28:23,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:28:24,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:28:27,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:28:27,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:28:31,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:28:34,277 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:28:34,368 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:28:34,464 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:28:34,676 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=509466.6666666667, ans=0.1 2023-09-29 22:28:37,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 22:28:37,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:28:38,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:28:38,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 22:28:50,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:28:50,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:28:54,270 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 22:28:55,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:28:57,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 22:28:57,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:28:59,243 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=509600.0, ans=0.1 2023-09-29 22:29:00,660 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=509600.0, ans=0.0 2023-09-29 22:29:02,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:29:04,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:29:06,603 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 22:29:06,697 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:29:08,224 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:29:09,729 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:29:11,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 22:29:14,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:29:16,030 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:29:17,686 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 22:29:20,628 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.23 vs. limit=22.5 2023-09-29 22:29:21,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:29:23,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:29:31,397 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:29:32,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 22:29:36,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:29:37,858 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:29:40,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:29:42,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 22:29:43,639 INFO [train.py:1039] (2/4) Epoch 15, batch 2100, loss[loss=0.1975, simple_loss=0.2851, pruned_loss=0.05497, over 24340.00 frames. ], tot_loss[loss=0.1882, simple_loss=0.261, pruned_loss=0.05765, over 4709770.57 frames. ], batch size: 74, lr: 6.92e-03, grad_scale: 32.0 2023-09-29 22:29:45,600 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 22:29:45,601 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:29:45,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:29:46,892 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.817e+02 2.090e+02 2.571e+02 3.864e+02, threshold=4.179e+02, percent-clipped=0.0 2023-09-29 22:29:47,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 22:29:48,578 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:29:48,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 22:29:48,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 22:29:50,267 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:29:55,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:29:56,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:29:57,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:29:59,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:29:59,369 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 22:30:01,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:30:01,554 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 22:30:01,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 22:30:04,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:30:04,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:30:04,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 22:30:04,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 22:30:09,430 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 22:30:09,431 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 22:30:13,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:30:14,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:30:18,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:30:18,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 22:30:20,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:30:20,017 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 22:30:21,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 22:30:21,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:30:21,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 22:30:21,826 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 22:30:23,247 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 22:30:25,444 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.30 vs. limit=15.0 2023-09-29 22:30:26,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 22:30:30,355 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:30:32,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 22:30:33,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 22:30:35,287 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.29 vs. limit=12.0 2023-09-29 22:30:35,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:30:36,125 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=510000.0, ans=0.125 2023-09-29 22:30:37,297 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:30:37,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 22:30:37,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:30:37,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:30:37,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:30:37,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 22:30:40,341 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 22:30:40,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 22:30:43,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:30:46,745 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:30:46,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 22:30:51,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:30:54,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:30:56,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:30:56,036 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:30:56,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 22:30:56,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 22:30:57,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:30:57,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:30:57,969 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=510066.6666666667, ans=0.125 2023-09-29 22:30:59,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:30:59,822 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:31:03,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 22:31:07,264 INFO [train.py:1039] (2/4) Epoch 15, batch 2150, loss[loss=0.1929, simple_loss=0.265, pruned_loss=0.06038, over 23415.00 frames. ], tot_loss[loss=0.1875, simple_loss=0.2599, pruned_loss=0.05753, over 4702566.34 frames. ], batch size: 119, lr: 6.92e-03, grad_scale: 32.0 2023-09-29 22:31:07,339 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 22:31:07,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:31:08,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:31:08,988 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:31:09,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:31:09,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:31:10,975 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=510133.3333333333, ans=0.0 2023-09-29 22:31:15,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 22:31:18,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:31:18,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:31:20,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:31:20,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:31:20,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:31:23,268 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:31:24,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:31:24,700 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:31:27,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:31:27,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 22:31:29,428 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=510200.0, ans=0.1 2023-09-29 22:31:32,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:31:34,205 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:31:36,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:31:36,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:31:36,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:31:38,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 22:31:38,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:31:38,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:31:40,556 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:31:42,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 22:31:43,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:31:43,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:31:43,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=510266.6666666667, ans=0.0 2023-09-29 22:31:45,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:31:45,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 22:31:45,509 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 22:31:46,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:31:48,630 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:31:50,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 22:31:50,550 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=510266.6666666667, ans=0.0 2023-09-29 22:31:51,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:31:51,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 22:31:51,752 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 22:31:53,837 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.75 vs. limit=15.0 2023-09-29 22:31:54,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:31:56,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:31:57,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:31:59,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 22:31:59,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:32:00,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:32:00,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 22:32:02,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 22:32:02,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:32:03,757 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 22:32:03,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:32:03,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:32:07,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 22:32:07,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:32:07,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 22:32:07,336 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 22:32:07,337 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 22:32:07,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 22:32:11,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:32:11,122 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:32:12,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:32:12,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:32:14,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 22:32:14,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:32:14,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:32:15,204 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=510400.0, ans=0.125 2023-09-29 22:32:22,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:32:24,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 22:32:25,753 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:32:28,651 INFO [train.py:1039] (2/4) Epoch 15, batch 2200, loss[loss=0.1876, simple_loss=0.2526, pruned_loss=0.06132, over 23898.00 frames. ], tot_loss[loss=0.1879, simple_loss=0.2608, pruned_loss=0.05744, over 4708704.79 frames. ], batch size: 195, lr: 6.92e-03, grad_scale: 32.0 2023-09-29 22:32:31,712 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.640e+02 1.917e+02 2.133e+02 2.422e+02 4.121e+02, threshold=4.265e+02, percent-clipped=0.0 2023-09-29 22:32:31,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:32:31,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:32:31,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:32:33,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:32:35,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:32:35,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:32:35,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 22:32:37,526 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=510466.6666666667, ans=0.1 2023-09-29 22:32:38,961 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=510466.6666666667, ans=0.1 2023-09-29 22:32:42,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 22:32:45,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 22:32:48,528 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=510533.3333333333, ans=0.125 2023-09-29 22:32:49,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 22:32:52,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:32:53,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:32:54,454 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:33:00,234 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:33:00,281 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 22:33:04,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 22:33:06,407 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:33:06,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 22:33:10,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 22:33:11,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:33:12,007 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=510600.0, ans=0.125 2023-09-29 22:33:14,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:33:14,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:33:16,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 22:33:18,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:33:20,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 22:33:22,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:33:23,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 22:33:23,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:33:26,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:33:28,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:33:28,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:33:28,305 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:33:29,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 22:33:30,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:33:33,002 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 22:33:33,433 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=510733.3333333333, ans=0.0 2023-09-29 22:33:36,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 22:33:37,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:33:39,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 22:33:40,688 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 22:33:42,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 22:33:42,393 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 22:33:43,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 22:33:45,321 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 22:33:47,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:33:47,653 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 22:33:50,561 INFO [train.py:1039] (2/4) Epoch 15, batch 2250, loss[loss=0.1989, simple_loss=0.2681, pruned_loss=0.06481, over 23240.00 frames. ], tot_loss[loss=0.1891, simple_loss=0.2618, pruned_loss=0.05821, over 4701913.33 frames. ], batch size: 119, lr: 6.92e-03, grad_scale: 32.0 2023-09-29 22:33:50,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:33:52,579 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 22:33:54,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:33:56,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:33:57,437 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=510800.0, ans=0.125 2023-09-29 22:34:01,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:34:03,171 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:34:03,424 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=510800.0, ans=0.125 2023-09-29 22:34:07,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:34:07,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:34:07,994 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=510866.6666666667, ans=0.125 2023-09-29 22:34:09,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:34:09,556 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=510866.6666666667, ans=0.125 2023-09-29 22:34:10,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 22:34:11,005 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=510866.6666666667, ans=0.125 2023-09-29 22:34:12,156 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:34:12,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:34:12,552 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=510866.6666666667, ans=0.125 2023-09-29 22:34:13,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 22:34:15,325 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:34:15,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:34:17,051 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:34:20,393 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=510866.6666666667, ans=0.125 2023-09-29 22:34:23,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:34:25,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 22:34:25,182 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 22:34:26,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 22:34:26,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:34:30,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:34:32,735 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=510933.3333333333, ans=0.125 2023-09-29 22:34:34,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:34:35,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:34:38,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:34:38,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:34:40,434 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=511000.0, ans=0.1 2023-09-29 22:34:41,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:34:41,819 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=511000.0, ans=0.1 2023-09-29 22:34:43,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:34:44,934 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=511000.0, ans=0.0 2023-09-29 22:34:47,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:34:50,857 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 22:34:56,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 22:34:56,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 22:34:58,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:35:04,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 22:35:08,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 22:35:08,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 22:35:08,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:35:08,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:35:11,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 22:35:12,806 INFO [train.py:1039] (2/4) Epoch 15, batch 2300, loss[loss=0.2032, simple_loss=0.2766, pruned_loss=0.06493, over 23555.00 frames. ], tot_loss[loss=0.1897, simple_loss=0.2627, pruned_loss=0.05836, over 4713854.14 frames. ], batch size: 94, lr: 6.91e-03, grad_scale: 8.0 2023-09-29 22:35:14,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:35:15,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:35:19,134 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.881e+02 2.170e+02 2.531e+02 3.802e+02, threshold=4.341e+02, percent-clipped=0.0 2023-09-29 22:35:20,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:35:22,189 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:35:23,865 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 22:35:25,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:35:26,035 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2.whitening_limit, batch_count=511133.3333333333, ans=15.0 2023-09-29 22:35:32,877 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:35:32,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 22:35:32,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:35:33,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:35:33,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 22:35:35,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:35:40,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:35:40,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:35:45,218 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 22:35:48,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 22:35:51,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:35:57,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:35:58,684 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:36:00,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:36:03,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:36:07,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:36:08,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 22:36:08,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:36:08,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 22:36:12,214 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 22:36:12,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:36:13,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:36:13,635 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:36:13,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:36:15,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 22:36:15,303 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 22:36:15,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 22:36:15,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:36:15,410 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:36:15,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 22:36:21,563 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:36:25,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:36:30,565 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:36:30,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:36:30,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 22:36:32,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 22:36:32,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:36:32,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 22:36:33,774 INFO [train.py:1039] (2/4) Epoch 15, batch 2350, loss[loss=0.2098, simple_loss=0.2822, pruned_loss=0.06866, over 23400.00 frames. ], tot_loss[loss=0.1906, simple_loss=0.2635, pruned_loss=0.05882, over 4693777.43 frames. ], batch size: 93, lr: 6.91e-03, grad_scale: 8.0 2023-09-29 22:36:33,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 22:36:41,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:36:42,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 22:36:47,061 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=511466.6666666667, ans=0.2 2023-09-29 22:36:48,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 22:36:50,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:36:55,435 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:36:55,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:36:55,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:36:55,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:36:56,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 22:37:01,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:37:07,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 22:37:09,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:37:12,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 22:37:12,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:37:13,874 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=511600.0, ans=0.125 2023-09-29 22:37:15,138 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:37:17,264 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 22:37:17,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:37:19,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:37:19,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:37:19,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:37:24,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:37:28,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 22:37:28,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:37:30,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:37:30,396 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=511666.6666666667, ans=0.0 2023-09-29 22:37:31,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:37:33,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 22:37:33,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:37:36,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 22:37:36,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 22:37:39,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 22:37:44,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 22:37:44,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:37:44,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 22:37:44,387 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 22:37:44,716 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=511733.3333333333, ans=0.125 2023-09-29 22:37:45,761 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 22:37:46,199 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=511733.3333333333, ans=0.2 2023-09-29 22:37:48,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 22:37:52,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:37:56,456 INFO [train.py:1039] (2/4) Epoch 15, batch 2400, loss[loss=0.1817, simple_loss=0.256, pruned_loss=0.05366, over 24459.00 frames. ], tot_loss[loss=0.1897, simple_loss=0.2628, pruned_loss=0.05829, over 4698740.55 frames. ], batch size: 63, lr: 6.91e-03, grad_scale: 16.0 2023-09-29 22:37:58,037 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:38:01,190 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:38:01,993 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.60 vs. limit=22.5 2023-09-29 22:38:03,208 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.881e+02 2.096e+02 2.397e+02 4.111e+02, threshold=4.192e+02, percent-clipped=0.0 2023-09-29 22:38:03,370 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:38:03,440 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 22:38:03,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 22:38:07,081 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.45 vs. limit=15.0 2023-09-29 22:38:12,647 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 22:38:12,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:38:12,866 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.min_positive, batch_count=511866.6666666667, ans=0.025 2023-09-29 22:38:15,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 22:38:15,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:38:17,049 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:38:17,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 22:38:23,320 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:38:24,179 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=511866.6666666667, ans=0.1 2023-09-29 22:38:25,549 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 22:38:30,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 22:38:37,092 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 22:38:38,075 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.11 vs. limit=6.0 2023-09-29 22:38:38,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:38:40,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:38:43,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:38:43,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 22:38:43,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 22:38:47,067 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=512000.0, ans=0.2 2023-09-29 22:38:51,498 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:38:53,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:38:56,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:38:58,377 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:38:58,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 22:38:58,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:38:58,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:38:58,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:39:00,507 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 22:39:05,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:39:07,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 22:39:07,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 22:39:08,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 22:39:10,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:39:10,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:39:10,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 22:39:11,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 22:39:11,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 22:39:11,949 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 22:39:13,445 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 22:39:14,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:39:16,467 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:39:16,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:39:17,954 INFO [train.py:1039] (2/4) Epoch 15, batch 2450, loss[loss=0.1773, simple_loss=0.2575, pruned_loss=0.04857, over 24458.00 frames. ], tot_loss[loss=0.1885, simple_loss=0.2609, pruned_loss=0.05806, over 4702915.75 frames. ], batch size: 66, lr: 6.91e-03, grad_scale: 16.0 2023-09-29 22:39:18,107 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 22:39:18,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:39:19,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 22:39:22,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 22:39:22,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:39:27,552 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:39:27,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:39:29,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 22:39:31,445 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=512133.3333333333, ans=0.125 2023-09-29 22:39:33,061 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=512200.0, ans=0.0 2023-09-29 22:39:34,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:39:34,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:39:37,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:39:37,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:39:37,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:39:39,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 22:39:43,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:39:46,161 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 22:39:47,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:39:52,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 22:39:52,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:39:54,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:39:55,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:39:57,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 22:39:57,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:40:07,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:40:09,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:40:09,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:40:09,555 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:40:11,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:40:12,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:40:12,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 22:40:14,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:40:15,172 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=512333.3333333333, ans=0.07 2023-09-29 22:40:15,381 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.92 vs. limit=6.0 2023-09-29 22:40:16,363 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:40:20,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:40:20,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:40:24,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:40:24,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 22:40:26,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:40:26,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:40:27,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 22:40:29,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:40:29,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:40:32,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:40:34,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:40:35,678 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:40:39,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 22:40:41,385 INFO [train.py:1039] (2/4) Epoch 15, batch 2500, loss[loss=0.1845, simple_loss=0.25, pruned_loss=0.05953, over 23907.00 frames. ], tot_loss[loss=0.1872, simple_loss=0.2594, pruned_loss=0.05751, over 4684352.09 frames. ], batch size: 179, lr: 6.91e-03, grad_scale: 16.0 2023-09-29 22:40:41,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:40:47,474 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=512466.6666666667, ans=0.07 2023-09-29 22:40:48,499 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.863e+02 2.026e+02 2.249e+02 3.310e+02, threshold=4.053e+02, percent-clipped=0.0 2023-09-29 22:40:48,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:40:58,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:40:58,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:41:00,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:41:00,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 22:41:05,146 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=512533.3333333333, ans=0.1 2023-09-29 22:41:07,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 22:41:08,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:41:08,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 22:41:08,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 22:41:10,095 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 22:41:11,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:41:11,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:41:11,837 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 22:41:13,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:41:13,429 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 22:41:14,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:41:20,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:41:20,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:41:22,560 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 22:41:24,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 22:41:25,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:41:27,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:41:29,469 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=512600.0, ans=0.125 2023-09-29 22:41:32,015 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:41:33,784 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=512666.6666666667, ans=0.125 2023-09-29 22:41:36,484 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:41:39,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:41:45,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 22:41:46,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 22:41:48,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:41:48,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 22:41:50,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:41:50,043 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 22:41:50,193 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 22:41:50,194 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 22:41:50,213 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 22:41:54,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:41:56,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 22:41:56,076 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 22:41:57,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:41:59,041 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 22:42:03,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 22:42:04,928 INFO [train.py:1039] (2/4) Epoch 15, batch 2550, loss[loss=0.1896, simple_loss=0.272, pruned_loss=0.05364, over 24435.00 frames. ], tot_loss[loss=0.1871, simple_loss=0.2593, pruned_loss=0.05748, over 4680713.32 frames. ], batch size: 77, lr: 6.90e-03, grad_scale: 16.0 2023-09-29 22:42:07,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:42:10,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:42:10,471 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:42:13,548 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:42:15,122 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 22:42:15,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 22:42:19,588 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 22:42:21,146 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:42:21,821 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.76 vs. limit=15.0 2023-09-29 22:42:24,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:42:26,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:42:26,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 22:42:28,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 22:42:28,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:42:29,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:42:32,043 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:42:32,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 22:42:32,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 22:42:32,150 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:42:32,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 22:42:37,318 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.06 vs. limit=22.5 2023-09-29 22:42:45,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:42:51,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:42:51,164 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:42:51,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:42:52,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 22:42:58,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:42:58,376 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 22:43:01,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 22:43:01,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:43:03,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:43:03,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 22:43:03,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 22:43:06,415 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.98 vs. limit=6.0 2023-09-29 22:43:07,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:43:07,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:43:10,723 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=513066.6666666667, ans=0.125 2023-09-29 22:43:11,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:43:11,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 22:43:11,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:43:11,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:43:13,586 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:43:13,774 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=513066.6666666667, ans=0.2 2023-09-29 22:43:15,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 22:43:18,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:43:18,726 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=513066.6666666667, ans=0.125 2023-09-29 22:43:23,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:43:24,961 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:43:27,862 INFO [train.py:1039] (2/4) Epoch 15, batch 2600, loss[loss=0.2066, simple_loss=0.2703, pruned_loss=0.07148, over 23579.00 frames. ], tot_loss[loss=0.188, simple_loss=0.2605, pruned_loss=0.05772, over 4698487.25 frames. ], batch size: 256, lr: 6.90e-03, grad_scale: 16.0 2023-09-29 22:43:28,719 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 22:43:32,901 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 22:43:32,937 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:43:34,983 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 1.927e+02 2.129e+02 2.377e+02 3.619e+02, threshold=4.257e+02, percent-clipped=0.0 2023-09-29 22:43:35,096 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 22:43:35,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 22:43:35,267 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 22:43:39,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:43:39,737 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 22:43:39,926 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 22:43:42,025 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 22:43:42,208 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=513133.3333333333, ans=0.2 2023-09-29 22:43:45,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:43:45,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 22:43:45,505 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=513200.0, ans=0.125 2023-09-29 22:43:48,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 22:43:49,646 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 22:43:49,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 22:43:51,313 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 22:43:51,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 22:44:01,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:44:01,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:44:01,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:44:01,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 22:44:04,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:44:07,865 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=513266.6666666667, ans=0.125 2023-09-29 22:44:09,287 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=513266.6666666667, ans=0.0 2023-09-29 22:44:10,585 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 22:44:13,263 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 22:44:14,781 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=513266.6666666667, ans=0.125 2023-09-29 22:44:18,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:44:18,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:44:19,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 22:44:19,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:44:19,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:44:20,657 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.73 vs. limit=15.0 2023-09-29 22:44:21,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 22:44:24,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 22:44:24,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:44:26,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:44:29,848 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 22:44:29,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:44:29,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:44:30,096 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=513333.3333333333, ans=0.0 2023-09-29 22:44:33,354 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=513400.0, ans=0.2 2023-09-29 22:44:36,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:44:36,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:44:38,156 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 22:44:38,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:44:39,828 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:44:41,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:44:41,615 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=513400.0, ans=0.125 2023-09-29 22:44:46,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 22:44:48,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:44:50,509 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 22:44:51,260 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.17 vs. limit=22.5 2023-09-29 22:44:51,966 INFO [train.py:1039] (2/4) Epoch 15, batch 2650, loss[loss=0.1825, simple_loss=0.2546, pruned_loss=0.05526, over 24346.00 frames. ], tot_loss[loss=0.1884, simple_loss=0.2608, pruned_loss=0.05797, over 4711495.61 frames. ], batch size: 56, lr: 6.90e-03, grad_scale: 16.0 2023-09-29 22:44:55,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 22:44:55,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:44:56,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 22:44:58,129 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 22:44:58,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:45:00,350 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.13 vs. limit=22.5 2023-09-29 22:45:01,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:45:03,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 22:45:05,026 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=513466.6666666667, ans=0.2 2023-09-29 22:45:06,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:45:06,507 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:45:07,128 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.25 vs. limit=10.0 2023-09-29 22:45:07,568 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.22 vs. limit=15.0 2023-09-29 22:45:08,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 22:45:08,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 22:45:08,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:45:11,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 22:45:13,363 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 22:45:16,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:45:17,859 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 22:45:19,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:45:20,113 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 22:45:23,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:45:23,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 22:45:25,317 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:45:25,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:45:30,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 22:45:31,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 22:45:33,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 22:45:33,515 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=513600.0, ans=0.2 2023-09-29 22:45:38,440 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 22:45:38,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:45:39,907 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:45:39,975 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 22:45:40,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:45:41,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:45:43,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:45:44,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:45:45,334 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:45:46,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 22:45:48,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:45:49,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:45:49,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:45:51,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:45:52,084 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.92 vs. limit=15.0 2023-09-29 22:45:52,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:45:52,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 22:45:57,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:45:58,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:45:58,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:45:58,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 22:46:03,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:46:06,358 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:46:07,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:46:07,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:46:09,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 22:46:11,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:46:11,882 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=513733.3333333333, ans=0.2 2023-09-29 22:46:13,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:46:14,395 INFO [train.py:1039] (2/4) Epoch 15, batch 2700, loss[loss=0.2034, simple_loss=0.2681, pruned_loss=0.06934, over 23435.00 frames. ], tot_loss[loss=0.1894, simple_loss=0.2618, pruned_loss=0.05851, over 4705571.15 frames. ], batch size: 119, lr: 6.90e-03, grad_scale: 16.0 2023-09-29 22:46:14,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 22:46:16,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:46:17,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 22:46:19,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:46:20,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:46:20,050 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:46:21,305 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.958e+02 2.156e+02 2.389e+02 4.797e+02, threshold=4.312e+02, percent-clipped=1.0 2023-09-29 22:46:21,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:46:21,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:46:22,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:46:23,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 22:46:23,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 22:46:24,341 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:46:25,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 22:46:27,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 22:46:28,805 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:46:34,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 22:46:34,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 22:46:34,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:46:40,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:46:40,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:46:47,649 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:46:47,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:46:49,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:46:49,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 22:46:50,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:46:52,697 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=513933.3333333333, ans=0.0 2023-09-29 22:46:53,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:46:53,892 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 22:46:53,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:46:54,800 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=513933.3333333333, ans=0.0 2023-09-29 22:46:59,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:46:59,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:47:08,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:47:08,442 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:47:12,792 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 22:47:12,795 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:47:17,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:47:17,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:47:19,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:47:20,978 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:47:22,446 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:47:22,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:47:25,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:47:26,273 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.32 vs. limit=6.0 2023-09-29 22:47:28,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:47:28,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:47:31,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 22:47:33,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:47:35,447 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=514133.3333333333, ans=0.125 2023-09-29 22:47:36,516 INFO [train.py:1039] (2/4) Epoch 15, batch 2750, loss[loss=0.1971, simple_loss=0.2744, pruned_loss=0.05987, over 23372.00 frames. ], tot_loss[loss=0.1885, simple_loss=0.2611, pruned_loss=0.05796, over 4707241.90 frames. ], batch size: 93, lr: 6.89e-03, grad_scale: 16.0 2023-09-29 22:47:36,647 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:47:36,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 22:47:38,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 22:47:38,435 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 22:47:40,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:47:42,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:47:42,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:47:44,405 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=514133.3333333333, ans=0.1 2023-09-29 22:47:45,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:47:45,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 22:47:47,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:47:50,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:47:50,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 22:47:51,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:47:51,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:47:51,727 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 22:47:51,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:47:51,907 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=514200.0, ans=0.125 2023-09-29 22:47:53,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:47:58,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 22:48:00,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:48:01,558 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:48:01,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:48:01,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 22:48:03,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:48:03,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:48:03,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:48:03,563 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=514200.0, ans=0.125 2023-09-29 22:48:04,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:48:09,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 22:48:11,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 22:48:11,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:48:12,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:48:14,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 22:48:20,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:48:23,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 22:48:23,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:48:26,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:48:26,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 22:48:28,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:48:35,933 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:48:35,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:48:35,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 22:48:39,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:48:42,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 22:48:43,992 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.58 vs. limit=12.0 2023-09-29 22:48:50,092 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 22:48:51,740 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:48:51,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 22:48:53,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:48:56,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:48:56,353 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 22:48:56,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:48:59,391 INFO [train.py:1039] (2/4) Epoch 15, batch 2800, loss[loss=0.1621, simple_loss=0.2407, pruned_loss=0.04178, over 24636.00 frames. ], tot_loss[loss=0.1875, simple_loss=0.2598, pruned_loss=0.05754, over 4705588.90 frames. ], batch size: 60, lr: 6.89e-03, grad_scale: 32.0 2023-09-29 22:48:59,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 22:48:59,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:49:00,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:49:02,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 22:49:02,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:49:02,509 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:49:05,773 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.774e+02 1.954e+02 2.291e+02 3.351e+02, threshold=3.907e+02, percent-clipped=0.0 2023-09-29 22:49:05,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:49:07,375 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 22:49:07,376 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 22:49:09,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:49:10,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 22:49:10,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:49:15,011 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=514533.3333333333, ans=0.125 2023-09-29 22:49:15,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:49:16,983 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.57 vs. limit=10.0 2023-09-29 22:49:17,623 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 22:49:19,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 22:49:20,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 22:49:22,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:49:22,910 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:49:22,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:49:27,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:49:28,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:49:28,024 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 22:49:28,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:49:39,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:49:39,672 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:49:42,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:49:42,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:49:44,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:49:47,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=514600.0, ans=0.2 2023-09-29 22:49:47,123 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=514600.0, ans=0.125 2023-09-29 22:49:48,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:49:48,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 22:49:50,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:49:51,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:49:51,655 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:49:54,730 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:49:56,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:50:00,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:50:00,509 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=514666.6666666667, ans=0.1 2023-09-29 22:50:01,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:50:01,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:50:01,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 22:50:01,882 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 22:50:03,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 22:50:04,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:50:04,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 22:50:04,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:50:05,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:50:05,072 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:50:08,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 22:50:10,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:50:10,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:50:10,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:50:13,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 22:50:18,381 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.28 vs. limit=12.0 2023-09-29 22:50:19,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:50:19,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 22:50:20,383 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=514733.3333333333, ans=0.125 2023-09-29 22:50:21,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:50:22,714 INFO [train.py:1039] (2/4) Epoch 15, batch 2850, loss[loss=0.1914, simple_loss=0.268, pruned_loss=0.05743, over 24697.00 frames. ], tot_loss[loss=0.1868, simple_loss=0.2593, pruned_loss=0.05715, over 4702509.19 frames. ], batch size: 65, lr: 6.89e-03, grad_scale: 16.0 2023-09-29 22:50:24,309 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:50:27,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:50:27,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:50:29,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:50:31,286 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:50:33,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:50:33,703 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=514800.0, ans=0.125 2023-09-29 22:50:35,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 22:50:35,459 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=514800.0, ans=0.0 2023-09-29 22:50:36,580 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 22:50:43,878 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 22:50:43,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:50:45,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 22:50:45,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:50:49,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 22:50:49,170 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 22:50:50,684 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:50:57,840 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=514933.3333333333, ans=0.125 2023-09-29 22:51:03,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:51:06,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:51:06,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:51:07,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 22:51:07,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 22:51:09,191 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 22:51:10,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 22:51:10,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 22:51:13,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 22:51:13,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:51:14,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:51:15,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:51:15,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:51:17,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:51:18,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:51:20,442 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:51:23,970 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:51:24,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:51:24,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:51:27,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:51:33,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:51:35,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 22:51:35,291 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 22:51:35,550 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=515066.6666666667, ans=0.2 2023-09-29 22:51:38,368 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 22:51:38,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:51:38,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 22:51:38,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:51:40,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:51:40,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:51:40,820 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:51:40,821 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 22:51:42,914 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 22:51:42,919 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:51:43,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:51:46,104 INFO [train.py:1039] (2/4) Epoch 15, batch 2900, loss[loss=0.1681, simple_loss=0.245, pruned_loss=0.04561, over 24306.00 frames. ], tot_loss[loss=0.1862, simple_loss=0.2588, pruned_loss=0.05679, over 4709607.32 frames. ], batch size: 61, lr: 6.89e-03, grad_scale: 16.0 2023-09-29 22:51:46,467 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=515133.3333333333, ans=0.125 2023-09-29 22:51:49,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 22:51:49,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:51:49,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:51:50,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 22:51:53,876 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.822e+02 2.046e+02 2.406e+02 3.211e+02, threshold=4.092e+02, percent-clipped=0.0 2023-09-29 22:51:54,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:51:54,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 22:51:55,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 22:51:57,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:51:57,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 22:52:00,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:52:02,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:52:03,875 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:52:05,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:52:08,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 22:52:10,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 22:52:10,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 22:52:11,391 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.49 vs. limit=15.0 2023-09-29 22:52:12,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:52:14,602 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=515200.0, ans=0.0 2023-09-29 22:52:15,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 22:52:15,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 22:52:20,357 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:52:20,361 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 22:52:20,411 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:52:20,655 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=515266.6666666667, ans=0.0 2023-09-29 22:52:21,064 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.50 vs. limit=6.0 2023-09-29 22:52:23,320 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:52:23,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 22:52:26,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:52:26,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:52:28,398 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=515266.6666666667, ans=0.09899494936611666 2023-09-29 22:52:31,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:52:32,821 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:52:33,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 22:52:33,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 22:52:33,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:52:38,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:52:38,512 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=515333.3333333333, ans=0.1 2023-09-29 22:52:40,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 22:52:41,818 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:52:47,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:52:56,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:52:56,568 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 22:52:58,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 22:53:01,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:53:01,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 22:53:02,779 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:53:02,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:53:07,431 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.55 vs. limit=15.0 2023-09-29 22:53:07,958 INFO [train.py:1039] (2/4) Epoch 15, batch 2950, loss[loss=0.1717, simple_loss=0.2496, pruned_loss=0.04692, over 24309.00 frames. ], tot_loss[loss=0.1875, simple_loss=0.26, pruned_loss=0.05747, over 4686972.51 frames. ], batch size: 61, lr: 6.89e-03, grad_scale: 16.0 2023-09-29 22:53:08,286 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=515466.6666666667, ans=0.1 2023-09-29 22:53:09,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:53:11,291 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 22:53:13,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:53:13,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:53:14,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:53:17,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:53:17,716 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 22:53:17,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 22:53:19,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 22:53:19,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:53:26,411 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:53:27,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:53:29,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:53:30,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:53:34,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:53:34,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:53:35,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:53:37,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:53:37,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:53:40,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 22:53:45,658 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 22:53:45,688 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 22:53:45,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 22:53:47,821 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 22:53:49,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 22:53:49,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:53:50,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:53:50,879 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 22:53:50,896 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 22:53:54,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 22:53:56,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:53:58,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:53:59,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:54:01,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:54:02,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:54:02,954 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 22:54:04,445 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:54:04,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 22:54:10,795 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:54:10,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:54:12,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 22:54:12,424 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:54:14,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 22:54:17,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:54:20,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:54:20,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:54:22,172 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:54:22,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 22:54:23,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:54:25,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:54:25,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:54:25,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 22:54:26,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:54:27,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:54:28,328 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.40 vs. limit=22.5 2023-09-29 22:54:29,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:54:29,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 22:54:31,023 INFO [train.py:1039] (2/4) Epoch 15, batch 3000, loss[loss=0.2398, simple_loss=0.2948, pruned_loss=0.09238, over 19307.00 frames. ], tot_loss[loss=0.1879, simple_loss=0.2612, pruned_loss=0.0573, over 4694761.06 frames. ], batch size: 388, lr: 6.88e-03, grad_scale: 16.0 2023-09-29 22:54:31,024 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-29 22:54:42,005 INFO [zipformer.py:1853] (2/4) name=encoder.encoders.4.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([1.9183, 2.4017, 3.2532, 3.0253], device='cuda:2') 2023-09-29 22:54:45,816 INFO [train.py:1071] (2/4) Epoch 15, validation: loss=0.2711, simple_loss=0.2767, pruned_loss=0.1327, over 1125622.00 frames. 2023-09-29 22:54:45,817 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21076MB 2023-09-29 22:54:46,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:54:51,066 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:54:51,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 22:54:53,998 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 1.966e+02 2.278e+02 2.682e+02 4.156e+02, threshold=4.556e+02, percent-clipped=1.0 2023-09-29 22:54:54,237 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 22:54:54,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 22:54:57,402 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:54:57,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:54:57,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 22:54:58,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:54:59,804 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=515800.0, ans=0.125 2023-09-29 22:55:06,253 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 22:55:06,325 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=515866.6666666667, ans=0.1 2023-09-29 22:55:16,400 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.62 vs. limit=15.0 2023-09-29 22:55:16,905 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:55:21,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 22:55:23,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 22:55:25,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 22:55:27,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:55:27,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:55:28,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:55:28,832 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 22:55:33,116 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 22:55:33,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:55:35,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 22:55:37,564 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 22:55:37,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:55:39,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:55:39,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:55:42,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 22:55:42,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:55:42,318 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:55:44,055 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=516000.0, ans=0.125 2023-09-29 22:55:45,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:55:46,833 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 22:55:48,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:55:48,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:55:48,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:55:51,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:55:53,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:55:54,665 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 22:55:54,723 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 22:55:54,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:55:54,815 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 22:55:56,267 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:55:58,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 22:56:02,785 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 22:56:02,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 22:56:02,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 22:56:05,815 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 22:56:05,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 22:56:07,788 INFO [train.py:1039] (2/4) Epoch 15, batch 3050, loss[loss=0.1908, simple_loss=0.2547, pruned_loss=0.06342, over 23478.00 frames. ], tot_loss[loss=0.1877, simple_loss=0.2617, pruned_loss=0.0569, over 4710875.18 frames. ], batch size: 134, lr: 6.88e-03, grad_scale: 16.0 2023-09-29 22:56:07,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:56:08,269 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=516133.3333333333, ans=0.0 2023-09-29 22:56:10,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:56:10,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 22:56:10,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:56:11,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:56:13,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 22:56:15,481 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:56:18,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:56:18,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:56:21,670 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:56:22,784 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.67 vs. limit=8.0 2023-09-29 22:56:24,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 22:56:29,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 22:56:29,591 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 22:56:31,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:56:36,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:56:37,837 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:56:37,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:56:39,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:56:44,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:56:45,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 22:56:47,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:56:47,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:56:47,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:56:47,966 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:56:50,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:56:52,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:56:54,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 22:56:55,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:56:55,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 22:56:57,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:56:58,835 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:56:58,926 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:57:00,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:57:06,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:57:06,375 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:57:07,284 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.56 vs. limit=6.0 2023-09-29 22:57:13,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:57:14,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:57:14,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:57:16,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:57:16,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 22:57:16,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:57:18,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 22:57:19,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:57:19,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:57:19,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 22:57:23,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:57:29,759 INFO [train.py:1039] (2/4) Epoch 15, batch 3100, loss[loss=0.1989, simple_loss=0.2659, pruned_loss=0.06591, over 23407.00 frames. ], tot_loss[loss=0.1881, simple_loss=0.2623, pruned_loss=0.05694, over 4715134.51 frames. ], batch size: 119, lr: 6.88e-03, grad_scale: 16.0 2023-09-29 22:57:29,817 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:57:31,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:57:33,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 22:57:34,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 22:57:37,780 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 1.874e+02 2.072e+02 2.284e+02 2.890e+02, threshold=4.143e+02, percent-clipped=0.0 2023-09-29 22:57:37,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 22:57:40,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 22:57:40,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:57:44,622 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:57:46,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:57:47,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 22:57:54,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:57:59,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 22:58:05,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 22:58:05,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:58:05,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:58:07,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:58:08,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 22:58:10,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:58:10,177 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 22:58:10,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:58:10,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:58:11,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 22:58:13,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:58:16,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:58:17,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 22:58:20,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 22:58:21,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:58:21,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:58:23,851 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:58:23,868 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:58:23,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:58:26,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 22:58:26,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:58:29,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:58:29,210 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:58:29,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:58:29,222 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 22:58:34,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:58:34,450 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=516733.3333333333, ans=0.125 2023-09-29 22:58:35,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 22:58:37,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:58:38,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 22:58:39,153 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.min_positive, batch_count=516733.3333333333, ans=0.05 2023-09-29 22:58:40,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:58:40,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:58:41,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 22:58:50,877 INFO [train.py:1039] (2/4) Epoch 15, batch 3150, loss[loss=0.1792, simple_loss=0.2457, pruned_loss=0.05636, over 23896.00 frames. ], tot_loss[loss=0.1869, simple_loss=0.2611, pruned_loss=0.05641, over 4714806.99 frames. ], batch size: 195, lr: 6.88e-03, grad_scale: 16.0 2023-09-29 22:58:51,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 22:58:52,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:58:53,091 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=516800.0, ans=0.0 2023-09-29 22:58:54,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:58:56,927 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:58:56,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:58:58,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 22:59:00,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:59:00,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 22:59:00,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 22:59:03,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:59:04,004 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=516800.0, ans=0.05 2023-09-29 22:59:05,981 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 22:59:10,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 22:59:10,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:59:11,715 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 22:59:11,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 22:59:13,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 22:59:14,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 22:59:14,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 22:59:14,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:59:14,860 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:59:16,443 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:59:19,202 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 22:59:19,617 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=516866.6666666667, ans=0.1 2023-09-29 22:59:20,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:59:21,637 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.02 vs. limit=15.0 2023-09-29 22:59:22,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:59:22,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:59:23,921 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 22:59:27,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 22:59:27,265 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:59:29,621 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:59:31,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:59:31,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 22:59:34,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 22:59:34,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:59:36,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 22:59:36,383 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 22:59:37,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:59:37,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 22:59:38,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 22:59:38,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 22:59:40,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 22:59:40,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 22:59:40,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:59:43,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:59:43,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:59:44,667 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 22:59:44,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:59:46,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 22:59:46,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:59:46,859 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=517000.0, ans=0.2 2023-09-29 22:59:47,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 22:59:49,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 22:59:49,593 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:59:51,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:59:51,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 22:59:52,612 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 22:59:52,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:59:57,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:59:58,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:59:58,678 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:59:58,870 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=517066.6666666667, ans=0.1 2023-09-29 23:00:05,897 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 23:00:05,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:00:09,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 23:00:11,601 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=517066.6666666667, ans=0.2 2023-09-29 23:00:12,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:00:12,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 23:00:14,361 INFO [train.py:1039] (2/4) Epoch 15, batch 3200, loss[loss=0.16, simple_loss=0.2351, pruned_loss=0.04246, over 20164.00 frames. ], tot_loss[loss=0.1857, simple_loss=0.2592, pruned_loss=0.05609, over 4701119.83 frames. ], batch size: 44, lr: 6.87e-03, grad_scale: 32.0 2023-09-29 23:00:16,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:00:17,711 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:00:17,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 23:00:20,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:00:22,256 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 1.845e+02 1.990e+02 2.356e+02 4.554e+02, threshold=3.981e+02, percent-clipped=2.0 2023-09-29 23:00:23,993 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:00:28,624 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:00:36,467 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 23:00:37,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 23:00:49,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 23:00:52,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:00:55,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 23:00:56,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 23:00:56,950 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=517266.6666666667, ans=0.125 2023-09-29 23:01:01,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:01:01,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 23:01:03,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:01:04,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 23:01:07,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 23:01:07,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 23:01:12,236 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 23:01:15,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:01:17,796 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=517333.3333333333, ans=0.0 2023-09-29 23:01:20,890 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=517400.0, ans=0.125 2023-09-29 23:01:22,132 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:01:22,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:01:22,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:01:23,802 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 23:01:23,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 23:01:27,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:01:27,249 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 23:01:28,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 23:01:28,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 23:01:30,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 23:01:32,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:01:33,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 23:01:35,046 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 23:01:35,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:01:35,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:01:36,524 INFO [train.py:1039] (2/4) Epoch 15, batch 3250, loss[loss=0.185, simple_loss=0.26, pruned_loss=0.05503, over 23227.00 frames. ], tot_loss[loss=0.1856, simple_loss=0.2596, pruned_loss=0.0558, over 4710638.19 frames. ], batch size: 93, lr: 6.87e-03, grad_scale: 16.0 2023-09-29 23:01:36,694 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 23:01:38,418 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=517466.6666666667, ans=0.125 2023-09-29 23:01:43,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:01:45,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:01:54,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:01:54,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 23:01:54,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:01:55,783 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:01:55,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:01:57,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 23:01:57,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:02:00,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:02:00,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 23:02:01,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:02:01,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:02:01,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:02:01,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:02:03,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:02:06,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 23:02:09,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:02:09,718 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:02:11,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:02:11,248 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:02:11,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:02:16,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 23:02:16,711 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=517600.0, ans=0.0 2023-09-29 23:02:16,844 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=517600.0, ans=0.1 2023-09-29 23:02:18,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:02:18,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:02:20,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:02:20,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:02:28,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:02:30,576 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=517666.6666666667, ans=0.125 2023-09-29 23:02:30,738 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.87 vs. limit=6.0 2023-09-29 23:02:38,079 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:02:38,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:02:38,144 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 23:02:38,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:02:38,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 23:02:39,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:02:41,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 23:02:42,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 23:02:42,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:02:44,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:02:44,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:02:45,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 23:02:45,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:02:46,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=517733.3333333333, ans=0.125 2023-09-29 23:02:49,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:02:49,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:02:51,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 23:02:51,274 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:02:53,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 23:02:53,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 23:02:58,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:02:58,649 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 23:02:59,919 INFO [train.py:1039] (2/4) Epoch 15, batch 3300, loss[loss=0.1842, simple_loss=0.2537, pruned_loss=0.05734, over 23606.00 frames. ], tot_loss[loss=0.1865, simple_loss=0.2601, pruned_loss=0.05641, over 4701071.11 frames. ], batch size: 232, lr: 6.87e-03, grad_scale: 16.0 2023-09-29 23:03:02,183 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 23:03:03,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 23:03:03,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:03:08,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:03:09,621 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.939e+02 2.168e+02 2.538e+02 3.579e+02, threshold=4.337e+02, percent-clipped=0.0 2023-09-29 23:03:09,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:03:09,905 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:03:11,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 23:03:11,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 23:03:16,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:03:18,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:03:22,066 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 23:03:22,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:03:22,204 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:03:23,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:03:23,835 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 23:03:25,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:03:26,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 23:03:28,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:03:28,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:03:28,278 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 23:03:28,979 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.46 vs. limit=15.0 2023-09-29 23:03:32,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:03:32,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 23:03:34,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:03:34,882 WARNING [train.py:1197] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 23:03:36,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 23:03:36,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:03:38,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:03:41,188 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 23:03:41,511 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=517933.3333333333, ans=0.04949747468305833 2023-09-29 23:03:42,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 23:03:44,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:03:47,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 23:03:48,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:03:50,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 23:03:50,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:03:52,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:03:53,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:03:53,775 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:03:53,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 23:03:55,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:03:55,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:03:56,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:03:59,841 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 23:04:00,220 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=518000.0, ans=0.0 2023-09-29 23:04:00,653 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.56 vs. limit=22.5 2023-09-29 23:04:01,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 23:04:03,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 23:04:04,974 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:04:04,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:04:07,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:04:07,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:04:11,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 23:04:11,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:04:11,120 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 23:04:12,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:04:14,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 23:04:15,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 23:04:17,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:04:18,722 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:04:20,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:04:20,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:04:21,764 INFO [train.py:1039] (2/4) Epoch 15, batch 3350, loss[loss=0.1707, simple_loss=0.2586, pruned_loss=0.04134, over 24676.00 frames. ], tot_loss[loss=0.1879, simple_loss=0.2615, pruned_loss=0.05712, over 4702325.77 frames. ], batch size: 73, lr: 6.87e-03, grad_scale: 16.0 2023-09-29 23:04:21,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:04:24,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:04:24,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:04:26,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:04:29,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:04:30,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:04:32,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:04:35,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:04:36,399 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.83 vs. limit=15.0 2023-09-29 23:04:37,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:04:39,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:04:40,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 23:04:42,698 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 23:04:42,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:04:47,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 23:04:47,187 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 23:04:48,711 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 23:04:48,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:04:50,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:04:50,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 23:04:50,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:04:51,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:04:52,089 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:04:54,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:04:55,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:04:56,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:04:59,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:04:59,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:05:01,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:05:05,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:05:07,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:05:10,156 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:05:10,181 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:05:11,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:05:15,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 23:05:15,465 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 23:05:15,506 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 23:05:15,588 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:05:17,712 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 23:05:20,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:05:21,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:05:28,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:05:29,810 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 23:05:31,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 23:05:31,351 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:05:32,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:05:39,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:05:40,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 23:05:42,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 23:05:42,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:05:42,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:05:43,755 INFO [train.py:1039] (2/4) Epoch 15, batch 3400, loss[loss=0.2045, simple_loss=0.2923, pruned_loss=0.0583, over 24641.00 frames. ], tot_loss[loss=0.189, simple_loss=0.2629, pruned_loss=0.05753, over 4713850.80 frames. ], batch size: 73, lr: 6.87e-03, grad_scale: 16.0 2023-09-29 23:05:43,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 23:05:43,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:05:43,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 23:05:46,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:05:46,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:05:46,988 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 23:05:48,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:05:48,544 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 23:05:54,290 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 1.906e+02 2.208e+02 2.568e+02 3.814e+02, threshold=4.417e+02, percent-clipped=0.0 2023-09-29 23:05:54,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 23:05:54,421 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 23:05:54,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:05:59,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:05:59,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 23:06:00,786 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:06:02,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 23:06:02,513 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=518533.3333333333, ans=0.0 2023-09-29 23:06:06,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:06:09,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 23:06:14,817 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 23:06:16,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:06:16,441 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:06:16,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 23:06:26,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:06:31,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 23:06:37,928 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:06:38,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:06:39,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 23:06:39,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:06:40,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:06:41,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:06:42,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:06:46,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:06:48,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 23:06:48,574 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:06:54,815 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:06:56,381 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 23:07:02,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 23:07:05,196 INFO [train.py:1039] (2/4) Epoch 15, batch 3450, loss[loss=0.2029, simple_loss=0.2827, pruned_loss=0.0616, over 24414.00 frames. ], tot_loss[loss=0.1893, simple_loss=0.2631, pruned_loss=0.05778, over 4717600.00 frames. ], batch size: 77, lr: 6.86e-03, grad_scale: 16.0 2023-09-29 23:07:05,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 23:07:09,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 23:07:10,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:07:12,443 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.63 vs. limit=12.0 2023-09-29 23:07:13,058 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:07:13,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 23:07:13,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:07:16,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 23:07:18,703 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.71 vs. limit=15.0 2023-09-29 23:07:20,264 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.35 vs. limit=15.0 2023-09-29 23:07:20,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:07:22,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:07:23,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:07:23,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:07:25,688 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=518866.6666666667, ans=0.125 2023-09-29 23:07:26,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:07:33,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 23:07:39,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 23:07:39,167 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 23:07:39,236 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:07:40,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:07:46,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 23:07:48,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 23:07:53,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:07:53,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:07:53,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 23:07:54,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:07:57,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 23:07:57,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:07:59,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:07:59,644 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=519000.0, ans=0.125 2023-09-29 23:08:02,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:08:04,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 23:08:10,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:08:16,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:08:17,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:08:19,595 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:08:21,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:08:22,985 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:08:23,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:08:23,149 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:08:25,821 INFO [train.py:1039] (2/4) Epoch 15, batch 3500, loss[loss=0.177, simple_loss=0.2473, pruned_loss=0.05333, over 23597.00 frames. ], tot_loss[loss=0.1884, simple_loss=0.2615, pruned_loss=0.05761, over 4713386.41 frames. ], batch size: 120, lr: 6.86e-03, grad_scale: 16.0 2023-09-29 23:08:28,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:08:32,148 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:08:33,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 23:08:35,732 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.501e+02 1.923e+02 2.112e+02 2.557e+02 4.010e+02, threshold=4.224e+02, percent-clipped=0.0 2023-09-29 23:08:35,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 23:08:39,366 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 23:08:41,189 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=519200.0, ans=0.0 2023-09-29 23:08:41,284 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=519200.0, ans=0.0 2023-09-29 23:08:43,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:08:43,064 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 23:08:47,706 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:08:49,199 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:08:50,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:08:50,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:08:50,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 23:08:51,150 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 23:08:52,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:08:52,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:08:52,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 23:08:52,659 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=519200.0, ans=0.0 2023-09-29 23:08:52,785 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=519200.0, ans=0.0 2023-09-29 23:08:54,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:08:55,509 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 23:08:55,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:09:00,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:09:01,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 23:09:01,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:09:04,798 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:09:06,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:09:08,357 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:09:11,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:09:11,312 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:09:12,855 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 23:09:15,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 23:09:15,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 23:09:16,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:09:18,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:09:18,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:09:19,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 23:09:21,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 23:09:21,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:09:26,291 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:09:27,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 23:09:27,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 23:09:27,954 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:09:28,203 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=519333.3333333333, ans=0.2 2023-09-29 23:09:30,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:09:32,403 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:09:33,991 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:09:37,008 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 23:09:38,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:09:38,826 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=519400.0, ans=0.0 2023-09-29 23:09:40,081 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:09:42,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 23:09:43,749 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 23:09:46,116 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=519466.6666666667, ans=0.2 2023-09-29 23:09:47,183 INFO [train.py:1039] (2/4) Epoch 15, batch 3550, loss[loss=0.1763, simple_loss=0.2611, pruned_loss=0.04582, over 24267.00 frames. ], tot_loss[loss=0.1859, simple_loss=0.2592, pruned_loss=0.05629, over 4710147.90 frames. ], batch size: 74, lr: 6.86e-03, grad_scale: 16.0 2023-09-29 23:09:47,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:09:47,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:09:47,651 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=519466.6666666667, ans=0.1 2023-09-29 23:09:48,872 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:09:48,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:09:54,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:10:03,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:10:05,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 23:10:08,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:10:09,703 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:10:11,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:10:12,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:10:12,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 23:10:15,847 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:10:15,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:10:18,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:10:18,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 23:10:18,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 23:10:25,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:10:25,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:10:27,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:10:27,965 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:10:29,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:10:29,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 23:10:29,554 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:10:29,817 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=519600.0, ans=0.0 2023-09-29 23:10:31,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:10:32,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 23:10:38,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:10:38,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:10:40,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:10:41,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 23:10:41,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 23:10:43,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 23:10:44,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:10:46,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:10:47,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:10:49,382 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 23:10:51,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:11:00,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:11:00,228 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 23:11:01,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:11:03,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:11:03,758 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=519733.3333333333, ans=0.1 2023-09-29 23:11:03,855 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=519733.3333333333, ans=0.0 2023-09-29 23:11:04,391 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.47 vs. limit=22.5 2023-09-29 23:11:05,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 23:11:09,403 INFO [train.py:1039] (2/4) Epoch 15, batch 3600, loss[loss=0.2123, simple_loss=0.2729, pruned_loss=0.07585, over 22760.00 frames. ], tot_loss[loss=0.1862, simple_loss=0.2597, pruned_loss=0.0564, over 4723857.26 frames. ], batch size: 322, lr: 6.86e-03, grad_scale: 16.0 2023-09-29 23:11:12,605 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 23:11:12,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:11:12,953 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=519800.0, ans=0.125 2023-09-29 23:11:14,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:11:15,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:11:17,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:11:17,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:11:20,429 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.599e+02 1.981e+02 2.241e+02 2.559e+02 3.675e+02, threshold=4.482e+02, percent-clipped=0.0 2023-09-29 23:11:20,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:11:22,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:11:23,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:11:25,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:11:25,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:11:25,897 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 23:11:29,878 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=519866.6666666667, ans=0.125 2023-09-29 23:11:30,966 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 23:11:33,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:11:35,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:11:38,659 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:11:38,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 23:11:39,035 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=519866.6666666667, ans=0.1 2023-09-29 23:11:40,412 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:11:40,444 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 23:11:40,557 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:11:42,281 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer_ff3.min_abs, batch_count=519933.3333333333, ans=0.2 2023-09-29 23:11:43,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:11:43,690 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:11:45,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:11:48,393 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:11:49,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:11:51,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 23:11:57,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:12:00,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 23:12:00,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 23:12:02,428 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=520000.0, ans=0.0 2023-09-29 23:12:07,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:12:13,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:12:16,693 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:12:22,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 23:12:22,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:12:23,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 23:12:24,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 23:12:26,188 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 23:12:27,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:12:29,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:12:30,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 23:12:30,710 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:12:30,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:12:30,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:12:32,129 INFO [train.py:1039] (2/4) Epoch 15, batch 3650, loss[loss=0.1679, simple_loss=0.2443, pruned_loss=0.04577, over 24524.00 frames. ], tot_loss[loss=0.1864, simple_loss=0.2601, pruned_loss=0.0563, over 4732773.83 frames. ], batch size: 63, lr: 6.86e-03, grad_scale: 16.0 2023-09-29 23:12:32,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 23:12:33,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 23:12:38,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:12:39,389 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 23:12:43,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 23:12:44,890 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:12:48,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 23:12:50,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 23:12:54,885 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:12:54,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 23:12:54,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:12:55,323 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=520200.0, ans=0.95 2023-09-29 23:12:59,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 23:12:59,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:12:59,864 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 23:13:01,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 23:13:01,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:13:02,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:13:02,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 23:13:04,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 23:13:05,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:13:05,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:13:07,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:13:11,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 23:13:13,022 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 23:13:14,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:13:16,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 23:13:18,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:13:18,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:13:23,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 23:13:23,403 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=520333.3333333333, ans=0.0 2023-09-29 23:13:25,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:13:26,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 23:13:28,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 23:13:28,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:13:31,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:13:34,234 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:13:35,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:13:35,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:13:36,109 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=520333.3333333333, ans=0.1 2023-09-29 23:13:37,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 23:13:37,624 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:13:37,832 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=520400.0, ans=0.0 2023-09-29 23:13:39,125 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:13:47,147 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 23:13:50,825 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:13:50,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:13:53,001 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 23:13:53,078 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:13:53,422 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=520400.0, ans=0.125 2023-09-29 23:13:53,423 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=520400.0, ans=0.125 2023-09-29 23:13:54,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 23:13:55,832 INFO [train.py:1039] (2/4) Epoch 15, batch 3700, loss[loss=0.185, simple_loss=0.2488, pruned_loss=0.06056, over 23679.00 frames. ], tot_loss[loss=0.1875, simple_loss=0.261, pruned_loss=0.05701, over 4735133.45 frames. ], batch size: 149, lr: 6.85e-03, grad_scale: 16.0 2023-09-29 23:13:57,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:13:57,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 23:13:58,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:14:02,656 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 23:14:04,309 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:14:04,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:14:07,295 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 1.795e+02 1.978e+02 2.320e+02 3.492e+02, threshold=3.956e+02, percent-clipped=0.0 2023-09-29 23:14:07,444 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:14:07,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 23:14:07,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:14:09,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 23:14:09,060 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 23:14:10,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 23:14:13,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:14:15,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:14:17,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:14:17,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:14:17,804 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=520533.3333333333, ans=0.125 2023-09-29 23:14:18,728 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 23:14:20,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:14:21,253 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=21.26 vs. limit=22.5 2023-09-29 23:14:23,348 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 23:14:30,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:14:30,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 23:14:31,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:14:32,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 23:14:32,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:14:37,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:14:39,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 23:14:39,167 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:14:40,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:14:43,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:14:43,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 23:14:46,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 23:14:48,660 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=520666.6666666667, ans=0.2 2023-09-29 23:14:53,505 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:14:53,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 23:14:53,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:14:53,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 23:14:58,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:14:58,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:15:02,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:15:04,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 23:15:05,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:15:05,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 23:15:05,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:15:05,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:15:09,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:15:11,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 23:15:12,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 23:15:13,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:15:13,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:15:15,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:15:17,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:15:18,590 INFO [train.py:1039] (2/4) Epoch 15, batch 3750, loss[loss=0.1885, simple_loss=0.2723, pruned_loss=0.05229, over 24438.00 frames. ], tot_loss[loss=0.1887, simple_loss=0.2615, pruned_loss=0.05799, over 4711920.67 frames. ], batch size: 69, lr: 6.85e-03, grad_scale: 16.0 2023-09-29 23:15:20,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:15:21,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:15:23,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:15:25,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 23:15:25,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 23:15:28,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 23:15:28,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 23:15:28,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:15:30,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:15:31,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:15:34,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:15:39,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:15:43,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:15:43,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 23:15:46,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:15:49,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:15:50,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 23:15:50,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:15:50,902 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 23:15:52,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:15:53,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:15:58,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 23:15:59,192 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=520933.3333333333, ans=0.1 2023-09-29 23:16:00,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 23:16:02,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:16:02,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:16:05,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:16:10,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:16:10,450 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=521000.0, ans=0.125 2023-09-29 23:16:11,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 23:16:12,597 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.71 vs. limit=15.0 2023-09-29 23:16:17,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 23:16:19,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:16:22,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:16:23,341 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.36 vs. limit=22.5 2023-09-29 23:16:23,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:16:25,567 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 23:16:29,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 23:16:30,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 23:16:33,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 23:16:35,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:16:36,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 23:16:42,049 INFO [train.py:1039] (2/4) Epoch 15, batch 3800, loss[loss=0.1757, simple_loss=0.2644, pruned_loss=0.04353, over 24634.00 frames. ], tot_loss[loss=0.1892, simple_loss=0.2616, pruned_loss=0.05836, over 4707857.68 frames. ], batch size: 68, lr: 6.85e-03, grad_scale: 16.0 2023-09-29 23:16:48,881 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:16:52,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:16:53,956 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.649e+02 1.926e+02 2.197e+02 2.572e+02 3.793e+02, threshold=4.394e+02, percent-clipped=0.0 2023-09-29 23:16:54,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 23:16:55,677 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 23:16:57,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:16:58,771 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:16:58,907 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 23:16:59,173 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=521200.0, ans=0.0 2023-09-29 23:17:02,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 23:17:02,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:17:03,749 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:17:05,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:17:05,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:17:05,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:17:07,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 23:17:09,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 23:17:11,643 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:17:14,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:17:16,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:17:16,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 23:17:20,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 23:17:20,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:17:22,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:17:23,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:17:23,907 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=521266.6666666667, ans=0.125 2023-09-29 23:17:28,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 23:17:28,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 23:17:32,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:17:37,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:17:42,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:17:42,596 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=521333.3333333333, ans=0.125 2023-09-29 23:17:43,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 23:17:45,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 23:17:45,569 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:17:48,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:17:50,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:17:52,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 23:17:56,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 23:17:56,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 23:17:56,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:17:58,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:18:02,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:18:04,485 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 23:18:05,914 INFO [train.py:1039] (2/4) Epoch 15, batch 3850, loss[loss=0.1791, simple_loss=0.2601, pruned_loss=0.04909, over 24668.00 frames. ], tot_loss[loss=0.1879, simple_loss=0.2602, pruned_loss=0.05776, over 4699021.92 frames. ], batch size: 65, lr: 6.85e-03, grad_scale: 8.0 2023-09-29 23:18:07,163 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=521466.6666666667, ans=0.125 2023-09-29 23:18:08,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:18:09,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 23:18:11,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:18:11,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:18:14,555 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 23:18:17,539 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:18:20,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 23:18:22,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 23:18:29,165 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:18:32,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:18:34,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:18:34,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:18:37,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:18:38,748 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:18:40,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:18:42,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:18:42,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:18:43,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:18:45,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:18:45,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:18:46,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 23:18:46,840 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 23:18:46,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:18:47,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:18:50,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:18:51,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:18:51,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 23:18:53,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 23:18:56,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:18:57,149 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 23:18:58,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 23:19:01,457 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=521666.6666666667, ans=0.125 2023-09-29 23:19:02,942 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=521666.6666666667, ans=0.0 2023-09-29 23:19:04,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:19:06,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:19:10,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:19:10,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 23:19:14,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 23:19:17,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:19:17,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:19:22,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 23:19:22,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 23:19:22,189 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:19:22,427 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=521733.3333333333, ans=0.125 2023-09-29 23:19:23,721 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:19:23,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:19:23,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 23:19:25,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:19:25,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 23:19:26,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:19:27,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:19:28,417 INFO [train.py:1039] (2/4) Epoch 15, batch 3900, loss[loss=0.1888, simple_loss=0.2623, pruned_loss=0.05759, over 23209.00 frames. ], tot_loss[loss=0.1865, simple_loss=0.2588, pruned_loss=0.05715, over 4704876.02 frames. ], batch size: 105, lr: 6.84e-03, grad_scale: 8.0 2023-09-29 23:19:30,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:19:30,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:19:32,021 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:19:32,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:19:32,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:19:33,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:19:33,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 23:19:33,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:19:38,932 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:19:40,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 23:19:40,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:19:41,905 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.448e+02 1.819e+02 2.036e+02 2.423e+02 3.835e+02, threshold=4.073e+02, percent-clipped=0.0 2023-09-29 23:19:42,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:19:45,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 23:19:45,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:19:47,069 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 23:19:49,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 23:19:49,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:19:50,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 23:19:50,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:19:50,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 23:19:52,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 23:19:57,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:19:58,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:19:58,605 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:20:00,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:20:05,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:20:05,702 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=521933.3333333333, ans=0.0 2023-09-29 23:20:06,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:20:09,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:20:09,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:20:10,008 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:20:18,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:20:18,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:20:25,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 23:20:27,311 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=522000.0, ans=0.07 2023-09-29 23:20:27,439 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=522000.0, ans=0.0 2023-09-29 23:20:28,597 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:20:34,363 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.35 vs. limit=6.0 2023-09-29 23:20:39,938 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:20:41,848 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=522066.6666666667, ans=0.1 2023-09-29 23:20:43,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:20:43,135 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 23:20:44,553 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 23:20:44,587 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:20:46,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 23:20:46,410 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=522066.6666666667, ans=0.1 2023-09-29 23:20:47,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:20:49,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 23:20:51,208 INFO [train.py:1039] (2/4) Epoch 15, batch 3950, loss[loss=0.2096, simple_loss=0.2692, pruned_loss=0.07505, over 23372.00 frames. ], tot_loss[loss=0.186, simple_loss=0.2589, pruned_loss=0.05655, over 4712616.08 frames. ], batch size: 285, lr: 6.84e-03, grad_scale: 8.0 2023-09-29 23:20:56,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:20:58,308 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 23:20:58,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:21:01,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:21:03,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:21:08,260 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 23:21:08,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:21:10,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 23:21:10,330 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 23:21:10,394 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:21:13,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:21:14,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 23:21:14,836 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:21:16,491 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 23:21:19,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:21:19,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:21:19,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:21:21,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:21:21,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:21:33,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:21:33,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:21:37,161 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=522266.6666666667, ans=0.0 2023-09-29 23:21:38,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=522266.6666666667, ans=0.2 2023-09-29 23:21:41,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 23:21:47,883 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 23:21:47,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 23:21:47,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:21:49,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:21:50,655 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.16 vs. limit=8.0 2023-09-29 23:21:53,261 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=522333.3333333333, ans=0.0 2023-09-29 23:21:54,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:21:54,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 23:21:56,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:21:56,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:21:56,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 23:22:03,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:22:04,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:22:05,955 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=522400.0, ans=0.125 2023-09-29 23:22:07,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 23:22:12,794 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=522400.0, ans=0.0 2023-09-29 23:22:15,387 INFO [train.py:1039] (2/4) Epoch 15, batch 4000, loss[loss=0.1898, simple_loss=0.2577, pruned_loss=0.06098, over 23466.00 frames. ], tot_loss[loss=0.1868, simple_loss=0.2599, pruned_loss=0.05688, over 4716660.50 frames. ], batch size: 285, lr: 6.84e-03, grad_scale: 16.0 2023-09-29 23:22:17,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:22:24,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:22:28,374 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.844e+02 2.082e+02 2.375e+02 3.458e+02, threshold=4.164e+02, percent-clipped=0.0 2023-09-29 23:22:28,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:22:30,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:22:32,169 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:22:32,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 23:22:33,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 23:22:35,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 23:22:35,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 23:22:35,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 23:22:36,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:22:40,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:22:40,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:22:40,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:22:41,265 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.39 vs. limit=15.0 2023-09-29 23:22:42,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:22:42,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 23:22:44,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:22:47,001 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 23:22:48,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:22:48,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:22:51,637 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 23:22:53,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 23:22:53,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:22:59,895 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 23:23:01,433 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:23:01,900 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=522600.0, ans=0.125 2023-09-29 23:23:03,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:23:04,836 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 23:23:04,999 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:23:06,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 23:23:06,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:23:06,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:23:06,781 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=522666.6666666667, ans=0.2 2023-09-29 23:23:08,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:23:08,884 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=522666.6666666667, ans=0.125 2023-09-29 23:23:10,549 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=522666.6666666667, ans=0.0 2023-09-29 23:23:11,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:23:11,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 23:23:11,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:23:13,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 23:23:13,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:23:15,018 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 23:23:21,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:23:24,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 23:23:25,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:23:27,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:23:28,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:23:29,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:23:34,183 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:23:37,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 23:23:37,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 23:23:38,533 INFO [train.py:1039] (2/4) Epoch 15, batch 4050, loss[loss=0.1939, simple_loss=0.2658, pruned_loss=0.06096, over 23808.00 frames. ], tot_loss[loss=0.1874, simple_loss=0.2609, pruned_loss=0.05697, over 4711941.32 frames. ], batch size: 179, lr: 6.84e-03, grad_scale: 16.0 2023-09-29 23:23:38,749 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 23:23:38,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:23:40,329 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:23:42,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 23:23:42,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:23:47,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:23:50,859 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:23:50,938 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 23:23:54,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:23:55,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:23:56,135 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=522866.6666666667, ans=0.0 2023-09-29 23:24:00,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:24:00,952 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=522866.6666666667, ans=0.125 2023-09-29 23:24:02,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 23:24:07,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 23:24:08,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 23:24:10,087 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 23:24:11,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:24:13,367 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=522933.3333333333, ans=0.0 2023-09-29 23:24:19,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 23:24:19,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:24:24,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:24:27,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:24:27,942 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:24:27,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:24:33,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:24:36,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 23:24:36,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 23:24:38,549 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:24:40,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 23:24:43,665 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.01 vs. limit=6.0 2023-09-29 23:24:44,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:24:45,048 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=523066.6666666667, ans=0.125 2023-09-29 23:24:50,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 23:24:53,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:24:53,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 23:24:55,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 23:24:55,404 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 23:24:55,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:24:57,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:24:59,438 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:24:59,473 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:25:00,920 INFO [train.py:1039] (2/4) Epoch 15, batch 4100, loss[loss=0.1906, simple_loss=0.2612, pruned_loss=0.06004, over 24462.00 frames. ], tot_loss[loss=0.1886, simple_loss=0.2617, pruned_loss=0.05779, over 4712163.18 frames. ], batch size: 58, lr: 6.84e-03, grad_scale: 8.0 2023-09-29 23:25:06,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 23:25:08,422 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 23:25:10,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 23:25:12,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 23:25:12,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:25:13,501 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:25:13,554 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:25:13,587 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 23:25:15,150 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 23:25:16,441 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.960e+02 2.243e+02 2.866e+02 4.978e+02, threshold=4.486e+02, percent-clipped=4.0 2023-09-29 23:25:18,137 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:25:18,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:25:18,315 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:25:19,006 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.83 vs. limit=15.0 2023-09-29 23:25:19,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:25:22,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 23:25:24,434 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:25:24,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:25:24,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 23:25:24,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:25:25,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:25:25,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:25:25,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:25:26,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 23:25:28,049 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=523200.0, ans=0.125 2023-09-29 23:25:29,886 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:25:31,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 23:25:33,502 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:25:34,169 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.52 vs. limit=15.0 2023-09-29 23:25:36,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:25:36,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 23:25:38,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:25:40,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:25:40,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:25:41,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 23:25:43,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 23:25:43,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:25:47,203 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 23:25:48,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:25:48,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 23:25:51,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:25:55,498 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=523333.3333333333, ans=0.125 2023-09-29 23:25:58,204 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:26:01,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:26:02,835 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:26:13,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:26:13,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:26:15,051 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=523400.0, ans=0.125 2023-09-29 23:26:18,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:26:19,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:26:23,829 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 23:26:24,567 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.38 vs. limit=15.0 2023-09-29 23:26:25,106 INFO [train.py:1039] (2/4) Epoch 15, batch 4150, loss[loss=0.1862, simple_loss=0.272, pruned_loss=0.05022, over 24454.00 frames. ], tot_loss[loss=0.1894, simple_loss=0.2623, pruned_loss=0.05831, over 4710270.65 frames. ], batch size: 69, lr: 6.83e-03, grad_scale: 8.0 2023-09-29 23:26:26,626 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 23:26:26,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:26:26,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:26:28,607 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=523466.6666666667, ans=0.125 2023-09-29 23:26:29,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 23:26:29,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:26:31,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 23:26:31,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 23:26:32,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 23:26:34,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:26:41,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:26:41,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:26:45,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:26:47,233 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:26:47,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 23:26:50,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 23:26:51,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:26:53,268 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 23:26:55,778 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=523600.0, ans=0.05 2023-09-29 23:26:57,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:27:01,698 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:27:03,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 23:27:05,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 23:27:05,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:27:07,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 23:27:07,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:27:07,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:27:08,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:27:08,873 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=523600.0, ans=0.125 2023-09-29 23:27:10,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:27:13,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 23:27:16,394 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 23:27:16,539 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=523666.6666666667, ans=0.125 2023-09-29 23:27:19,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:27:19,350 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 23:27:19,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:27:20,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 23:27:23,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 23:27:26,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:27:27,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:27:29,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 23:27:29,191 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:27:29,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 23:27:30,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 23:27:32,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 23:27:34,346 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:27:34,352 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 23:27:34,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 23:27:34,509 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 23:27:35,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:27:36,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 23:27:36,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:27:38,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:27:39,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 23:27:39,236 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=523733.3333333333, ans=0.125 2023-09-29 23:27:40,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 23:27:46,935 INFO [train.py:1039] (2/4) Epoch 15, batch 4200, loss[loss=0.1991, simple_loss=0.258, pruned_loss=0.07013, over 23798.00 frames. ], tot_loss[loss=0.1886, simple_loss=0.2609, pruned_loss=0.05811, over 4698732.12 frames. ], batch size: 212, lr: 6.83e-03, grad_scale: 8.0 2023-09-29 23:27:47,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:27:49,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 23:27:50,846 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=523800.0, ans=0.125 2023-09-29 23:27:52,059 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 23:27:52,365 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=523800.0, ans=0.0 2023-09-29 23:27:53,945 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=523800.0, ans=0.2 2023-09-29 23:27:55,005 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:27:55,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:27:56,625 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:27:56,628 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:27:58,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 23:28:01,693 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.531e+02 1.915e+02 2.061e+02 2.276e+02 4.406e+02, threshold=4.122e+02, percent-clipped=0.0 2023-09-29 23:28:02,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 23:28:02,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:28:05,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:28:06,920 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=523866.6666666667, ans=0.1 2023-09-29 23:28:08,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:28:09,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 23:28:09,919 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=523866.6666666667, ans=0.125 2023-09-29 23:28:11,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:28:11,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:28:13,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 23:28:13,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:28:14,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:28:15,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:28:15,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 23:28:16,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:28:21,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 23:28:21,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:28:26,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 23:28:28,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:28:29,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:28:31,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:28:33,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:28:33,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 23:28:33,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:28:35,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:28:35,625 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=524000.0, ans=0.2 2023-09-29 23:28:40,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 23:28:43,118 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:28:49,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:28:52,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 23:28:55,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:28:59,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 23:29:01,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:29:02,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 23:29:07,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 23:29:09,499 INFO [train.py:1039] (2/4) Epoch 15, batch 4250, loss[loss=0.1887, simple_loss=0.2543, pruned_loss=0.06155, over 23820.00 frames. ], tot_loss[loss=0.1879, simple_loss=0.2604, pruned_loss=0.05767, over 4706191.66 frames. ], batch size: 164, lr: 6.83e-03, grad_scale: 8.0 2023-09-29 23:29:12,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:29:12,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 23:29:14,353 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 23:29:15,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:29:20,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 23:29:20,858 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 23:29:21,054 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=524133.3333333333, ans=0.125 2023-09-29 23:29:22,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:29:24,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:29:27,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:29:34,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:29:34,289 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:29:34,559 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:29:34,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:29:36,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:29:37,539 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:29:39,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:29:42,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:29:42,715 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=524266.6666666667, ans=0.125 2023-09-29 23:29:44,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:29:45,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 23:29:48,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 23:29:48,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:29:49,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:29:49,080 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:29:50,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:29:50,589 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:29:52,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:29:55,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 23:29:57,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:30:02,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:30:04,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:30:06,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 23:30:06,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:30:06,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 23:30:07,678 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:30:09,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 23:30:10,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:30:10,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:30:12,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 23:30:14,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 23:30:15,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 23:30:17,861 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 23:30:20,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:30:23,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:30:25,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:30:26,330 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.37 vs. limit=6.0 2023-09-29 23:30:27,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:30:28,714 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:30:30,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:30:30,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:30:30,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 23:30:31,789 INFO [train.py:1039] (2/4) Epoch 15, batch 4300, loss[loss=0.1825, simple_loss=0.2563, pruned_loss=0.05435, over 23458.00 frames. ], tot_loss[loss=0.1867, simple_loss=0.2592, pruned_loss=0.05708, over 4700477.51 frames. ], batch size: 93, lr: 6.83e-03, grad_scale: 8.0 2023-09-29 23:30:32,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:30:36,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:30:38,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:30:41,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:30:47,064 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.553e+02 1.892e+02 2.099e+02 2.369e+02 3.970e+02, threshold=4.198e+02, percent-clipped=0.0 2023-09-29 23:30:50,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:30:50,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 23:30:51,774 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:30:53,879 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 23:30:55,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 23:30:55,323 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 23:30:58,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 23:30:59,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 23:31:05,108 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 23:31:05,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:31:05,175 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 23:31:08,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 23:31:10,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:31:14,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:31:14,326 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:31:14,468 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:31:15,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:31:16,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:31:16,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 23:31:19,019 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 23:31:20,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:31:22,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:31:22,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 23:31:22,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:31:22,512 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=524666.6666666666, ans=0.125 2023-09-29 23:31:23,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:31:23,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 23:31:23,781 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 23:31:23,875 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 23:31:26,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:31:26,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 23:31:26,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 23:31:29,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=524666.6666666666, ans=0.125 2023-09-29 23:31:31,152 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.44 vs. limit=10.0 2023-09-29 23:31:32,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:31:33,649 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 23:31:35,677 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:31:37,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:31:37,254 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:31:39,064 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 23:31:39,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 23:31:39,174 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:31:41,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:31:42,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:31:42,850 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:31:43,081 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=524733.3333333334, ans=0.125 2023-09-29 23:31:45,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:31:47,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:31:48,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:31:48,312 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=524733.3333333334, ans=0.125 2023-09-29 23:31:49,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:31:55,392 INFO [train.py:1039] (2/4) Epoch 15, batch 4350, loss[loss=0.1714, simple_loss=0.2491, pruned_loss=0.04685, over 24314.00 frames. ], tot_loss[loss=0.1874, simple_loss=0.2602, pruned_loss=0.05729, over 4706278.87 frames. ], batch size: 56, lr: 6.83e-03, grad_scale: 8.0 2023-09-29 23:31:55,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 23:31:55,594 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 23:32:03,034 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:32:06,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:32:09,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:32:09,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:32:13,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:32:18,744 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:32:21,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:32:21,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:32:24,604 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.89 vs. limit=15.0 2023-09-29 23:32:25,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 23:32:27,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:32:28,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 23:32:35,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 23:32:36,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:32:37,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:32:39,499 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=524933.3333333334, ans=0.125 2023-09-29 23:32:40,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:32:44,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 23:32:45,058 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.92 vs. limit=12.0 2023-09-29 23:32:49,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:32:52,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 23:32:57,649 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 23:32:59,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:32:59,708 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 23:32:59,823 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 23:33:01,335 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 23:33:01,344 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:33:01,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:33:02,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:33:02,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:33:04,438 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:33:05,874 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:33:07,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 23:33:08,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:33:08,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:33:08,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:33:10,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 23:33:10,738 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=525066.6666666666, ans=0.1 2023-09-29 23:33:11,960 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 23:33:11,967 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 23:33:11,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 23:33:15,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:33:16,460 INFO [train.py:1039] (2/4) Epoch 15, batch 4400, loss[loss=0.1667, simple_loss=0.251, pruned_loss=0.04122, over 24673.00 frames. ], tot_loss[loss=0.1881, simple_loss=0.2606, pruned_loss=0.0578, over 4694610.72 frames. ], batch size: 65, lr: 6.82e-03, grad_scale: 16.0 2023-09-29 23:33:16,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:33:16,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:33:17,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:33:19,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 23:33:21,169 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 23:33:21,182 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:33:26,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:33:27,013 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:33:29,109 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:33:30,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 23:33:30,817 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 23:33:32,666 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.997e+02 2.183e+02 2.511e+02 3.955e+02, threshold=4.366e+02, percent-clipped=0.0 2023-09-29 23:33:32,825 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 23:33:32,865 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 23:33:34,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 23:33:34,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:33:35,917 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 23:33:38,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:33:40,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:33:40,554 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 23:33:43,690 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:33:43,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 23:33:43,774 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 23:33:46,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 23:33:46,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 23:33:47,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 23:33:48,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:33:48,658 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:33:50,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:33:50,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:33:53,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 23:33:53,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 23:33:53,359 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=525266.6666666666, ans=0.125 2023-09-29 23:33:54,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:33:56,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:33:56,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:33:58,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:33:58,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:33:58,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 23:34:00,480 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 23:34:02,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:34:11,318 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:34:12,979 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 23:34:16,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:34:17,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:34:20,688 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:34:20,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 23:34:20,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:34:20,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 23:34:20,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:34:21,623 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.62 vs. limit=22.5 2023-09-29 23:34:22,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 23:34:26,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 23:34:31,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 23:34:32,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 23:34:32,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:34:32,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 23:34:32,598 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:34:37,190 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:34:38,852 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=525400.0, ans=0.125 2023-09-29 23:34:40,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 23:34:41,760 INFO [train.py:1039] (2/4) Epoch 15, batch 4450, loss[loss=0.26, simple_loss=0.309, pruned_loss=0.1055, over 19568.00 frames. ], tot_loss[loss=0.1888, simple_loss=0.2613, pruned_loss=0.05814, over 4689548.05 frames. ], batch size: 388, lr: 6.82e-03, grad_scale: 16.0 2023-09-29 23:34:43,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:34:45,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:34:46,388 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:34:48,278 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=525466.6666666666, ans=0.2 2023-09-29 23:34:52,478 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:34:52,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:34:55,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:34:58,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:35:01,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:35:01,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:35:02,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 23:35:02,771 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:35:02,897 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:35:04,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:35:04,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 23:35:08,436 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 23:35:08,780 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=525533.3333333334, ans=0.1 2023-09-29 23:35:13,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:35:14,957 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:35:16,502 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:35:18,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:35:18,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:35:22,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 23:35:25,524 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 23:35:25,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 23:35:25,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:35:28,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:35:30,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 23:35:33,901 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten.whitening_limit, batch_count=525666.6666666666, ans=22.5 2023-09-29 23:35:34,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 23:35:38,371 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:35:39,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 23:35:39,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:35:39,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:35:39,940 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:35:39,951 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:35:42,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:35:46,401 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 23:35:46,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 23:35:47,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 23:35:49,511 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:35:51,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:35:53,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:35:53,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 23:35:58,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 23:36:01,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 23:36:02,717 INFO [train.py:1039] (2/4) Epoch 15, batch 4500, loss[loss=0.1586, simple_loss=0.2285, pruned_loss=0.04431, over 24299.00 frames. ], tot_loss[loss=0.1883, simple_loss=0.2611, pruned_loss=0.05775, over 4704473.52 frames. ], batch size: 56, lr: 6.82e-03, grad_scale: 16.0 2023-09-29 23:36:02,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:36:07,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:36:08,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 23:36:08,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 23:36:11,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:36:15,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:36:15,848 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:36:17,836 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.500e+02 1.874e+02 2.112e+02 2.381e+02 3.744e+02, threshold=4.224e+02, percent-clipped=0.0 2023-09-29 23:36:17,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 23:36:18,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:36:19,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:36:19,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:36:23,497 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=525866.6666666666, ans=0.0 2023-09-29 23:36:29,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=525866.6666666666, ans=0.0 2023-09-29 23:36:30,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:36:32,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:36:35,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:36:35,457 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:36:37,068 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:36:43,207 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 23:36:47,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:36:51,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 23:36:56,223 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:36:56,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 23:36:57,768 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:36:57,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:36:59,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:37:00,859 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:37:02,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:37:02,622 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 23:37:02,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 23:37:02,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:37:05,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:37:05,938 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:37:09,099 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:37:11,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 23:37:12,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:37:13,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 23:37:16,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 23:37:16,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 23:37:18,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 23:37:23,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 23:37:24,399 INFO [train.py:1039] (2/4) Epoch 15, batch 4550, loss[loss=0.1835, simple_loss=0.2659, pruned_loss=0.05057, over 24055.00 frames. ], tot_loss[loss=0.1874, simple_loss=0.26, pruned_loss=0.05742, over 4686036.76 frames. ], batch size: 80, lr: 6.82e-03, grad_scale: 16.0 2023-09-29 23:37:25,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:37:29,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:37:29,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:37:31,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:37:36,372 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=526133.3333333334, ans=0.125 2023-09-29 23:37:37,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:37:39,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:37:40,612 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:37:40,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:37:40,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:37:43,681 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:37:43,751 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:37:46,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:37:50,099 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 23:37:50,186 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 23:37:51,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:37:53,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 23:37:58,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 23:37:59,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:38:04,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 23:38:04,250 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=526266.6666666666, ans=0.035 2023-09-29 23:38:04,324 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=526266.6666666666, ans=0.09899494936611666 2023-09-29 23:38:05,866 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=526266.6666666666, ans=0.0 2023-09-29 23:38:07,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 23:38:08,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:38:08,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:38:10,161 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:38:10,452 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=526266.6666666666, ans=0.125 2023-09-29 23:38:11,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 23:38:15,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:38:16,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:38:18,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:38:18,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:38:21,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 23:38:21,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 23:38:21,357 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:38:22,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 23:38:25,786 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 23:38:25,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:38:27,386 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:38:27,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:38:29,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:38:29,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:38:32,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 23:38:32,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 23:38:34,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:38:34,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 23:38:36,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 23:38:36,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:38:36,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 23:38:36,669 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=526400.0, ans=0.125 2023-09-29 23:38:39,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:38:39,438 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:38:41,334 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=526400.0, ans=0.1 2023-09-29 23:38:42,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:38:42,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:38:42,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 23:38:45,637 INFO [train.py:1039] (2/4) Epoch 15, batch 4600, loss[loss=0.1846, simple_loss=0.2637, pruned_loss=0.05278, over 24650.00 frames. ], tot_loss[loss=0.1861, simple_loss=0.259, pruned_loss=0.05662, over 4705454.40 frames. ], batch size: 65, lr: 6.81e-03, grad_scale: 16.0 2023-09-29 23:38:45,721 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:38:46,641 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.51 vs. limit=15.0 2023-09-29 23:38:47,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 23:38:50,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:38:51,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:38:52,227 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=526466.6666666666, ans=0.125 2023-09-29 23:38:54,980 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:38:55,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:38:55,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:38:56,751 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 23:38:58,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:38:59,841 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.461e+02 1.932e+02 2.167e+02 2.436e+02 3.970e+02, threshold=4.334e+02, percent-clipped=0.0 2023-09-29 23:39:02,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:39:04,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:39:04,593 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.27 vs. limit=10.0 2023-09-29 23:39:06,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:39:12,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 23:39:13,257 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=526533.3333333334, ans=0.0 2023-09-29 23:39:14,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:39:16,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:39:16,396 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=526533.3333333334, ans=0.0 2023-09-29 23:39:20,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:39:20,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:39:23,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 23:39:23,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 23:39:25,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:39:31,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:39:33,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 23:39:33,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:39:36,716 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=526666.6666666666, ans=0.125 2023-09-29 23:39:36,808 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=526666.6666666666, ans=0.1 2023-09-29 23:39:38,152 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 23:39:39,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 23:39:42,124 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=526666.6666666666, ans=0.125 2023-09-29 23:39:44,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:39:45,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:39:48,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:39:48,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 23:39:49,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:39:49,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 23:39:49,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:39:51,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:39:51,363 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:39:52,842 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:39:52,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:39:54,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 23:39:54,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 23:39:54,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 23:39:54,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:39:56,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:39:57,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:39:57,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:40:04,163 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.42 vs. limit=22.5 2023-09-29 23:40:08,225 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.00 vs. limit=6.0 2023-09-29 23:40:09,564 INFO [train.py:1039] (2/4) Epoch 15, batch 4650, loss[loss=0.1965, simple_loss=0.2695, pruned_loss=0.06179, over 23566.00 frames. ], tot_loss[loss=0.1853, simple_loss=0.2582, pruned_loss=0.05617, over 4705090.77 frames. ], batch size: 120, lr: 6.81e-03, grad_scale: 16.0 2023-09-29 23:40:11,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:40:12,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:40:15,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:40:15,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:40:15,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:40:15,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:40:16,767 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:40:19,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 23:40:23,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:40:24,697 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 23:40:24,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:40:26,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 23:40:26,200 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:40:27,036 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.02 vs. limit=6.0 2023-09-29 23:40:27,694 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 23:40:27,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 23:40:27,733 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:40:29,196 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:40:32,221 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 23:40:33,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:40:33,762 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 23:40:36,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:40:36,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 23:40:40,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:40:40,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:40:42,142 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 23:40:45,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:40:46,883 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=526933.3333333334, ans=0.125 2023-09-29 23:40:47,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:40:52,345 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:40:57,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:41:00,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:41:00,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:41:00,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:41:03,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 23:41:04,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 23:41:06,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 23:41:06,183 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 23:41:07,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:41:15,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:41:15,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:41:15,976 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 23:41:16,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:41:17,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:41:17,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:41:21,824 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:41:24,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:41:24,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:41:25,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:41:28,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:41:28,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:41:29,134 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=527066.6666666666, ans=0.125 2023-09-29 23:41:30,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 23:41:30,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 23:41:30,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 23:41:32,027 INFO [train.py:1039] (2/4) Epoch 15, batch 4700, loss[loss=0.1955, simple_loss=0.266, pruned_loss=0.06244, over 23356.00 frames. ], tot_loss[loss=0.1865, simple_loss=0.2598, pruned_loss=0.05658, over 4711796.19 frames. ], batch size: 119, lr: 6.81e-03, grad_scale: 8.0 2023-09-29 23:41:33,620 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 23:41:41,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:41:42,927 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:41:42,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:41:44,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:41:44,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 23:41:48,314 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.497e+02 1.872e+02 2.063e+02 2.349e+02 3.516e+02, threshold=4.126e+02, percent-clipped=0.0 2023-09-29 23:41:50,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 23:41:51,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 23:41:53,066 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:41:55,216 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:41:55,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:41:57,037 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=527200.0, ans=0.125 2023-09-29 23:41:58,464 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=527200.0, ans=0.125 2023-09-29 23:41:59,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:42:03,121 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=527266.6666666666, ans=0.07 2023-09-29 23:42:06,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 23:42:06,471 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=527266.6666666666, ans=0.0 2023-09-29 23:42:07,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 23:42:09,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:42:15,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 23:42:15,334 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:42:18,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:42:18,839 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=527333.3333333334, ans=0.2 2023-09-29 23:42:22,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 23:42:23,725 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:42:29,071 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:42:30,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 23:42:32,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:42:32,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:42:34,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:42:36,106 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:42:36,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 23:42:37,686 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 23:42:38,057 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=527400.0, ans=0.2 2023-09-29 23:42:39,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:42:40,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:42:40,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:42:40,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 23:42:41,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:42:44,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 23:42:47,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:42:48,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:42:53,231 INFO [train.py:1039] (2/4) Epoch 15, batch 4750, loss[loss=0.1923, simple_loss=0.259, pruned_loss=0.06278, over 23750.00 frames. ], tot_loss[loss=0.1878, simple_loss=0.2609, pruned_loss=0.05738, over 4708050.24 frames. ], batch size: 179, lr: 6.81e-03, grad_scale: 8.0 2023-09-29 23:42:53,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:42:53,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:42:57,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 23:42:58,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:43:03,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 23:43:06,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:43:06,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:43:08,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:43:14,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 23:43:15,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=527533.3333333334, ans=0.0 2023-09-29 23:43:18,080 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:43:21,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 23:43:22,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:43:24,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:43:24,417 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:43:25,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:43:25,900 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 23:43:25,904 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 23:43:29,430 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.84 vs. limit=6.0 2023-09-29 23:43:33,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 23:43:36,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:43:38,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:43:40,400 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=527666.6666666666, ans=0.0 2023-09-29 23:43:42,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 23:43:42,127 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 23:43:42,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:43:45,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:43:48,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:43:50,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 23:43:50,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 23:43:50,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:43:52,070 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:43:52,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:43:52,393 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=527666.6666666666, ans=0.1 2023-09-29 23:43:53,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 23:43:53,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 23:43:56,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 23:43:59,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:44:02,636 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:44:02,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 23:44:02,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:44:04,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:44:05,928 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 23:44:07,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:44:07,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 23:44:12,047 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:44:12,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 23:44:14,008 INFO [train.py:1039] (2/4) Epoch 15, batch 4800, loss[loss=0.1618, simple_loss=0.2441, pruned_loss=0.0398, over 24448.00 frames. ], tot_loss[loss=0.1874, simple_loss=0.2611, pruned_loss=0.05687, over 4715441.02 frames. ], batch size: 63, lr: 6.81e-03, grad_scale: 16.0 2023-09-29 23:44:14,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 23:44:14,508 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=527800.0, ans=0.0 2023-09-29 23:44:15,760 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 23:44:18,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 23:44:19,389 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:44:19,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 23:44:27,015 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:44:27,090 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:44:30,099 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.597e+02 1.901e+02 2.248e+02 2.763e+02 5.522e+02, threshold=4.496e+02, percent-clipped=3.0 2023-09-29 23:44:31,819 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:44:33,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:44:33,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:44:34,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 23:44:34,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:44:34,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:44:37,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:44:41,123 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:44:42,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:44:44,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:44:44,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:44:44,358 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 23:44:44,380 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:44:45,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:44:49,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:44:53,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:44:55,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:44:55,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 23:44:56,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 23:44:58,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:45:01,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 23:45:01,408 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 23:45:03,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:45:03,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:45:03,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:45:03,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:45:03,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:45:06,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:45:06,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:45:09,443 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:45:11,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:45:12,710 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:45:17,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 23:45:17,219 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:45:17,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:45:19,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 23:45:19,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:45:24,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:45:24,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:45:24,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:45:26,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:45:26,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:45:28,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 23:45:32,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:45:32,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:45:32,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:45:36,019 INFO [train.py:1039] (2/4) Epoch 15, batch 4850, loss[loss=0.1821, simple_loss=0.2511, pruned_loss=0.05654, over 23662.00 frames. ], tot_loss[loss=0.189, simple_loss=0.2619, pruned_loss=0.05806, over 4700796.24 frames. ], batch size: 149, lr: 6.80e-03, grad_scale: 16.0 2023-09-29 23:45:36,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 23:45:37,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 23:45:37,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:45:37,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:45:40,683 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:45:40,685 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:45:41,044 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=528133.3333333334, ans=0.1 2023-09-29 23:45:42,350 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:45:47,452 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=528133.3333333334, ans=0.125 2023-09-29 23:45:48,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 23:45:52,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:45:57,375 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:45:58,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 23:45:58,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:46:02,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:46:02,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:46:04,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:46:04,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 23:46:09,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:46:11,087 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:46:11,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 23:46:11,356 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=528266.6666666666, ans=0.0 2023-09-29 23:46:12,544 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 23:46:12,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 23:46:15,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:46:15,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:46:18,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:46:19,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 23:46:19,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 23:46:20,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 23:46:26,160 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=528333.3333333334, ans=0.025 2023-09-29 23:46:28,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:46:29,078 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 23:46:31,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:46:31,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:46:32,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 23:46:35,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 23:46:35,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:46:35,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 23:46:35,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:46:37,335 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:46:38,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 23:46:42,688 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=528400.0, ans=0.2 2023-09-29 23:46:48,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:46:54,749 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:46:55,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:46:58,231 INFO [train.py:1039] (2/4) Epoch 15, batch 4900, loss[loss=0.181, simple_loss=0.2566, pruned_loss=0.05274, over 24092.00 frames. ], tot_loss[loss=0.1888, simple_loss=0.2611, pruned_loss=0.05825, over 4683629.54 frames. ], batch size: 86, lr: 6.80e-03, grad_scale: 16.0 2023-09-29 23:46:58,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 23:46:58,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:47:04,563 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=528466.6666666666, ans=0.0 2023-09-29 23:47:04,637 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=528466.6666666666, ans=0.0 2023-09-29 23:47:05,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:47:07,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:47:07,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:47:08,167 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.71 vs. limit=15.0 2023-09-29 23:47:11,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 23:47:14,802 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.640e+02 1.963e+02 2.199e+02 2.506e+02 3.437e+02, threshold=4.398e+02, percent-clipped=0.0 2023-09-29 23:47:16,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 23:47:21,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 23:47:23,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 23:47:23,269 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 23:47:23,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:47:23,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:47:23,403 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:47:23,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:47:24,830 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 23:47:26,608 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=528533.3333333334, ans=0.0 2023-09-29 23:47:27,726 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.90 vs. limit=15.0 2023-09-29 23:47:28,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 23:47:28,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 23:47:30,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 23:47:30,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 23:47:33,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:47:33,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:47:33,486 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:47:33,501 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 23:47:33,687 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=528600.0, ans=0.125 2023-09-29 23:47:37,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:47:39,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:47:39,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 23:47:39,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 23:47:44,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 23:47:47,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:47:47,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:47:47,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:47:47,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:47:47,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 23:47:47,702 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=528666.6666666666, ans=0.125 2023-09-29 23:47:49,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:47:49,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 23:47:51,182 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:47:54,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 23:47:55,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:47:59,474 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=528666.6666666666, ans=0.0 2023-09-29 23:48:00,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 23:48:02,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:48:02,237 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 23:48:03,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 23:48:10,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:48:12,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:48:14,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 23:48:14,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 23:48:14,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:48:17,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:48:20,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:48:20,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:48:20,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:48:20,464 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 23:48:21,882 INFO [train.py:1039] (2/4) Epoch 15, batch 4950, loss[loss=0.1898, simple_loss=0.2679, pruned_loss=0.05578, over 24411.00 frames. ], tot_loss[loss=0.1869, simple_loss=0.259, pruned_loss=0.05739, over 4689712.98 frames. ], batch size: 77, lr: 6.80e-03, grad_scale: 16.0 2023-09-29 23:48:22,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 23:48:22,261 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=528800.0, ans=0.125 2023-09-29 23:48:25,667 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:48:25,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 23:48:28,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 23:48:28,821 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 23:48:30,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 23:48:30,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 23:48:31,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:48:31,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:48:31,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:48:31,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:48:35,592 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:48:35,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:48:37,288 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:48:38,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:48:40,507 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=528866.6666666666, ans=0.09899494936611666 2023-09-29 23:48:40,778 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.24 vs. limit=6.0 2023-09-29 23:48:41,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:48:41,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:48:42,106 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=528866.6666666666, ans=0.125 2023-09-29 23:48:46,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 23:48:52,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:48:53,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:48:55,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:48:55,376 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:48:56,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:48:57,111 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 23:48:58,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 23:49:01,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:49:03,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:49:03,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:49:03,790 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=528933.3333333334, ans=0.1 2023-09-29 23:49:05,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:49:05,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:49:07,127 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:49:08,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:49:11,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 23:49:14,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:49:14,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:49:16,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:49:16,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 23:49:18,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:49:19,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 23:49:25,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:49:26,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:49:26,605 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:49:26,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:49:28,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:49:28,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:49:31,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:49:31,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:49:31,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:49:32,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 23:49:34,804 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:49:38,973 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=529066.6666666666, ans=0.125 2023-09-29 23:49:41,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 23:49:41,359 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 23:49:43,232 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=529133.3333333334, ans=0.125 2023-09-29 23:49:44,491 INFO [train.py:1039] (2/4) Epoch 15, batch 5000, loss[loss=0.2002, simple_loss=0.265, pruned_loss=0.06764, over 23795.00 frames. ], tot_loss[loss=0.1866, simple_loss=0.2584, pruned_loss=0.05742, over 4684466.24 frames. ], batch size: 164, lr: 6.80e-03, grad_scale: 16.0 2023-09-29 23:49:48,325 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:49:48,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 23:49:49,851 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 23:49:51,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 23:49:51,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:49:55,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 23:49:55,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:49:55,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 23:49:57,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 23:49:58,008 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:49:58,112 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:49:59,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 23:49:59,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:50:01,015 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.874e+02 2.133e+02 2.483e+02 3.662e+02, threshold=4.266e+02, percent-clipped=0.0 2023-09-29 23:50:01,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:50:02,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 23:50:02,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 23:50:04,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:50:04,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 23:50:04,321 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 23:50:04,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:50:05,788 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 23:50:05,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 23:50:05,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 23:50:07,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 23:50:07,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:50:07,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:50:09,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 23:50:09,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 23:50:09,284 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=529200.0, ans=0.125 2023-09-29 23:50:13,205 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:50:14,662 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:50:16,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 23:50:17,811 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 23:50:17,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:50:21,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:50:26,017 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 23:50:29,169 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:50:29,453 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=529266.6666666666, ans=0.0 2023-09-29 23:50:31,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:50:31,420 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:50:33,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 23:50:33,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:50:33,255 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:50:34,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:50:35,458 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=529333.3333333334, ans=0.125 2023-09-29 23:50:36,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 23:50:38,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:50:42,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:50:42,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:50:44,874 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.06 vs. limit=22.5 2023-09-29 23:50:46,481 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 23:50:47,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 23:50:54,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:51:02,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:51:03,736 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:51:03,748 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:51:03,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:51:03,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 23:51:03,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 23:51:04,161 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=529400.0, ans=0.0 2023-09-29 23:51:05,300 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:51:07,337 INFO [train.py:1039] (2/4) Epoch 15, batch 5050, loss[loss=0.1812, simple_loss=0.2642, pruned_loss=0.04912, over 24016.00 frames. ], tot_loss[loss=0.1868, simple_loss=0.2595, pruned_loss=0.05707, over 4700573.49 frames. ], batch size: 80, lr: 6.80e-03, grad_scale: 8.0 2023-09-29 23:51:11,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:51:11,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 23:51:12,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:51:16,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:51:17,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:51:17,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 23:51:19,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:51:19,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:51:22,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 23:51:24,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 23:51:24,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 23:51:30,279 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=529533.3333333334, ans=0.125 2023-09-29 23:51:34,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 23:51:34,588 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 23:51:36,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:51:36,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 23:51:36,183 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:51:36,364 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=529533.3333333334, ans=0.125 2023-09-29 23:51:37,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:51:37,877 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=529600.0, ans=0.1 2023-09-29 23:51:39,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:51:39,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:51:39,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 23:51:40,710 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 23:51:40,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:51:44,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:51:47,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:51:47,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 23:51:50,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:51:52,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 23:51:56,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:51:56,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:51:56,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:51:56,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:51:59,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:51:59,892 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=529666.6666666666, ans=0.125 2023-09-29 23:52:00,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:52:02,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:52:02,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:52:02,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:52:02,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 23:52:04,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:52:06,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:52:10,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:52:10,801 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 23:52:10,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 23:52:10,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:52:12,503 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:52:12,541 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 23:52:14,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:52:14,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 23:52:14,270 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:52:14,479 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 23:52:19,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:52:19,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:52:19,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 23:52:21,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 23:52:21,547 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=529733.3333333334, ans=0.125 2023-09-29 23:52:24,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:52:24,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:52:24,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:52:27,228 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=4.95 vs. limit=12.0 2023-09-29 23:52:27,848 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 23:52:29,665 INFO [train.py:1039] (2/4) Epoch 15, batch 5100, loss[loss=0.1739, simple_loss=0.2473, pruned_loss=0.05019, over 23733.00 frames. ], tot_loss[loss=0.1883, simple_loss=0.2607, pruned_loss=0.05797, over 4698076.60 frames. ], batch size: 149, lr: 6.79e-03, grad_scale: 8.0 2023-09-29 23:52:32,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:52:35,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 23:52:35,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 23:52:38,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:52:39,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:52:41,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:52:42,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 23:52:42,879 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 23:52:47,397 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.887e+02 2.076e+02 2.309e+02 3.546e+02, threshold=4.153e+02, percent-clipped=0.0 2023-09-29 23:52:47,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:52:47,683 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:52:52,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:52:54,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 23:52:55,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:52:57,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:52:57,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 23:53:01,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:53:01,393 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:53:02,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 23:53:06,354 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 23:53:07,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:53:07,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 23:53:07,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 23:53:13,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:53:23,701 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:53:26,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 23:53:27,511 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 23:53:27,526 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 23:53:29,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 23:53:29,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:53:32,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 23:53:34,265 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=530066.6666666666, ans=0.125 2023-09-29 23:53:35,635 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 23:53:37,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 23:53:39,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:53:40,878 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 23:53:42,382 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 23:53:42,451 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 23:53:49,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:53:49,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:53:49,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:53:50,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:53:50,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 23:53:51,172 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=530133.3333333334, ans=0.125 2023-09-29 23:53:52,306 INFO [train.py:1039] (2/4) Epoch 15, batch 5150, loss[loss=0.1901, simple_loss=0.262, pruned_loss=0.05911, over 24639.00 frames. ], tot_loss[loss=0.1892, simple_loss=0.2615, pruned_loss=0.05843, over 4706981.41 frames. ], batch size: 60, lr: 6.79e-03, grad_scale: 8.0 2023-09-29 23:53:52,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:53:53,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 23:53:53,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 23:53:55,302 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 23:53:55,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:53:55,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 23:53:58,250 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:54:00,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 23:54:01,908 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:54:03,417 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:54:06,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 23:54:06,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 23:54:08,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:54:09,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:54:11,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 23:54:11,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:54:11,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:54:12,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:54:12,038 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:54:13,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 23:54:14,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:54:16,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:54:18,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 23:54:19,785 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 23:54:19,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:54:27,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:54:27,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 23:54:31,163 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:54:37,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:54:37,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:54:39,255 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=530266.6666666666, ans=0.09899494936611666 2023-09-29 23:54:43,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:54:43,459 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:54:45,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 23:54:50,206 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:54:51,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:54:51,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:54:52,803 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.78 vs. limit=15.0 2023-09-29 23:54:55,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:54:55,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=530333.3333333334, ans=0.1 2023-09-29 23:54:56,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:54:58,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 23:55:05,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:55:07,177 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 23:55:10,151 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:55:10,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:55:11,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 23:55:11,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 23:55:11,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:55:13,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:55:14,502 INFO [train.py:1039] (2/4) Epoch 15, batch 5200, loss[loss=0.194, simple_loss=0.2513, pruned_loss=0.06838, over 23597.00 frames. ], tot_loss[loss=0.1891, simple_loss=0.2618, pruned_loss=0.05821, over 4714881.96 frames. ], batch size: 256, lr: 6.79e-03, grad_scale: 16.0 2023-09-29 23:55:16,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:55:17,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 23:55:19,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:55:24,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 23:55:24,827 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=530466.6666666666, ans=0.0 2023-09-29 23:55:26,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:55:27,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:55:31,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:55:31,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:55:31,481 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 23:55:32,506 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.846e+02 2.066e+02 2.366e+02 4.637e+02, threshold=4.132e+02, percent-clipped=1.0 2023-09-29 23:55:32,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:55:34,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 23:55:35,589 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten.whitening_limit, batch_count=530533.3333333334, ans=15.0 2023-09-29 23:55:36,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 23:55:36,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:55:36,845 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=530533.3333333334, ans=0.125 2023-09-29 23:55:38,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 23:55:42,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 23:55:44,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:55:44,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 23:55:44,848 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 23:55:47,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 23:55:49,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:55:49,503 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 23:55:49,515 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:55:51,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:55:51,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:55:52,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 23:55:53,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:55:55,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:55:56,455 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=530600.0, ans=0.1 2023-09-29 23:55:59,009 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 23:55:59,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 23:55:59,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 23:56:07,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 23:56:09,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 23:56:14,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 23:56:14,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:56:15,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 23:56:15,967 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:56:16,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 23:56:16,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:56:17,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 23:56:21,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:56:22,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:56:25,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:56:26,003 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=530733.3333333334, ans=0.125 2023-09-29 23:56:27,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:56:27,037 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:56:27,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=530733.3333333334, ans=0.125 2023-09-29 23:56:30,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:56:32,396 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 23:56:33,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:56:33,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:56:35,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:56:35,691 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=530800.0, ans=0.125 2023-09-29 23:56:36,891 INFO [train.py:1039] (2/4) Epoch 15, batch 5250, loss[loss=0.1847, simple_loss=0.2388, pruned_loss=0.06533, over 22667.00 frames. ], tot_loss[loss=0.1887, simple_loss=0.2613, pruned_loss=0.05801, over 4709796.84 frames. ], batch size: 322, lr: 6.79e-03, grad_scale: 16.0 2023-09-29 23:56:36,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 23:56:37,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 23:56:37,375 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=530800.0, ans=0.125 2023-09-29 23:56:42,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:56:44,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:56:44,567 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=530800.0, ans=0.125 2023-09-29 23:56:45,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:56:47,377 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:56:52,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:56:52,296 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=530866.6666666666, ans=0.125 2023-09-29 23:56:55,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:56:57,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:56:58,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 23:56:59,129 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=530866.6666666666, ans=0.125 2023-09-29 23:57:01,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 23:57:01,822 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:57:02,117 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 23:57:03,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:57:17,814 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=530933.3333333334, ans=0.125 2023-09-29 23:57:19,270 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=530933.3333333334, ans=0.125 2023-09-29 23:57:32,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=531000.0, ans=0.0 2023-09-29 23:57:52,127 INFO [train.py:1039] (2/4) Epoch 15, batch 5300, loss[loss=0.1554, simple_loss=0.2274, pruned_loss=0.04165, over 24303.00 frames. ], tot_loss[loss=0.1871, simple_loss=0.2598, pruned_loss=0.05717, over 4718505.47 frames. ], batch size: 56, lr: 6.78e-03, grad_scale: 16.0 2023-09-29 23:57:53,864 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=531133.3333333334, ans=0.0 2023-09-29 23:58:05,891 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=531200.0, ans=0.2 2023-09-29 23:58:06,814 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.927e+02 2.161e+02 2.637e+02 4.366e+02, threshold=4.323e+02, percent-clipped=1.0 2023-09-29 23:58:06,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:58:07,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 23:58:07,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 23:58:07,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:58:07,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:58:07,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:58:08,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:58:08,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:58:08,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:58:08,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:58:08,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 23:58:08,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:58:08,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 23:58:09,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 23:58:09,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 23:58:09,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 23:58:09,294 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 23:58:09,422 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 23:58:09,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:58:10,078 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:58:10,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:58:10,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:58:10,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:58:11,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:58:11,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:58:11,433 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:58:11,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:58:11,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:58:11,618 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:58:11,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:58:11,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:58:12,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 23:58:12,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:58:13,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:58:13,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 23:58:13,185 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 23:58:13,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 23:58:13,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:58:13,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 23:58:13,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 23:58:13,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 23:58:14,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:58:14,588 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:58:14,736 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 23:58:14,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 23:58:14,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 23:58:15,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:58:15,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 23:58:15,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 23:58:15,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 23:58:16,130 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 23:58:24,968 INFO [train.py:1039] (2/4) Epoch 16, batch 0, loss[loss=0.2084, simple_loss=0.2902, pruned_loss=0.06334, over 24462.00 frames. ], tot_loss[loss=0.2084, simple_loss=0.2902, pruned_loss=0.06334, over 24462.00 frames. ], batch size: 77, lr: 6.57e-03, grad_scale: 32.0 2023-09-29 23:58:24,969 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-29 23:58:41,233 INFO [train.py:1071] (2/4) Epoch 16, validation: loss=0.3148, simple_loss=0.2815, pruned_loss=0.174, over 1125622.00 frames. 2023-09-29 23:58:41,234 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21076MB 2023-09-29 23:58:41,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 23:58:42,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:58:44,531 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:58:45,478 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.03 vs. limit=12.0 2023-09-29 23:58:51,701 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:58:53,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:58:53,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:58:54,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 23:58:57,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 23:58:57,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:58:59,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:59:02,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:59:02,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:59:02,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:59:04,200 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:59:04,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 23:59:08,009 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:59:15,685 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:59:15,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:59:17,884 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 23:59:18,105 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=531346.6666666666, ans=0.2 2023-09-29 23:59:20,168 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.34 vs. limit=15.0 2023-09-29 23:59:21,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 23:59:21,093 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:59:22,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:59:24,252 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.26 vs. limit=15.0 2023-09-29 23:59:27,783 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:59:32,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:59:36,219 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.83 vs. limit=15.0 2023-09-29 23:59:38,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 23:59:41,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 23:59:42,174 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=531413.3333333334, ans=0.2 2023-09-29 23:59:44,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:59:44,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:59:44,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:59:45,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:59:47,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 23:59:50,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:59:50,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:59:55,308 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:59:59,033 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-30 00:00:01,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:00:03,436 INFO [train.py:1039] (2/4) Epoch 16, batch 50, loss[loss=0.1744, simple_loss=0.2564, pruned_loss=0.04618, over 24072.00 frames. ], tot_loss[loss=0.1884, simple_loss=0.2639, pruned_loss=0.05644, over 1048460.96 frames. ], batch size: 80, lr: 6.56e-03, grad_scale: 16.0 2023-09-30 00:00:03,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:00:06,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:00:06,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-30 00:00:08,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 00:00:09,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:00:11,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:00:11,421 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=531546.6666666666, ans=0.125 2023-09-30 00:00:14,160 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:00:15,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:00:18,269 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.83 vs. limit=6.0 2023-09-30 00:00:19,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-30 00:00:19,663 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:00:25,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:00:27,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-30 00:00:29,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-30 00:00:31,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:00:34,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:00:34,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:00:34,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:00:36,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-30 00:00:37,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 00:00:37,788 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:00:40,091 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.76 vs. limit=22.5 2023-09-30 00:00:44,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:00:47,060 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-30 00:00:47,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 00:00:48,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-30 00:00:50,356 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 00:00:51,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 00:00:51,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-30 00:00:53,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:00:55,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-30 00:01:03,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:01:03,338 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:01:04,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:01:08,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:01:08,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:01:13,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-30 00:01:13,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-30 00:01:14,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:01:14,898 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:01:16,755 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=531813.3333333334, ans=0.125 2023-09-30 00:01:17,332 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.44 vs. limit=15.0 2023-09-30 00:01:17,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:01:17,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:01:19,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-30 00:01:19,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-30 00:01:21,473 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.74 vs. limit=12.0 2023-09-30 00:01:22,236 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-30 00:01:23,579 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.948e+02 2.203e+02 2.562e+02 3.872e+02, threshold=4.407e+02, percent-clipped=0.0 2023-09-30 00:01:23,621 INFO [train.py:1039] (2/4) Epoch 16, batch 100, loss[loss=0.1854, simple_loss=0.2636, pruned_loss=0.05358, over 23980.00 frames. ], tot_loss[loss=0.1868, simple_loss=0.2625, pruned_loss=0.05551, over 1870884.06 frames. ], batch size: 86, lr: 6.56e-03, grad_scale: 16.0 2023-09-30 00:01:23,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:01:23,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-30 00:01:25,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-30 00:01:25,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-30 00:01:25,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:01:27,099 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-30 00:01:28,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-30 00:01:28,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:01:33,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:01:35,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:01:38,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:01:38,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-30 00:01:38,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:01:43,797 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:01:43,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:01:43,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-30 00:01:43,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:01:45,253 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:01:45,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-30 00:01:49,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-30 00:01:49,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:01:49,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:01:49,643 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:01:52,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-30 00:01:54,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:01:56,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:01:57,573 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-30 00:01:59,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 00:02:02,135 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-30 00:02:02,159 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-30 00:02:02,483 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=532013.3333333334, ans=0.1 2023-09-30 00:02:03,837 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:02:03,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:02:08,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:02:10,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:02:12,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:02:18,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:02:18,348 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-30 00:02:18,585 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=532080.0, ans=0.1 2023-09-30 00:02:22,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-30 00:02:25,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:02:27,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:02:28,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:02:33,216 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:02:34,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:02:36,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:02:40,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:02:40,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:02:41,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:02:41,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:02:41,897 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:02:43,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-30 00:02:43,341 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-30 00:02:43,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:02:43,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:02:43,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:02:43,626 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:02:43,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 00:02:45,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 00:02:45,156 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-30 00:02:45,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:02:46,474 INFO [train.py:1039] (2/4) Epoch 16, batch 150, loss[loss=0.1648, simple_loss=0.2526, pruned_loss=0.03851, over 24546.00 frames. ], tot_loss[loss=0.1894, simple_loss=0.2635, pruned_loss=0.05767, over 2504700.59 frames. ], batch size: 71, lr: 6.56e-03, grad_scale: 8.0 2023-09-30 00:02:47,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:02:48,100 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:02:48,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:02:49,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:02:51,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:02:54,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:02:54,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:02:56,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:03:00,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:03:00,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:03:03,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:03:04,456 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:03:07,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-30 00:03:07,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-30 00:03:07,639 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-30 00:03:10,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:03:10,703 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 00:03:10,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:03:12,954 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:03:12,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:03:13,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:03:13,120 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:03:14,679 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-30 00:03:17,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:03:22,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:03:25,207 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.79 vs. limit=15.0 2023-09-30 00:03:26,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 00:03:28,029 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-30 00:03:31,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:03:31,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:03:31,715 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:03:33,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:03:36,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:03:37,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:03:38,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:03:39,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-30 00:03:44,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:03:46,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:03:46,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:03:46,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-30 00:03:49,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:03:52,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 00:03:55,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-30 00:03:57,407 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=532480.0, ans=0.1 2023-09-30 00:03:57,728 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=5.03 vs. limit=15.0 2023-09-30 00:03:58,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:04:00,601 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:04:03,563 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:04:04,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-30 00:04:05,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:04:05,032 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-30 00:04:08,511 INFO [train.py:1039] (2/4) Epoch 16, batch 200, loss[loss=0.1892, simple_loss=0.2533, pruned_loss=0.06258, over 23685.00 frames. ], tot_loss[loss=0.1914, simple_loss=0.2651, pruned_loss=0.05885, over 2994808.88 frames. ], batch size: 135, lr: 6.56e-03, grad_scale: 8.0 2023-09-30 00:04:10,023 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.410e+02 1.995e+02 2.387e+02 2.784e+02 4.621e+02, threshold=4.773e+02, percent-clipped=1.0 2023-09-30 00:04:10,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:04:12,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:04:13,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:04:17,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-30 00:04:18,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:04:18,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:04:22,123 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-30 00:04:23,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-30 00:04:23,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:04:25,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:04:28,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:04:28,708 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:04:30,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:04:33,592 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=532613.3333333334, ans=0.0 2023-09-30 00:04:43,087 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.73 vs. limit=22.5 2023-09-30 00:04:49,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:04:49,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:04:50,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:04:52,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:04:52,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 00:04:52,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 00:04:55,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:04:57,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 00:04:59,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:04:59,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:05:00,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-30 00:05:02,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 00:05:02,140 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:05:06,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:05:09,242 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.30 vs. limit=12.0 2023-09-30 00:05:12,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:05:14,037 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=532813.3333333334, ans=0.2 2023-09-30 00:05:18,903 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:05:18,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:05:27,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:05:27,622 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=532813.3333333334, ans=0.0 2023-09-30 00:05:30,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-30 00:05:32,401 INFO [train.py:1039] (2/4) Epoch 16, batch 250, loss[loss=0.1815, simple_loss=0.2438, pruned_loss=0.05965, over 23293.00 frames. ], tot_loss[loss=0.191, simple_loss=0.2643, pruned_loss=0.05884, over 3370370.19 frames. ], batch size: 285, lr: 6.56e-03, grad_scale: 8.0 2023-09-30 00:05:32,491 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:05:32,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:05:32,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:05:33,993 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:05:34,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-30 00:05:34,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:05:34,298 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-30 00:05:37,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:05:38,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:05:40,443 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:05:40,742 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=532880.0, ans=0.0 2023-09-30 00:05:41,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:05:44,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:05:45,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:05:47,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:05:50,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:05:52,003 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=532946.6666666666, ans=0.1 2023-09-30 00:06:00,766 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=532946.6666666666, ans=0.5 2023-09-30 00:06:03,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:06:05,345 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:06:05,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:06:12,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-30 00:06:12,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-30 00:06:13,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:06:15,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:06:15,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 00:06:15,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:06:16,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:06:19,852 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:06:21,635 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.12 vs. limit=8.0 2023-09-30 00:06:22,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-30 00:06:22,304 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer_ff2.min_abs, batch_count=533080.0, ans=0.1 2023-09-30 00:06:23,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:06:24,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:06:25,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:06:25,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:06:25,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:06:27,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 00:06:27,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:06:30,371 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:06:31,946 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:06:32,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:06:32,228 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=533080.0, ans=0.125 2023-09-30 00:06:38,506 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:06:40,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:06:44,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:06:47,514 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=533146.6666666666, ans=0.0 2023-09-30 00:06:48,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:06:50,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:06:53,707 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=533213.3333333334, ans=0.0 2023-09-30 00:06:54,757 INFO [train.py:1039] (2/4) Epoch 16, batch 300, loss[loss=0.1844, simple_loss=0.2603, pruned_loss=0.05432, over 24297.00 frames. ], tot_loss[loss=0.1891, simple_loss=0.2626, pruned_loss=0.05778, over 3676308.88 frames. ], batch size: 61, lr: 6.55e-03, grad_scale: 8.0 2023-09-30 00:06:54,916 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-30 00:06:55,065 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:06:55,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:06:56,002 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=533213.3333333334, ans=0.09899494936611666 2023-09-30 00:06:56,944 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.879e+02 2.129e+02 2.398e+02 3.317e+02, threshold=4.257e+02, percent-clipped=0.0 2023-09-30 00:06:57,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-30 00:06:58,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-30 00:07:00,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:07:00,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-30 00:07:05,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:07:07,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:07:10,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:07:10,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-30 00:07:12,422 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:07:13,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 00:07:13,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-30 00:07:15,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:07:18,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-30 00:07:23,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 00:07:23,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-30 00:07:30,346 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-30 00:07:30,408 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:07:33,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:07:35,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:07:35,631 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-30 00:07:35,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:07:37,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:07:39,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:07:41,116 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:07:47,055 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-30 00:07:47,062 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-30 00:07:48,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:07:49,476 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.49 vs. limit=15.0 2023-09-30 00:07:52,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:07:53,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-30 00:07:54,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:07:59,255 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:08:01,315 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.22 vs. limit=15.0 2023-09-30 00:08:02,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:08:02,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-30 00:08:06,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:08:06,877 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 00:08:09,114 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:08:11,870 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:08:11,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-30 00:08:11,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 00:08:13,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:08:13,901 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=533480.0, ans=10.0 2023-09-30 00:08:15,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-30 00:08:17,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:08:18,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:08:20,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:08:20,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:08:20,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:08:22,261 INFO [train.py:1039] (2/4) Epoch 16, batch 350, loss[loss=0.1944, simple_loss=0.2834, pruned_loss=0.05275, over 24358.00 frames. ], tot_loss[loss=0.1874, simple_loss=0.2609, pruned_loss=0.05697, over 3910021.71 frames. ], batch size: 77, lr: 6.55e-03, grad_scale: 8.0 2023-09-30 00:08:26,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:08:26,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 00:08:28,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:08:34,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:08:37,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:08:38,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:08:39,444 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=533613.3333333334, ans=0.125 2023-09-30 00:08:40,658 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-30 00:08:42,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:08:42,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-30 00:08:46,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:08:46,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-30 00:08:48,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:08:51,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-30 00:08:52,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:08:53,260 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=533613.3333333334, ans=0.0 2023-09-30 00:08:55,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:08:57,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:08:58,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:08:58,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:09:00,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:09:00,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:09:01,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-30 00:09:03,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:09:03,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:09:09,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:09:09,491 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:09:10,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:09:10,977 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:09:17,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-30 00:09:17,061 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:09:22,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:09:22,299 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:09:22,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:09:24,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-30 00:09:26,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:09:27,616 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-30 00:09:27,764 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-30 00:09:27,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:09:31,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:09:31,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-30 00:09:34,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:09:37,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:09:39,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:09:40,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:09:40,727 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:09:42,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:09:45,702 INFO [train.py:1039] (2/4) Epoch 16, batch 400, loss[loss=0.1874, simple_loss=0.2523, pruned_loss=0.06123, over 19094.00 frames. ], tot_loss[loss=0.1868, simple_loss=0.2607, pruned_loss=0.05645, over 4087218.77 frames. ], batch size: 41, lr: 6.55e-03, grad_scale: 16.0 2023-09-30 00:09:45,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:09:47,263 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.486e+02 1.835e+02 1.997e+02 2.324e+02 4.354e+02, threshold=3.993e+02, percent-clipped=1.0 2023-09-30 00:09:47,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:09:48,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-30 00:09:48,913 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:09:49,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:09:51,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:09:52,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:09:56,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:09:57,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:09:59,577 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-30 00:09:59,849 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=533880.0, ans=0.125 2023-09-30 00:10:01,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-30 00:10:01,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:10:03,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-30 00:10:03,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:10:04,275 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=533946.6666666666, ans=0.5 2023-09-30 00:10:08,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:10:08,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:10:08,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-30 00:10:10,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:10:10,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:10:10,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:10:11,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:10:13,165 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-30 00:10:14,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-30 00:10:19,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:10:20,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:10:22,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-30 00:10:23,819 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-30 00:10:27,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:10:28,954 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=12.16 vs. limit=15.0 2023-09-30 00:10:31,210 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:10:38,771 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-30 00:10:44,053 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-30 00:10:44,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-30 00:10:45,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:10:46,225 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 00:10:47,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:10:47,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-30 00:10:51,005 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=534146.6666666666, ans=0.0 2023-09-30 00:10:52,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:10:52,541 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=534146.6666666666, ans=0.125 2023-09-30 00:10:55,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 00:10:56,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:10:58,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:10:58,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-30 00:10:58,839 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=534146.6666666666, ans=0.0 2023-09-30 00:11:00,114 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-30 00:11:01,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-30 00:11:03,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 00:11:05,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:11:06,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-30 00:11:08,080 INFO [train.py:1039] (2/4) Epoch 16, batch 450, loss[loss=0.1596, simple_loss=0.241, pruned_loss=0.03917, over 24547.00 frames. ], tot_loss[loss=0.187, simple_loss=0.2609, pruned_loss=0.05656, over 4216870.41 frames. ], batch size: 60, lr: 6.55e-03, grad_scale: 16.0 2023-09-30 00:11:09,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 00:11:09,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:11:09,856 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-30 00:11:12,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-30 00:11:12,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:11:13,298 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=534213.3333333334, ans=0.125 2023-09-30 00:11:14,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:11:14,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:11:14,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-30 00:11:14,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:11:16,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 00:11:19,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 00:11:27,971 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=534280.0, ans=0.125 2023-09-30 00:11:29,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:11:29,226 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:11:30,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-30 00:11:30,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-30 00:11:36,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:11:37,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:11:40,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:11:43,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:11:45,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:11:45,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-30 00:11:47,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-30 00:11:49,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-30 00:11:49,907 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:11:51,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:11:52,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:11:54,504 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-30 00:11:54,518 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-30 00:11:54,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:11:56,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:11:57,590 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-30 00:12:00,662 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-30 00:12:00,718 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:12:00,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-30 00:12:02,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-30 00:12:05,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:12:08,020 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-30 00:12:08,078 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 00:12:09,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-30 00:12:14,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:12:14,363 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=534480.0, ans=0.125 2023-09-30 00:12:15,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-30 00:12:15,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-30 00:12:17,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:12:24,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:12:26,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:12:27,642 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:12:27,679 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-30 00:12:30,617 INFO [train.py:1039] (2/4) Epoch 16, batch 500, loss[loss=0.1733, simple_loss=0.2463, pruned_loss=0.05015, over 14779.00 frames. ], tot_loss[loss=0.1869, simple_loss=0.2608, pruned_loss=0.05645, over 4328347.33 frames. ], batch size: 31, lr: 6.55e-03, grad_scale: 8.0 2023-09-30 00:12:32,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:12:32,623 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=534546.6666666666, ans=0.125 2023-09-30 00:12:33,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:12:35,022 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.824e+02 2.052e+02 2.354e+02 3.367e+02, threshold=4.104e+02, percent-clipped=0.0 2023-09-30 00:12:35,169 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:12:35,184 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-30 00:12:36,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-30 00:12:36,779 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:12:38,536 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=534546.6666666666, ans=0.2 2023-09-30 00:12:39,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 00:12:42,455 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.10 vs. limit=15.0 2023-09-30 00:12:44,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 00:12:44,621 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-30 00:12:48,259 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:12:48,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:12:49,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:13:03,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:13:03,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-30 00:13:03,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-30 00:13:03,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:13:04,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-30 00:13:04,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 00:13:08,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:13:08,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:13:08,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:13:08,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:13:09,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-30 00:13:12,796 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-30 00:13:14,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:13:14,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:13:17,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:13:17,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:13:17,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-30 00:13:20,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-30 00:13:24,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:13:25,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:13:31,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:13:35,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:13:41,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:13:44,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-30 00:13:44,213 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:13:44,234 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:13:47,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-30 00:13:47,502 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-30 00:13:48,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:13:49,608 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.49 vs. limit=15.0 2023-09-30 00:13:51,952 INFO [train.py:1039] (2/4) Epoch 16, batch 550, loss[loss=0.18, simple_loss=0.264, pruned_loss=0.04807, over 24566.00 frames. ], tot_loss[loss=0.1869, simple_loss=0.2611, pruned_loss=0.05635, over 4414134.13 frames. ], batch size: 71, lr: 6.54e-03, grad_scale: 8.0 2023-09-30 00:13:56,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-30 00:13:57,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-30 00:13:57,618 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:13:57,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-30 00:13:59,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:13:59,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:13:59,219 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:14:01,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:14:01,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:14:01,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:14:04,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:14:06,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-30 00:14:06,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:14:11,821 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:14:11,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:14:14,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:14:15,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:14:19,329 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=7.21 vs. limit=15.0 2023-09-30 00:14:20,196 WARNING [train.py:1197] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-30 00:14:20,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-30 00:14:23,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:14:28,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:14:30,191 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:14:31,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:14:34,897 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:14:34,906 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-30 00:14:35,034 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:14:36,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 00:14:39,104 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:14:39,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 00:14:39,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-30 00:14:40,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:14:42,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-30 00:14:43,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-30 00:14:45,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:14:45,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:14:45,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:14:45,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:14:48,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:14:51,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:14:54,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:14:55,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:14:55,245 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=535080.0, ans=0.125 2023-09-30 00:14:56,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 00:14:58,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 00:15:00,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:15:02,220 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-30 00:15:03,621 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:15:05,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-30 00:15:05,247 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-30 00:15:12,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-30 00:15:14,317 INFO [train.py:1039] (2/4) Epoch 16, batch 600, loss[loss=0.1866, simple_loss=0.2453, pruned_loss=0.06391, over 23370.00 frames. ], tot_loss[loss=0.1881, simple_loss=0.2614, pruned_loss=0.05743, over 4464575.46 frames. ], batch size: 285, lr: 6.54e-03, grad_scale: 8.0 2023-09-30 00:15:15,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-30 00:15:16,144 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:15:16,379 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=535213.3333333334, ans=0.1 2023-09-30 00:15:18,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 00:15:18,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:15:19,621 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 1.903e+02 2.137e+02 2.465e+02 5.407e+02, threshold=4.275e+02, percent-clipped=1.0 2023-09-30 00:15:26,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:15:28,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 00:15:29,832 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-30 00:15:31,390 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:15:34,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:15:37,067 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:15:38,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-30 00:15:40,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:15:43,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-30 00:15:49,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:15:49,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:15:50,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:15:55,158 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=535346.6666666666, ans=0.125 2023-09-30 00:15:56,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:15:56,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:15:57,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:16:01,728 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=535346.6666666666, ans=0.0 2023-09-30 00:16:04,495 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 00:16:08,348 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:16:08,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:16:08,369 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:16:16,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-30 00:16:21,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-30 00:16:21,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:16:26,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-30 00:16:26,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:16:29,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-30 00:16:29,743 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:16:29,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:16:37,771 INFO [train.py:1039] (2/4) Epoch 16, batch 650, loss[loss=0.1772, simple_loss=0.2662, pruned_loss=0.04413, over 24654.00 frames. ], tot_loss[loss=0.1873, simple_loss=0.2604, pruned_loss=0.05707, over 4531145.33 frames. ], batch size: 73, lr: 6.54e-03, grad_scale: 8.0 2023-09-30 00:16:37,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 00:16:41,632 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-30 00:16:43,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:16:44,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:16:46,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:16:46,615 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=535546.6666666666, ans=0.0 2023-09-30 00:16:49,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-30 00:16:49,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:16:54,563 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=535613.3333333334, ans=0.1 2023-09-30 00:16:55,185 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.40 vs. limit=15.0 2023-09-30 00:16:55,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:16:55,807 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:16:59,003 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:16:59,584 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.99 vs. limit=15.0 2023-09-30 00:17:01,408 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=535613.3333333334, ans=0.125 2023-09-30 00:17:02,875 WARNING [train.py:1197] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-30 00:17:04,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:17:05,886 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:17:09,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:17:09,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 00:17:12,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:17:15,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:17:16,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 00:17:16,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:17:18,424 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 00:17:20,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 00:17:20,112 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-30 00:17:20,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:17:20,146 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:17:24,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:17:26,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:17:26,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:17:27,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:17:28,042 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=535746.6666666666, ans=0.125 2023-09-30 00:17:29,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-30 00:17:29,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:17:29,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:17:31,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-30 00:17:31,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:17:32,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 00:17:34,789 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-30 00:17:34,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-30 00:17:34,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:17:35,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:17:36,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:17:36,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:17:39,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:17:41,782 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=535746.6666666666, ans=0.125 2023-09-30 00:17:47,512 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:17:47,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:17:49,746 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:17:53,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:17:53,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 00:17:54,783 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:17:59,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 00:17:59,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:17:59,721 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:17:59,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:18:01,167 INFO [train.py:1039] (2/4) Epoch 16, batch 700, loss[loss=0.1745, simple_loss=0.2687, pruned_loss=0.04016, over 24665.00 frames. ], tot_loss[loss=0.1864, simple_loss=0.2601, pruned_loss=0.05634, over 4578369.98 frames. ], batch size: 68, lr: 6.54e-03, grad_scale: 8.0 2023-09-30 00:18:05,523 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.452e+02 1.867e+02 2.174e+02 2.485e+02 3.899e+02, threshold=4.348e+02, percent-clipped=0.0 2023-09-30 00:18:05,765 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-30 00:18:07,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-30 00:18:10,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-30 00:18:11,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:18:12,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:18:14,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-30 00:18:19,130 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:18:22,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:18:24,568 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=535946.6666666666, ans=0.0 2023-09-30 00:18:25,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:18:26,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-30 00:18:27,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:18:29,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:18:31,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 00:18:31,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:18:32,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-30 00:18:37,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-30 00:18:40,764 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=536013.3333333334, ans=0.0 2023-09-30 00:18:41,930 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-30 00:18:41,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:18:44,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:18:49,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:18:49,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-30 00:18:54,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:18:56,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 00:18:56,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-30 00:19:01,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:19:01,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:19:04,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:19:10,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:19:10,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-30 00:19:14,090 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.37 vs. limit=22.5 2023-09-30 00:19:15,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-30 00:19:15,126 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-30 00:19:18,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:19:20,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:19:22,025 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:19:24,061 INFO [train.py:1039] (2/4) Epoch 16, batch 750, loss[loss=0.1923, simple_loss=0.2543, pruned_loss=0.06513, over 22837.00 frames. ], tot_loss[loss=0.1857, simple_loss=0.2594, pruned_loss=0.05594, over 4608276.56 frames. ], batch size: 322, lr: 6.54e-03, grad_scale: 8.0 2023-09-30 00:19:24,272 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:19:24,281 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-30 00:19:28,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-30 00:19:28,866 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-30 00:19:28,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-30 00:19:31,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-30 00:19:31,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-30 00:19:31,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:19:32,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-30 00:19:33,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:19:34,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:19:36,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:19:39,377 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:19:39,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-30 00:19:39,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:19:41,062 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:19:42,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 00:19:44,292 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=536280.0, ans=0.125 2023-09-30 00:19:45,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:19:47,200 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=536280.0, ans=0.2 2023-09-30 00:19:48,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:19:48,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:19:48,503 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-30 00:19:50,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-30 00:19:52,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:19:53,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:19:55,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-30 00:19:56,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-30 00:19:56,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:19:59,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-30 00:19:59,304 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-30 00:20:00,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-30 00:20:00,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:20:02,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 00:20:05,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:20:11,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:20:12,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:20:12,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 00:20:13,295 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=536413.3333333334, ans=0.125 2023-09-30 00:20:14,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:20:17,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:20:17,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-30 00:20:17,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:20:19,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-30 00:20:20,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:20:20,948 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=536413.3333333334, ans=0.125 2023-09-30 00:20:23,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:20:23,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-30 00:20:25,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:20:26,407 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.73 vs. limit=15.0 2023-09-30 00:20:30,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:20:32,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:20:32,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:20:34,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 00:20:38,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-30 00:20:38,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:20:40,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:20:41,741 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:20:43,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:20:45,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:20:46,621 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-30 00:20:47,959 INFO [train.py:1039] (2/4) Epoch 16, batch 800, loss[loss=0.1835, simple_loss=0.2619, pruned_loss=0.05253, over 23390.00 frames. ], tot_loss[loss=0.1864, simple_loss=0.2603, pruned_loss=0.05623, over 4631067.84 frames. ], batch size: 93, lr: 6.53e-03, grad_scale: 16.0 2023-09-30 00:20:52,608 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.643e+02 1.946e+02 2.133e+02 2.496e+02 4.467e+02, threshold=4.266e+02, percent-clipped=1.0 2023-09-30 00:20:54,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:20:54,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:20:55,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:20:55,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:20:57,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:20:58,973 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:20:59,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:21:04,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:21:04,411 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 00:21:05,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 00:21:09,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-30 00:21:10,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:21:13,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:21:13,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:21:13,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:21:13,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-30 00:21:13,963 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:21:15,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-30 00:21:19,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:21:22,289 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:21:25,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:21:26,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:21:26,918 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=536680.0, ans=0.125 2023-09-30 00:21:28,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:21:28,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:21:29,045 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=11.11 vs. limit=10.0 2023-09-30 00:21:32,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:21:32,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:21:34,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-30 00:21:37,337 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-30 00:21:37,383 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-30 00:21:37,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 00:21:37,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:21:39,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:21:39,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:21:45,771 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-30 00:21:45,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-30 00:21:47,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-30 00:21:51,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:21:53,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:21:58,354 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:21:59,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-30 00:21:59,837 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:22:02,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-30 00:22:08,778 INFO [train.py:1039] (2/4) Epoch 16, batch 850, loss[loss=0.1664, simple_loss=0.245, pruned_loss=0.04389, over 20196.00 frames. ], tot_loss[loss=0.1859, simple_loss=0.2602, pruned_loss=0.05578, over 4650655.22 frames. ], batch size: 44, lr: 6.53e-03, grad_scale: 16.0 2023-09-30 00:22:10,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 00:22:12,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:22:13,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-30 00:22:13,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:22:15,711 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:22:17,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-30 00:22:17,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:22:17,606 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=536880.0, ans=0.125 2023-09-30 00:22:18,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:22:20,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:22:20,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 00:22:24,045 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:22:25,523 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-30 00:22:25,591 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-30 00:22:25,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-30 00:22:27,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 00:22:28,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:22:30,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:22:30,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:22:30,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 00:22:35,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:22:35,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:22:35,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-30 00:22:38,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-30 00:22:43,427 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:22:44,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-30 00:22:46,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-30 00:22:50,515 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-30 00:22:53,513 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-30 00:22:53,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:22:53,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:22:53,562 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 00:22:55,842 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=537013.3333333334, ans=0.0 2023-09-30 00:22:57,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:22:59,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:22:59,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-30 00:23:00,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:23:02,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:23:03,867 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:23:04,495 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-30 00:23:05,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:23:06,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-30 00:23:07,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-30 00:23:11,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:23:11,993 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:23:13,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:23:13,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:23:14,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:23:15,243 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=537146.6666666666, ans=0.1 2023-09-30 00:23:17,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:23:19,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:23:21,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-30 00:23:22,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:23:22,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-30 00:23:29,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-30 00:23:31,173 INFO [train.py:1039] (2/4) Epoch 16, batch 900, loss[loss=0.1667, simple_loss=0.2451, pruned_loss=0.04419, over 24426.00 frames. ], tot_loss[loss=0.187, simple_loss=0.2615, pruned_loss=0.05624, over 4660412.79 frames. ], batch size: 63, lr: 6.53e-03, grad_scale: 16.0 2023-09-30 00:23:31,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:23:31,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-30 00:23:31,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:23:31,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:23:32,381 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.99 vs. limit=10.0 2023-09-30 00:23:33,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-30 00:23:36,672 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 2.050e+02 2.390e+02 2.977e+02 4.145e+02, threshold=4.781e+02, percent-clipped=0.0 2023-09-30 00:23:38,970 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:23:41,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:23:42,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-30 00:23:46,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:23:46,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-30 00:23:48,059 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-30 00:23:49,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:23:49,633 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:23:49,696 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 00:23:49,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:24:02,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:24:02,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:24:02,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 00:24:05,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:24:05,522 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=537346.6666666666, ans=0.125 2023-09-30 00:24:11,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-30 00:24:14,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:24:18,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-30 00:24:18,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:24:20,078 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-30 00:24:21,618 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-30 00:24:27,837 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-30 00:24:27,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:24:30,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 00:24:36,612 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:24:36,630 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:24:39,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-30 00:24:39,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:24:41,453 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-30 00:24:42,183 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.40 vs. limit=10.0 2023-09-30 00:24:43,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:24:43,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:24:45,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:24:45,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:24:50,556 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-30 00:24:50,607 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-30 00:24:52,113 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-30 00:24:52,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-30 00:24:54,996 INFO [train.py:1039] (2/4) Epoch 16, batch 950, loss[loss=0.1923, simple_loss=0.2575, pruned_loss=0.06356, over 23851.00 frames. ], tot_loss[loss=0.1871, simple_loss=0.2616, pruned_loss=0.05633, over 4682641.69 frames. ], batch size: 195, lr: 6.53e-03, grad_scale: 16.0 2023-09-30 00:24:55,148 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:24:59,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-30 00:24:59,695 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=537546.6666666666, ans=0.0 2023-09-30 00:25:04,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:25:08,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:25:08,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:25:09,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 00:25:12,937 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-30 00:25:16,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:25:16,125 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:25:17,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:25:17,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:25:17,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-30 00:25:19,247 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-30 00:25:21,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:25:21,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-30 00:25:22,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:25:27,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:25:27,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:25:27,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:25:29,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-30 00:25:30,813 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 00:25:31,265 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=537680.0, ans=0.125 2023-09-30 00:25:32,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:25:35,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:25:40,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:25:40,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:25:45,060 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-30 00:25:46,707 WARNING [train.py:1197] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 00:25:46,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 00:25:48,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:25:48,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:25:48,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:25:53,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-30 00:25:54,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:25:58,353 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:25:59,823 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:25:59,860 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-30 00:25:59,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:25:59,888 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:25:59,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-30 00:26:04,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:26:04,869 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=537813.3333333334, ans=0.0 2023-09-30 00:26:07,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:26:07,855 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=537813.3333333334, ans=0.2 2023-09-30 00:26:10,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:26:13,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-30 00:26:13,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-30 00:26:16,025 INFO [train.py:1039] (2/4) Epoch 16, batch 1000, loss[loss=0.1796, simple_loss=0.2493, pruned_loss=0.05492, over 23493.00 frames. ], tot_loss[loss=0.1867, simple_loss=0.2606, pruned_loss=0.05639, over 4692629.44 frames. ], batch size: 134, lr: 6.53e-03, grad_scale: 16.0 2023-09-30 00:26:16,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:26:20,689 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 2.021e+02 2.226e+02 2.513e+02 3.322e+02, threshold=4.453e+02, percent-clipped=0.0 2023-09-30 00:26:20,791 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-30 00:26:20,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:26:23,261 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=12.30 vs. limit=15.0 2023-09-30 00:26:24,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:26:24,516 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=537880.0, ans=0.125 2023-09-30 00:26:26,268 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-30 00:26:26,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-30 00:26:27,478 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.14 vs. limit=15.0 2023-09-30 00:26:31,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:26:31,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:26:32,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:26:34,697 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-30 00:26:40,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-30 00:26:42,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-30 00:26:43,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:26:46,638 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-30 00:26:46,814 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-30 00:26:46,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-30 00:26:49,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:26:50,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:26:57,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:26:59,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:27:01,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:27:01,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:27:01,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-30 00:27:01,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:27:02,902 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:27:02,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:27:04,449 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-30 00:27:06,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-30 00:27:07,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-30 00:27:09,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-30 00:27:10,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:27:17,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:27:17,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:27:19,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:27:20,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:27:21,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-30 00:27:23,283 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:27:23,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-30 00:27:24,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-30 00:27:26,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:27:26,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:27:27,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:27:31,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 00:27:33,027 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:27:33,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=538146.6666666666, ans=0.125 2023-09-30 00:27:36,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:27:37,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:27:39,056 INFO [train.py:1039] (2/4) Epoch 16, batch 1050, loss[loss=0.1772, simple_loss=0.2616, pruned_loss=0.04636, over 24401.00 frames. ], tot_loss[loss=0.1851, simple_loss=0.2584, pruned_loss=0.05595, over 4686852.78 frames. ], batch size: 77, lr: 6.52e-03, grad_scale: 16.0 2023-09-30 00:27:39,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 00:27:40,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:27:43,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:27:43,886 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=538213.3333333334, ans=0.1 2023-09-30 00:27:47,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 00:27:48,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-30 00:27:52,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:27:53,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:27:53,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:27:55,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:27:55,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-30 00:27:57,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:27:58,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-30 00:28:00,308 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:28:00,970 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.46 vs. limit=12.0 2023-09-30 00:28:01,125 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.93 vs. limit=15.0 2023-09-30 00:28:01,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-30 00:28:01,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-30 00:28:07,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:28:08,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-30 00:28:09,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:28:12,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-30 00:28:12,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-30 00:28:12,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:28:15,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-30 00:28:19,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-30 00:28:19,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:28:23,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 00:28:27,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-30 00:28:27,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:28:27,724 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=538413.3333333334, ans=0.125 2023-09-30 00:28:28,777 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:28:31,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:28:36,824 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-30 00:28:37,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-30 00:28:38,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-30 00:28:38,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:28:39,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:28:40,029 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-30 00:28:40,166 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_abs, batch_count=538413.3333333334, ans=0.5 2023-09-30 00:28:40,224 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=538413.3333333334, ans=0.125 2023-09-30 00:28:40,805 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.61 vs. limit=22.5 2023-09-30 00:28:41,774 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=538413.3333333334, ans=0.125 2023-09-30 00:28:41,849 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=538413.3333333334, ans=0.2 2023-09-30 00:28:42,391 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.41 vs. limit=10.0 2023-09-30 00:28:44,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:28:47,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:28:47,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:28:49,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:28:49,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:28:54,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:28:54,040 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-30 00:28:55,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:28:55,655 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-30 00:28:55,859 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=538480.0, ans=0.0 2023-09-30 00:28:57,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-30 00:28:57,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:29:00,851 INFO [train.py:1039] (2/4) Epoch 16, batch 1100, loss[loss=0.1716, simple_loss=0.2519, pruned_loss=0.04562, over 24609.00 frames. ], tot_loss[loss=0.1847, simple_loss=0.2575, pruned_loss=0.05599, over 4687908.31 frames. ], batch size: 60, lr: 6.52e-03, grad_scale: 16.0 2023-09-30 00:29:01,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:29:06,282 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.914e+02 2.113e+02 2.523e+02 4.579e+02, threshold=4.227e+02, percent-clipped=1.0 2023-09-30 00:29:07,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:29:08,131 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=538546.6666666666, ans=0.2 2023-09-30 00:29:14,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 00:29:15,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:29:15,965 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:29:17,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-30 00:29:19,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:29:21,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-30 00:29:21,521 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=538613.3333333334, ans=0.2 2023-09-30 00:29:24,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:29:25,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:29:25,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-30 00:29:27,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 00:29:27,556 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:29:29,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:29:30,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:29:33,301 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=538680.0, ans=0.1 2023-09-30 00:29:33,656 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.88 vs. limit=22.5 2023-09-30 00:29:34,491 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-30 00:29:35,147 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.19 vs. limit=6.0 2023-09-30 00:29:39,084 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:29:44,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-30 00:29:45,771 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-30 00:29:45,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:29:48,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:29:49,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-30 00:29:49,596 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:29:51,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-30 00:29:52,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:29:52,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:29:52,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:29:52,873 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:29:54,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-30 00:29:59,627 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:29:59,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-30 00:29:59,873 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=538746.6666666666, ans=0.125 2023-09-30 00:30:02,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:30:08,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 00:30:12,555 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-30 00:30:12,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-30 00:30:14,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:30:15,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:30:15,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:30:18,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-30 00:30:19,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:30:19,724 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:30:21,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-30 00:30:21,282 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:30:21,540 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=538813.3333333334, ans=0.125 2023-09-30 00:30:23,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-30 00:30:24,581 INFO [train.py:1039] (2/4) Epoch 16, batch 1150, loss[loss=0.2361, simple_loss=0.288, pruned_loss=0.09211, over 19048.00 frames. ], tot_loss[loss=0.1849, simple_loss=0.2586, pruned_loss=0.05565, over 4705330.77 frames. ], batch size: 388, lr: 6.52e-03, grad_scale: 16.0 2023-09-30 00:30:24,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:30:24,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:30:26,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-30 00:30:31,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:30:34,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:30:36,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:30:36,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:30:36,284 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-30 00:30:36,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:30:39,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-30 00:30:39,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:30:39,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 00:30:45,142 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=538946.6666666666, ans=0.125 2023-09-30 00:30:46,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-30 00:30:47,964 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:30:50,425 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=538946.6666666666, ans=0.0 2023-09-30 00:30:53,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:30:54,607 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:30:54,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-30 00:30:56,697 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-30 00:30:56,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:31:01,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-30 00:31:01,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:31:02,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:31:10,170 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=539013.3333333334, ans=0.1 2023-09-30 00:31:13,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:31:19,585 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:31:19,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-30 00:31:20,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:31:21,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:31:26,705 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=539080.0, ans=0.0 2023-09-30 00:31:27,785 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-30 00:31:29,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:31:37,326 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-30 00:31:41,203 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:31:42,694 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:31:42,739 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-30 00:31:44,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:31:47,051 INFO [train.py:1039] (2/4) Epoch 16, batch 1200, loss[loss=0.1779, simple_loss=0.2572, pruned_loss=0.04932, over 24681.00 frames. ], tot_loss[loss=0.1846, simple_loss=0.2584, pruned_loss=0.05546, over 4721229.03 frames. ], batch size: 65, lr: 6.52e-03, grad_scale: 16.0 2023-09-30 00:31:48,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:31:51,970 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=539213.3333333334, ans=0.2 2023-09-30 00:31:53,129 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.416e+02 1.828e+02 2.089e+02 2.357e+02 3.548e+02, threshold=4.177e+02, percent-clipped=0.0 2023-09-30 00:31:55,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:31:55,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-30 00:31:57,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:31:57,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:31:57,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=539213.3333333334, ans=0.125 2023-09-30 00:31:58,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:32:00,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:32:02,516 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 00:32:03,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:32:03,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:32:07,020 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-30 00:32:10,787 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-30 00:32:13,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 00:32:16,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:32:19,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:32:22,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:32:22,088 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-30 00:32:22,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:32:22,556 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=539346.6666666666, ans=0.2 2023-09-30 00:32:24,033 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=539346.6666666666, ans=0.0 2023-09-30 00:32:28,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-30 00:32:29,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:32:29,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-30 00:32:29,971 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:32:34,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-30 00:32:38,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-30 00:32:39,225 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=539413.3333333334, ans=0.125 2023-09-30 00:32:40,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:32:41,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:32:44,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:32:45,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-30 00:32:45,728 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=539413.3333333334, ans=0.07 2023-09-30 00:32:46,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:32:46,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:32:47,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:32:48,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-30 00:32:48,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 00:32:48,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-30 00:32:48,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 00:32:50,977 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:32:50,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:32:52,710 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=539480.0, ans=0.0 2023-09-30 00:32:55,505 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-30 00:32:57,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:33:01,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-30 00:33:04,932 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-30 00:33:09,250 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:33:10,643 INFO [train.py:1039] (2/4) Epoch 16, batch 1250, loss[loss=0.1681, simple_loss=0.2445, pruned_loss=0.04586, over 24567.00 frames. ], tot_loss[loss=0.1855, simple_loss=0.2593, pruned_loss=0.05589, over 4727819.56 frames. ], batch size: 60, lr: 6.52e-03, grad_scale: 16.0 2023-09-30 00:33:12,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-30 00:33:13,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:33:15,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:33:17,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-30 00:33:22,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:33:22,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:33:22,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-30 00:33:23,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:33:25,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 00:33:30,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 00:33:30,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:33:30,819 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.min_positive, batch_count=539613.3333333334, ans=0.05 2023-09-30 00:33:32,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:33:32,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:33:35,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-30 00:33:38,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 00:33:38,141 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-30 00:33:38,149 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:33:41,514 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:33:43,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:33:46,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:33:48,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-30 00:33:52,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-30 00:33:53,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:33:55,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:33:56,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-30 00:33:58,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:33:58,275 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-30 00:33:58,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:33:58,312 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:34:01,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:34:05,505 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=539746.6666666666, ans=0.05 2023-09-30 00:34:06,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:34:06,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:34:06,819 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=539746.6666666666, ans=0.1 2023-09-30 00:34:07,077 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.26 vs. limit=15.0 2023-09-30 00:34:08,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-30 00:34:08,149 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-30 00:34:08,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-30 00:34:11,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:34:13,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-30 00:34:13,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:34:15,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-30 00:34:15,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:34:18,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-30 00:34:18,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-30 00:34:19,027 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:34:19,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-30 00:34:20,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:34:23,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-30 00:34:27,031 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:34:28,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:34:29,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:34:31,648 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-30 00:34:33,157 INFO [train.py:1039] (2/4) Epoch 16, batch 1300, loss[loss=0.1858, simple_loss=0.2522, pruned_loss=0.05974, over 23344.00 frames. ], tot_loss[loss=0.1877, simple_loss=0.2613, pruned_loss=0.05708, over 4710649.20 frames. ], batch size: 285, lr: 6.51e-03, grad_scale: 16.0 2023-09-30 00:34:36,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:34:36,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-30 00:34:39,913 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.480e+02 1.902e+02 2.089e+02 2.370e+02 3.462e+02, threshold=4.179e+02, percent-clipped=0.0 2023-09-30 00:34:41,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:34:43,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-30 00:34:43,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:34:46,230 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.83 vs. limit=22.5 2023-09-30 00:34:46,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:34:48,270 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:34:48,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-30 00:34:53,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 00:34:53,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-30 00:34:55,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-30 00:35:00,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 00:35:03,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:35:05,400 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:35:05,783 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=540013.3333333334, ans=0.125 2023-09-30 00:35:06,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:35:07,828 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.52 vs. limit=15.0 2023-09-30 00:35:08,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:35:08,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 00:35:09,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-30 00:35:09,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-30 00:35:15,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.whiten.whitening_limit, batch_count=540013.3333333334, ans=12.0 2023-09-30 00:35:16,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:35:17,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 00:35:19,898 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-30 00:35:19,994 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 00:35:21,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:35:21,781 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=540080.0, ans=0.025 2023-09-30 00:35:23,756 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_abs, batch_count=540080.0, ans=0.5 2023-09-30 00:35:25,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:35:26,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-30 00:35:26,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:35:26,627 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-30 00:35:26,924 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=540080.0, ans=0.0 2023-09-30 00:35:28,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:35:33,363 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:35:33,378 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:35:36,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-30 00:35:38,048 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-30 00:35:39,529 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-30 00:35:42,797 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:35:45,883 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-30 00:35:46,354 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=540146.6666666666, ans=0.125 2023-09-30 00:35:47,402 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:35:54,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-30 00:35:56,307 INFO [train.py:1039] (2/4) Epoch 16, batch 1350, loss[loss=0.1876, simple_loss=0.2703, pruned_loss=0.0525, over 24024.00 frames. ], tot_loss[loss=0.1868, simple_loss=0.26, pruned_loss=0.05678, over 4712832.67 frames. ], batch size: 80, lr: 6.51e-03, grad_scale: 16.0 2023-09-30 00:35:59,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:36:02,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:36:06,067 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:36:07,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:36:08,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:36:09,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:36:12,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:36:13,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-30 00:36:15,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-30 00:36:16,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:36:18,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-30 00:36:19,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:36:20,321 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=540280.0, ans=0.0 2023-09-30 00:36:21,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:36:21,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-30 00:36:24,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-30 00:36:28,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-30 00:36:29,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:36:29,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-30 00:36:42,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:36:52,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:36:52,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:36:52,322 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-30 00:36:54,024 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=540413.3333333334, ans=0.125 2023-09-30 00:36:55,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:36:58,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-30 00:36:58,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-30 00:36:58,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:37:02,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:37:03,154 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=540480.0, ans=0.125 2023-09-30 00:37:04,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-30 00:37:07,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:37:13,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-30 00:37:14,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-30 00:37:17,854 INFO [train.py:1039] (2/4) Epoch 16, batch 1400, loss[loss=0.1821, simple_loss=0.2601, pruned_loss=0.05203, over 23385.00 frames. ], tot_loss[loss=0.1858, simple_loss=0.2589, pruned_loss=0.05634, over 4717820.42 frames. ], batch size: 93, lr: 6.51e-03, grad_scale: 16.0 2023-09-30 00:37:18,211 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=540546.6666666666, ans=0.125 2023-09-30 00:37:19,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-30 00:37:22,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:37:23,988 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.643e+02 1.842e+02 1.998e+02 2.370e+02 3.291e+02, threshold=3.996e+02, percent-clipped=0.0 2023-09-30 00:37:24,242 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:37:24,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:37:31,666 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-30 00:37:33,798 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-30 00:37:44,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 00:37:45,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:37:49,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:37:49,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-30 00:37:54,055 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:37:55,674 WARNING [train.py:1197] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 00:38:03,371 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:38:04,839 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:38:09,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-30 00:38:10,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-30 00:38:12,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:38:12,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:38:13,802 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:38:15,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:38:15,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:38:15,389 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:38:16,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-30 00:38:16,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:38:22,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:38:25,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:38:33,198 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-30 00:38:33,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 00:38:34,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:38:36,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 00:38:38,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:38:39,954 INFO [train.py:1039] (2/4) Epoch 16, batch 1450, loss[loss=0.1759, simple_loss=0.2603, pruned_loss=0.04582, over 24504.00 frames. ], tot_loss[loss=0.1845, simple_loss=0.2581, pruned_loss=0.05549, over 4727066.47 frames. ], batch size: 63, lr: 6.51e-03, grad_scale: 8.0 2023-09-30 00:38:40,159 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:38:40,347 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=540880.0, ans=0.04949747468305833 2023-09-30 00:38:43,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-30 00:38:45,389 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:38:45,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:38:45,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-30 00:38:45,762 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=540880.0, ans=0.2 2023-09-30 00:38:50,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:38:51,962 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 00:38:53,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:38:54,031 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-30 00:38:55,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 00:38:57,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-30 00:38:57,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:38:57,716 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.83 vs. limit=15.0 2023-09-30 00:38:58,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:38:58,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-30 00:39:00,126 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:39:00,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:39:01,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 00:39:01,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:39:03,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:39:03,541 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=540946.6666666666, ans=0.125 2023-09-30 00:39:04,770 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:39:07,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:39:11,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:39:11,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:39:13,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:39:13,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:39:13,374 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=541013.3333333334, ans=0.0 2023-09-30 00:39:14,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:39:14,782 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:39:16,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:39:16,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:39:21,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-30 00:39:24,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:39:25,076 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=541013.3333333334, ans=0.015 2023-09-30 00:39:28,502 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-30 00:39:30,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:39:30,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-30 00:39:31,689 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:39:33,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-30 00:39:37,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:39:39,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-30 00:39:39,463 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=541080.0, ans=0.125 2023-09-30 00:39:40,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-30 00:39:42,141 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:39:43,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:39:45,942 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:39:48,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-30 00:39:51,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-30 00:39:51,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-30 00:39:52,770 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:39:55,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 00:40:02,488 INFO [train.py:1039] (2/4) Epoch 16, batch 1500, loss[loss=0.1852, simple_loss=0.2733, pruned_loss=0.04855, over 24324.00 frames. ], tot_loss[loss=0.1848, simple_loss=0.2586, pruned_loss=0.0555, over 4731493.33 frames. ], batch size: 74, lr: 6.51e-03, grad_scale: 8.0 2023-09-30 00:40:06,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-30 00:40:07,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-30 00:40:07,828 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:40:09,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:40:10,587 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.398e+02 1.908e+02 2.053e+02 2.386e+02 4.299e+02, threshold=4.105e+02, percent-clipped=2.0 2023-09-30 00:40:10,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:40:10,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:40:12,326 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-30 00:40:13,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:40:13,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-30 00:40:13,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:40:14,525 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.26 vs. limit=15.0 2023-09-30 00:40:15,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:40:18,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:40:18,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:40:23,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:40:23,902 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-30 00:40:25,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-30 00:40:25,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:40:26,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:40:29,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-30 00:40:35,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-30 00:40:37,901 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:40:38,052 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=541346.6666666666, ans=0.125 2023-09-30 00:40:39,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-30 00:40:40,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-30 00:40:42,796 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=541346.6666666666, ans=0.0 2023-09-30 00:40:43,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:40:45,399 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:40:45,435 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:40:46,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-30 00:40:47,056 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:40:47,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:40:48,565 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-30 00:40:48,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:40:52,335 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=541413.3333333334, ans=0.1 2023-09-30 00:40:54,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:40:54,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-30 00:40:57,276 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=541413.3333333334, ans=0.0 2023-09-30 00:41:01,857 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:41:03,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 00:41:08,046 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-30 00:41:08,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:41:08,138 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-30 00:41:09,146 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.33 vs. limit=15.0 2023-09-30 00:41:10,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:41:11,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:41:13,792 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-30 00:41:15,306 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-30 00:41:18,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-30 00:41:19,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:41:22,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:41:22,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:41:23,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:41:24,352 INFO [train.py:1039] (2/4) Epoch 16, batch 1550, loss[loss=0.1698, simple_loss=0.2514, pruned_loss=0.04413, over 24526.00 frames. ], tot_loss[loss=0.1856, simple_loss=0.2595, pruned_loss=0.05586, over 4723032.78 frames. ], batch size: 63, lr: 6.50e-03, grad_scale: 8.0 2023-09-30 00:41:24,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:41:24,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:41:26,121 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-30 00:41:26,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-30 00:41:26,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:41:27,764 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-30 00:41:27,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-30 00:41:28,679 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.77 vs. limit=15.0 2023-09-30 00:41:30,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:41:32,868 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:41:34,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:41:34,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:41:35,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:41:35,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:41:39,569 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-30 00:41:39,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:41:41,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 00:41:41,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 00:41:44,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:41:44,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-30 00:41:44,930 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=541613.3333333334, ans=0.125 2023-09-30 00:41:46,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:41:46,248 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-30 00:41:48,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-30 00:41:48,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-30 00:41:49,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:41:49,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:41:54,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:41:54,658 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=541613.3333333334, ans=10.0 2023-09-30 00:41:55,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-30 00:41:55,942 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-30 00:42:06,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:42:10,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:42:10,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:42:10,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:42:12,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-30 00:42:17,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 00:42:18,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:42:22,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:42:25,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:42:25,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:42:25,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-30 00:42:27,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 00:42:28,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:42:28,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:42:30,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-30 00:42:30,355 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-30 00:42:30,490 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=541813.3333333334, ans=0.125 2023-09-30 00:42:33,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:42:38,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-30 00:42:42,022 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.68 vs. limit=22.5 2023-09-30 00:42:42,723 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=541813.3333333334, ans=0.0 2023-09-30 00:42:44,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:42:45,190 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.92 vs. limit=6.0 2023-09-30 00:42:45,943 INFO [train.py:1039] (2/4) Epoch 16, batch 1600, loss[loss=0.1829, simple_loss=0.2677, pruned_loss=0.04901, over 24662.00 frames. ], tot_loss[loss=0.1859, simple_loss=0.2597, pruned_loss=0.05602, over 4722086.49 frames. ], batch size: 73, lr: 6.50e-03, grad_scale: 16.0 2023-09-30 00:42:46,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:42:46,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-30 00:42:46,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 00:42:48,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:42:48,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:42:48,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:42:49,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:42:53,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:42:54,298 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 1.801e+02 1.974e+02 2.195e+02 3.172e+02, threshold=3.948e+02, percent-clipped=0.0 2023-09-30 00:42:54,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-30 00:42:56,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-30 00:42:59,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-30 00:43:02,436 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:43:04,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-30 00:43:04,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:43:07,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:43:11,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:43:13,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-30 00:43:16,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:43:17,046 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.47 vs. limit=12.0 2023-09-30 00:43:17,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-30 00:43:17,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:43:19,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-30 00:43:25,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-30 00:43:33,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:43:33,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-30 00:43:34,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:43:35,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:43:35,002 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:43:35,663 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.45 vs. limit=15.0 2023-09-30 00:43:38,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-30 00:43:41,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 00:43:42,850 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:43:44,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:43:44,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:43:44,533 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:43:47,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:43:48,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:43:50,811 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:43:51,166 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=542146.6666666666, ans=0.1 2023-09-30 00:43:57,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:43:59,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:43:59,945 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.86 vs. limit=15.0 2023-09-30 00:44:02,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-30 00:44:02,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:44:03,576 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-30 00:44:06,897 INFO [train.py:1039] (2/4) Epoch 16, batch 1650, loss[loss=0.1807, simple_loss=0.2525, pruned_loss=0.05441, over 23762.00 frames. ], tot_loss[loss=0.1857, simple_loss=0.2594, pruned_loss=0.05602, over 4729921.35 frames. ], batch size: 149, lr: 6.50e-03, grad_scale: 8.0 2023-09-30 00:44:10,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:44:11,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:44:11,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:44:11,716 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-30 00:44:11,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-30 00:44:11,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-30 00:44:11,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-30 00:44:14,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:44:15,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:44:15,649 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.12 vs. limit=15.0 2023-09-30 00:44:16,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:44:16,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-30 00:44:19,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:44:20,013 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=542213.3333333334, ans=0.125 2023-09-30 00:44:21,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-30 00:44:21,597 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=542280.0, ans=0.125 2023-09-30 00:44:22,859 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:44:24,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:44:24,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:44:24,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:44:27,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-30 00:44:27,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-30 00:44:32,425 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 00:44:34,068 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=542280.0, ans=0.1 2023-09-30 00:44:35,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-30 00:44:42,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-30 00:44:42,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:44:44,573 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.27 vs. limit=15.0 2023-09-30 00:44:45,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-30 00:44:46,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:44:49,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:44:51,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:44:51,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:44:52,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:44:52,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:44:56,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:44:56,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:44:58,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:44:58,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:45:00,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:45:01,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 00:45:01,269 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=542413.3333333334, ans=0.1 2023-09-30 00:45:05,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:45:07,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-30 00:45:07,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:45:09,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-30 00:45:11,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-30 00:45:11,063 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-30 00:45:11,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:45:12,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:45:12,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:45:12,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:45:12,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-30 00:45:14,344 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 00:45:17,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:45:18,807 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:45:18,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:45:21,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-30 00:45:26,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:45:26,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:45:26,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-30 00:45:28,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:45:28,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:45:28,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:45:30,571 INFO [train.py:1039] (2/4) Epoch 16, batch 1700, loss[loss=0.2077, simple_loss=0.2846, pruned_loss=0.06543, over 23476.00 frames. ], tot_loss[loss=0.1853, simple_loss=0.2587, pruned_loss=0.05594, over 4715617.73 frames. ], batch size: 94, lr: 6.50e-03, grad_scale: 8.0 2023-09-30 00:45:32,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:45:33,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:45:33,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-30 00:45:37,439 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 00:45:40,398 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.475e+02 1.927e+02 2.213e+02 2.603e+02 4.204e+02, threshold=4.426e+02, percent-clipped=1.0 2023-09-30 00:45:45,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:45:48,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:45:48,991 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=542613.3333333334, ans=0.1 2023-09-30 00:45:52,458 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=542613.3333333334, ans=0.125 2023-09-30 00:45:53,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-30 00:45:53,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:45:55,006 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:45:55,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:45:58,048 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-30 00:46:01,141 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:46:01,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:46:02,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-30 00:46:04,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-30 00:46:06,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-30 00:46:08,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-30 00:46:10,347 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:46:12,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-30 00:46:13,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:46:20,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:46:22,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:46:22,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:46:23,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-30 00:46:23,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-30 00:46:24,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:46:26,995 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:46:26,997 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-30 00:46:27,189 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.min_positive, batch_count=542746.6666666666, ans=0.025 2023-09-30 00:46:28,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:46:28,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:46:28,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:46:28,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:46:28,646 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=542746.6666666666, ans=0.1 2023-09-30 00:46:32,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:46:32,903 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:46:33,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:46:33,228 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=542746.6666666666, ans=0.125 2023-09-30 00:46:35,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:46:35,135 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:46:40,554 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:46:41,993 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-30 00:46:44,169 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=542813.3333333334, ans=0.2 2023-09-30 00:46:45,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:46:45,594 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:46:45,902 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=542813.3333333334, ans=0.125 2023-09-30 00:46:47,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-30 00:46:53,069 INFO [train.py:1039] (2/4) Epoch 16, batch 1750, loss[loss=0.1924, simple_loss=0.2678, pruned_loss=0.05855, over 23744.00 frames. ], tot_loss[loss=0.1839, simple_loss=0.2577, pruned_loss=0.05508, over 4717834.01 frames. ], batch size: 85, lr: 6.50e-03, grad_scale: 8.0 2023-09-30 00:46:53,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:46:54,268 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=542880.0, ans=0.0 2023-09-30 00:46:58,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:46:58,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-30 00:46:58,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-30 00:46:58,328 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:47:00,812 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.74 vs. limit=15.0 2023-09-30 00:47:01,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:47:01,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:47:01,773 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=542880.0, ans=0.125 2023-09-30 00:47:04,946 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=542880.0, ans=0.2 2023-09-30 00:47:06,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-30 00:47:08,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:47:10,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-30 00:47:10,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:47:11,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:47:14,151 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=542946.6666666666, ans=0.0 2023-09-30 00:47:15,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 00:47:16,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-30 00:47:18,942 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:47:18,993 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-30 00:47:21,283 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.89 vs. limit=15.0 2023-09-30 00:47:28,943 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:47:32,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:47:32,093 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:47:35,101 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:47:35,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:47:38,061 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:47:38,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:47:43,219 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:47:43,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:47:44,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-30 00:47:47,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:47:50,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-30 00:47:50,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:47:51,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:47:53,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:47:57,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 00:47:57,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-30 00:47:57,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:48:00,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:48:03,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:48:06,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:48:08,356 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:48:08,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-30 00:48:08,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:48:10,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-30 00:48:10,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:48:10,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-30 00:48:10,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:48:11,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:48:14,536 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:48:16,441 INFO [train.py:1039] (2/4) Epoch 16, batch 1800, loss[loss=0.1813, simple_loss=0.2511, pruned_loss=0.05577, over 23886.00 frames. ], tot_loss[loss=0.1835, simple_loss=0.2567, pruned_loss=0.05512, over 4695615.62 frames. ], batch size: 195, lr: 6.49e-03, grad_scale: 8.0 2023-09-30 00:48:17,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:48:19,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 00:48:21,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:48:23,486 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=543213.3333333334, ans=0.125 2023-09-30 00:48:26,084 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.478e+02 1.882e+02 2.134e+02 2.523e+02 4.257e+02, threshold=4.267e+02, percent-clipped=0.0 2023-09-30 00:48:26,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 00:48:26,382 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:48:31,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:48:34,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:48:34,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:48:36,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:48:39,541 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:48:39,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-30 00:48:41,031 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:48:44,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:48:48,762 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-30 00:48:50,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-30 00:48:50,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-30 00:48:50,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:48:52,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:48:52,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:48:52,119 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:49:00,858 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-30 00:49:03,771 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:49:05,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:49:07,532 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=543413.3333333334, ans=0.025 2023-09-30 00:49:08,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-30 00:49:08,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-30 00:49:08,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-30 00:49:08,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:49:09,153 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=543413.3333333334, ans=0.1 2023-09-30 00:49:11,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 00:49:11,479 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=543413.3333333334, ans=0.0 2023-09-30 00:49:16,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-30 00:49:23,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:49:24,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-30 00:49:24,776 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:49:24,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:49:24,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:49:26,967 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-30 00:49:29,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-30 00:49:29,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:49:34,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-30 00:49:34,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:49:37,839 INFO [train.py:1039] (2/4) Epoch 16, batch 1850, loss[loss=0.1718, simple_loss=0.2518, pruned_loss=0.04585, over 24666.00 frames. ], tot_loss[loss=0.1841, simple_loss=0.2578, pruned_loss=0.05517, over 4712714.20 frames. ], batch size: 65, lr: 6.49e-03, grad_scale: 8.0 2023-09-30 00:49:37,896 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:49:37,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:49:37,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:49:39,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:49:39,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:49:42,613 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:49:42,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:49:46,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:49:48,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:49:55,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:49:55,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-30 00:49:59,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-30 00:50:02,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-30 00:50:06,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:50:06,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-30 00:50:06,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 00:50:17,474 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=543680.0, ans=0.125 2023-09-30 00:50:18,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:50:20,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-30 00:50:23,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:50:24,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:50:28,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-30 00:50:29,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:50:29,062 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 00:50:29,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:50:30,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:50:33,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:50:37,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-30 00:50:37,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:50:38,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 00:50:38,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:50:40,650 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:50:42,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:50:45,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-30 00:50:45,533 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:50:50,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:50:50,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 00:50:50,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-30 00:50:50,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-30 00:50:51,561 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=543813.3333333334, ans=0.1 2023-09-30 00:50:52,864 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-30 00:50:54,907 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-30 00:50:56,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 00:50:56,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:50:56,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:50:57,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:50:59,447 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-30 00:50:59,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 00:50:59,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:50:59,834 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=543880.0, ans=0.1 2023-09-30 00:51:00,950 INFO [train.py:1039] (2/4) Epoch 16, batch 1900, loss[loss=0.1886, simple_loss=0.2674, pruned_loss=0.0549, over 24027.00 frames. ], tot_loss[loss=0.1847, simple_loss=0.2587, pruned_loss=0.05537, over 4702417.25 frames. ], batch size: 80, lr: 6.49e-03, grad_scale: 8.0 2023-09-30 00:51:01,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-30 00:51:02,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 00:51:02,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:51:02,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-30 00:51:05,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:51:05,824 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-30 00:51:05,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:51:07,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:51:10,300 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.936e+02 2.154e+02 2.566e+02 3.893e+02, threshold=4.308e+02, percent-clipped=0.0 2023-09-30 00:51:12,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:51:15,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:51:17,015 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-30 00:51:17,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-30 00:51:19,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:51:20,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:51:20,104 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-30 00:51:22,118 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-30 00:51:25,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-30 00:51:27,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:51:27,814 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=543946.6666666666, ans=0.125 2023-09-30 00:51:31,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-30 00:51:34,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-30 00:51:45,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-30 00:51:46,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-30 00:51:46,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:51:48,334 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-30 00:51:48,351 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-30 00:51:48,394 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-30 00:51:49,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-30 00:51:49,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:51:54,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-30 00:51:58,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:52:00,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:52:00,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-30 00:52:02,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 00:52:05,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-30 00:52:05,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:52:12,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:52:12,029 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:52:12,049 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:52:12,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:52:13,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 00:52:13,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-30 00:52:15,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:52:18,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:52:18,191 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-30 00:52:21,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:52:21,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:52:21,304 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:52:23,317 INFO [train.py:1039] (2/4) Epoch 16, batch 1950, loss[loss=0.1968, simple_loss=0.2596, pruned_loss=0.06705, over 23802.00 frames. ], tot_loss[loss=0.1861, simple_loss=0.2599, pruned_loss=0.05621, over 4709867.72 frames. ], batch size: 164, lr: 6.49e-03, grad_scale: 8.0 2023-09-30 00:52:23,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:52:26,595 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:52:30,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:52:30,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:52:30,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 00:52:31,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-30 00:52:33,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 00:52:33,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:52:35,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:52:37,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:52:37,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:52:37,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:52:40,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:52:40,923 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=544280.0, ans=0.125 2023-09-30 00:52:45,301 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:52:45,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 00:52:45,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:52:45,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:52:45,549 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=544280.0, ans=0.1 2023-09-30 00:52:49,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:52:50,348 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=544280.0, ans=0.125 2023-09-30 00:52:53,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-30 00:52:53,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:52:54,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-30 00:52:54,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-30 00:52:54,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 00:52:55,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:52:55,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:52:59,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:53:02,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:53:09,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 00:53:14,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:53:15,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:53:15,080 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-30 00:53:16,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:53:19,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:53:21,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:53:22,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:53:23,219 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=544413.3333333334, ans=0.0 2023-09-30 00:53:29,188 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:53:30,651 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:53:31,136 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=544480.0, ans=0.0 2023-09-30 00:53:32,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:53:34,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:53:37,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:53:37,494 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:53:39,583 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-30 00:53:39,602 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 00:53:41,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:53:41,921 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-30 00:53:42,159 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=544480.0, ans=0.125 2023-09-30 00:53:44,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:53:46,222 INFO [train.py:1039] (2/4) Epoch 16, batch 2000, loss[loss=0.175, simple_loss=0.2408, pruned_loss=0.05466, over 23644.00 frames. ], tot_loss[loss=0.1867, simple_loss=0.2601, pruned_loss=0.05662, over 4713222.35 frames. ], batch size: 232, lr: 6.49e-03, grad_scale: 16.0 2023-09-30 00:53:47,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:53:49,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:53:49,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:53:51,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:53:54,531 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:53:55,962 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.874e+02 2.052e+02 2.476e+02 4.888e+02, threshold=4.104e+02, percent-clipped=2.0 2023-09-30 00:53:57,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-30 00:53:57,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:54:00,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:54:01,929 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.05 vs. limit=5.0 2023-09-30 00:54:03,814 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-30 00:54:03,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 00:54:05,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:54:05,682 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=544613.3333333334, ans=0.125 2023-09-30 00:54:08,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:54:08,866 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=544613.3333333334, ans=0.025 2023-09-30 00:54:10,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-30 00:54:12,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:54:13,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:54:13,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:54:16,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-30 00:54:16,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 00:54:17,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-30 00:54:17,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:54:20,723 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:54:21,114 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=544680.0, ans=0.2 2023-09-30 00:54:22,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-30 00:54:22,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:54:22,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:54:24,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:54:24,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-30 00:54:26,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-30 00:54:26,380 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:54:26,394 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:54:29,589 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=544680.0, ans=0.1 2023-09-30 00:54:33,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:54:35,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:54:35,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:54:36,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:54:39,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:54:41,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:54:41,147 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:54:41,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:54:42,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:54:46,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:54:46,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-30 00:54:52,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 00:54:52,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:54:54,346 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=544813.3333333334, ans=0.0 2023-09-30 00:54:57,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:54:57,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:55:02,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:55:02,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:55:02,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:55:03,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 00:55:03,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 00:55:05,880 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=544813.3333333334, ans=0.0 2023-09-30 00:55:06,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:55:07,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:55:08,461 INFO [train.py:1039] (2/4) Epoch 16, batch 2050, loss[loss=0.1801, simple_loss=0.2456, pruned_loss=0.05724, over 23509.00 frames. ], tot_loss[loss=0.1859, simple_loss=0.2599, pruned_loss=0.05595, over 4728046.66 frames. ], batch size: 134, lr: 6.48e-03, grad_scale: 16.0 2023-09-30 00:55:10,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:55:11,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:55:13,514 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=544880.0, ans=0.07 2023-09-30 00:55:15,517 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=544880.0, ans=0.1 2023-09-30 00:55:18,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:55:21,225 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:55:23,224 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:55:23,345 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:55:23,634 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=544946.6666666666, ans=0.125 2023-09-30 00:55:24,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-30 00:55:24,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:55:27,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:55:27,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-30 00:55:31,932 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 00:55:38,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:55:38,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:55:39,696 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-30 00:55:42,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:55:44,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-30 00:55:44,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:55:47,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:55:49,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:55:51,205 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-30 00:55:52,615 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:55:54,106 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:55:54,265 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:55:54,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 00:55:54,480 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=545013.3333333334, ans=0.0 2023-09-30 00:56:00,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:56:01,776 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:56:03,290 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-30 00:56:04,111 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.89 vs. limit=10.0 2023-09-30 00:56:04,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:56:08,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:56:13,163 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:56:14,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-30 00:56:18,222 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=545146.6666666666, ans=0.125 2023-09-30 00:56:18,354 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=545146.6666666666, ans=0.125 2023-09-30 00:56:18,366 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=545146.6666666666, ans=0.04949747468305833 2023-09-30 00:56:19,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:56:21,032 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:56:23,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:56:26,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-30 00:56:31,376 INFO [train.py:1039] (2/4) Epoch 16, batch 2100, loss[loss=0.204, simple_loss=0.2663, pruned_loss=0.07084, over 23838.00 frames. ], tot_loss[loss=0.1851, simple_loss=0.2584, pruned_loss=0.05593, over 4717734.45 frames. ], batch size: 179, lr: 6.48e-03, grad_scale: 16.0 2023-09-30 00:56:31,448 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-30 00:56:31,448 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:56:31,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:56:31,845 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=545213.3333333334, ans=0.2 2023-09-30 00:56:33,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 00:56:34,560 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:56:34,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-30 00:56:34,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-30 00:56:37,615 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:56:41,270 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.889e+02 2.067e+02 2.438e+02 3.667e+02, threshold=4.134e+02, percent-clipped=0.0 2023-09-30 00:56:41,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:56:41,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:56:44,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:56:46,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:56:46,169 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-30 00:56:47,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:56:47,786 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-30 00:56:47,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-30 00:56:49,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:56:50,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:56:50,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-30 00:56:50,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 00:56:55,632 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-30 00:56:55,633 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 00:56:58,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:56:58,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:57:04,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:57:04,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-30 00:57:05,040 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=545346.6666666666, ans=0.125 2023-09-30 00:57:06,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:57:06,700 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 00:57:08,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-30 00:57:09,763 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:57:09,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-30 00:57:10,616 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.94 vs. limit=15.0 2023-09-30 00:57:11,216 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-30 00:57:11,296 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-30 00:57:14,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:57:16,310 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:57:19,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 00:57:20,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 00:57:22,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:57:23,894 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:57:23,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-30 00:57:23,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:57:23,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:57:25,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:57:25,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-30 00:57:25,761 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=545413.3333333334, ans=0.1 2023-09-30 00:57:26,900 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-30 00:57:28,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-30 00:57:30,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:57:31,902 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_abs, batch_count=545413.3333333334, ans=0.5 2023-09-30 00:57:34,592 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:57:34,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-30 00:57:41,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:57:42,263 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=545480.0, ans=0.0 2023-09-30 00:57:44,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:57:46,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:57:46,374 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:57:46,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-30 00:57:46,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 00:57:47,198 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.69 vs. limit=22.5 2023-09-30 00:57:48,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:57:48,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-30 00:57:49,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:57:49,999 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:57:51,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-30 00:57:53,087 INFO [train.py:1039] (2/4) Epoch 16, batch 2150, loss[loss=0.1908, simple_loss=0.2577, pruned_loss=0.06191, over 23511.00 frames. ], tot_loss[loss=0.1842, simple_loss=0.2576, pruned_loss=0.05535, over 4712213.79 frames. ], batch size: 256, lr: 6.48e-03, grad_scale: 8.0 2023-09-30 00:57:53,145 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-30 00:57:53,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:57:57,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:57:57,439 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:57:57,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:57:58,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:58:05,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 00:58:06,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:58:08,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:58:09,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:58:09,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:58:11,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:58:14,773 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:58:14,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:58:14,881 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:58:18,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:58:18,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-30 00:58:23,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:58:25,312 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-30 00:58:25,979 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.58 vs. limit=22.5 2023-09-30 00:58:28,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:58:28,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:58:28,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:58:29,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-30 00:58:29,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:58:29,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:58:29,771 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:58:31,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-30 00:58:32,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:58:32,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:58:34,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:58:35,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 00:58:37,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:58:38,285 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.42 vs. limit=6.0 2023-09-30 00:58:38,948 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:58:39,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-30 00:58:39,853 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.99 vs. limit=10.0 2023-09-30 00:58:40,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:58:40,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-30 00:58:40,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:58:44,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:58:44,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:58:46,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:58:47,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 00:58:49,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:58:49,756 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=545746.6666666666, ans=0.1 2023-09-30 00:58:51,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:58:51,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-30 00:58:52,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-30 00:58:52,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:58:54,095 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-30 00:58:54,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:58:54,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:58:55,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-30 00:58:55,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:58:55,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-30 00:58:55,783 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-30 00:58:55,784 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-30 00:58:57,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-30 00:58:59,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:59:00,984 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:59:01,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:59:02,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:59:04,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 00:59:04,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:59:05,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:59:09,133 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=545813.3333333334, ans=0.0 2023-09-30 00:59:13,292 INFO [train.py:1039] (2/4) Epoch 16, batch 2200, loss[loss=0.1913, simple_loss=0.2606, pruned_loss=0.06098, over 23692.00 frames. ], tot_loss[loss=0.1845, simple_loss=0.2582, pruned_loss=0.05536, over 4719784.89 frames. ], batch size: 232, lr: 6.48e-03, grad_scale: 8.0 2023-09-30 00:59:13,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:59:13,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-30 00:59:17,266 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:59:24,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:59:24,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:59:24,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:59:25,922 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.544e+02 1.877e+02 2.145e+02 2.548e+02 4.503e+02, threshold=4.290e+02, percent-clipped=1.0 2023-09-30 00:59:26,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-30 00:59:27,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:59:29,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:59:29,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-30 00:59:33,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-30 00:59:36,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 00:59:42,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-30 00:59:42,752 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=545946.6666666666, ans=0.2 2023-09-30 00:59:43,126 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=17.70 vs. limit=22.5 2023-09-30 00:59:44,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:59:45,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:59:45,653 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:59:49,431 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:59:50,840 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-30 00:59:53,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-30 00:59:54,997 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:59:56,342 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-30 00:59:56,583 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=546013.3333333334, ans=0.125 2023-09-30 00:59:59,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:00:00,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:00:01,072 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=546013.3333333334, ans=0.125 2023-09-30 01:00:03,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:00:04,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:00:07,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-30 01:00:09,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:00:09,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-30 01:00:10,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:00:12,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:00:12,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:00:13,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:00:13,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:00:13,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:00:15,289 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:00:15,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-30 01:00:16,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:00:18,591 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 01:00:22,272 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 01:00:22,438 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=546146.6666666666, ans=0.1 2023-09-30 01:00:23,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:00:25,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:00:28,092 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-30 01:00:29,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:00:31,218 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-30 01:00:31,459 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=546146.6666666666, ans=0.125 2023-09-30 01:00:32,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-30 01:00:32,895 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-30 01:00:34,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:00:34,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-30 01:00:37,301 INFO [train.py:1039] (2/4) Epoch 16, batch 2250, loss[loss=0.1697, simple_loss=0.2403, pruned_loss=0.04959, over 24295.00 frames. ], tot_loss[loss=0.1852, simple_loss=0.2591, pruned_loss=0.05567, over 4723967.59 frames. ], batch size: 56, lr: 6.48e-03, grad_scale: 8.0 2023-09-30 01:00:37,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:00:39,553 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-30 01:00:41,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:00:42,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:00:48,352 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.00 vs. limit=15.0 2023-09-30 01:00:48,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:00:51,761 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:00:53,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:00:55,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:00:55,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:00:56,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=546280.0, ans=0.1 2023-09-30 01:00:58,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-30 01:00:58,704 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:00:58,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:01:00,522 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=546280.0, ans=0.125 2023-09-30 01:01:02,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-30 01:01:03,942 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:01:03,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:01:04,157 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:01:08,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:01:10,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 01:01:10,504 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-30 01:01:12,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-30 01:01:14,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:01:15,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:01:20,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:01:21,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:01:24,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:01:24,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:01:26,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:01:27,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:01:29,556 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.98 vs. limit=15.0 2023-09-30 01:01:32,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:01:34,318 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-30 01:01:40,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 01:01:40,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:01:40,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:01:50,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 01:01:52,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-30 01:01:52,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-30 01:01:52,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:01:52,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:01:55,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-30 01:01:57,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:01:57,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:01:59,838 INFO [train.py:1039] (2/4) Epoch 16, batch 2300, loss[loss=0.2088, simple_loss=0.2766, pruned_loss=0.07048, over 22683.00 frames. ], tot_loss[loss=0.1858, simple_loss=0.2598, pruned_loss=0.05586, over 4730009.74 frames. ], batch size: 322, lr: 6.47e-03, grad_scale: 8.0 2023-09-30 01:02:06,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:02:06,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:02:10,577 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-30 01:02:11,887 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.548e+02 1.854e+02 2.032e+02 2.223e+02 2.869e+02, threshold=4.064e+02, percent-clipped=0.0 2023-09-30 01:02:12,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:02:12,306 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=546546.6666666666, ans=0.0 2023-09-30 01:02:18,761 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=546613.3333333334, ans=0.0 2023-09-30 01:02:19,875 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:02:19,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-30 01:02:19,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:02:20,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:02:20,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-30 01:02:21,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:02:23,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:02:24,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:02:29,734 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 01:02:31,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:02:34,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:02:39,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:02:41,512 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:02:43,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:02:46,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:02:50,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:02:51,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 01:02:51,698 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=546746.6666666666, ans=0.0 2023-09-30 01:02:52,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:02:52,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-30 01:02:57,608 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 01:02:57,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:02:57,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:02:57,700 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:02:57,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:02:59,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 01:02:59,345 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-30 01:02:59,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-30 01:03:01,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:03:01,409 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:03:01,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-30 01:03:06,269 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:03:09,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:03:14,687 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:03:14,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:03:16,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-30 01:03:18,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 01:03:18,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:03:19,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 01:03:21,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-30 01:03:22,685 INFO [train.py:1039] (2/4) Epoch 16, batch 2350, loss[loss=0.18, simple_loss=0.2446, pruned_loss=0.05768, over 22685.00 frames. ], tot_loss[loss=0.1867, simple_loss=0.2606, pruned_loss=0.05639, over 4708037.04 frames. ], batch size: 322, lr: 6.47e-03, grad_scale: 8.0 2023-09-30 01:03:26,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:03:26,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-30 01:03:30,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-30 01:03:33,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:03:36,308 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=546880.0, ans=0.1 2023-09-30 01:03:37,564 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:03:37,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:03:37,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:03:39,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:03:40,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-30 01:03:43,960 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=546946.6666666666, ans=0.05 2023-09-30 01:03:46,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:03:50,053 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=546946.6666666666, ans=0.125 2023-09-30 01:03:51,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-30 01:03:54,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:03:59,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:03:59,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:04:00,875 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:04:01,075 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-30 01:04:02,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:04:05,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:04:05,500 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:04:05,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:04:09,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:04:12,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-30 01:04:12,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:04:15,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:04:15,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:04:16,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-30 01:04:18,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:04:18,430 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=547080.0, ans=0.0 2023-09-30 01:04:18,516 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-30 01:04:22,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-30 01:04:22,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:04:27,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-30 01:04:31,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-30 01:04:32,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:04:32,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-30 01:04:32,697 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-30 01:04:32,736 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-30 01:04:37,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-30 01:04:38,964 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=547146.6666666666, ans=0.2 2023-09-30 01:04:40,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:04:43,668 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=547213.3333333334, ans=0.125 2023-09-30 01:04:44,706 INFO [train.py:1039] (2/4) Epoch 16, batch 2400, loss[loss=0.1565, simple_loss=0.2364, pruned_loss=0.0383, over 24372.00 frames. ], tot_loss[loss=0.1861, simple_loss=0.2599, pruned_loss=0.05618, over 4715407.34 frames. ], batch size: 61, lr: 6.47e-03, grad_scale: 16.0 2023-09-30 01:04:44,852 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:04:46,754 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=547213.3333333334, ans=0.125 2023-09-30 01:04:48,722 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:04:51,675 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:04:53,106 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-30 01:04:53,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-30 01:04:56,005 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.435e+02 1.899e+02 2.114e+02 2.474e+02 3.602e+02, threshold=4.228e+02, percent-clipped=0.0 2023-09-30 01:05:00,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 01:05:00,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:05:03,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-30 01:05:03,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:05:05,035 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:05:05,202 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=547280.0, ans=0.0 2023-09-30 01:05:07,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-30 01:05:10,677 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:05:12,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-30 01:05:18,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-30 01:05:21,816 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-30 01:05:23,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:05:25,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:05:29,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:05:31,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-30 01:05:31,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:05:32,938 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=547346.6666666666, ans=0.125 2023-09-30 01:05:41,232 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:05:41,461 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=547413.3333333334, ans=0.2 2023-09-30 01:05:42,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:05:47,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:05:50,279 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:05:50,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-30 01:05:50,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:05:50,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:05:50,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:05:50,451 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 01:05:57,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:05:58,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 01:05:58,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-30 01:06:00,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-30 01:06:01,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:06:01,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:06:01,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-30 01:06:03,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-30 01:06:03,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-30 01:06:03,290 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-30 01:06:04,783 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-30 01:06:06,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:06:08,315 INFO [train.py:1039] (2/4) Epoch 16, batch 2450, loss[loss=0.1933, simple_loss=0.2682, pruned_loss=0.05917, over 23299.00 frames. ], tot_loss[loss=0.1849, simple_loss=0.2587, pruned_loss=0.05555, over 4700056.93 frames. ], batch size: 93, lr: 6.47e-03, grad_scale: 16.0 2023-09-30 01:06:09,158 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:06:09,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:06:10,582 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-30 01:06:10,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:06:12,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-30 01:06:15,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:06:15,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:06:19,153 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:06:19,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:06:20,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-30 01:06:20,853 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=547546.6666666666, ans=0.1 2023-09-30 01:06:25,745 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=547613.3333333334, ans=0.125 2023-09-30 01:06:26,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:06:26,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:06:30,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:06:30,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:06:30,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:06:31,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-30 01:06:33,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:06:36,886 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 01:06:37,093 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=547613.3333333334, ans=0.2 2023-09-30 01:06:38,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:06:44,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-30 01:06:44,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:06:45,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:06:45,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:06:48,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-30 01:06:48,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:06:55,904 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=547680.0, ans=0.125 2023-09-30 01:06:57,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:06:58,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:06:58,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:07:00,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:07:00,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:07:00,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:07:01,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-30 01:07:02,149 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.min_positive, batch_count=547746.6666666666, ans=0.025 2023-09-30 01:07:05,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:07:06,814 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:07:10,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:07:10,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:07:16,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:07:16,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-30 01:07:18,484 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:07:20,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:07:20,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-30 01:07:22,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:07:22,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:07:26,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:07:27,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:07:29,165 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:07:30,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-30 01:07:31,128 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 01:07:32,333 INFO [train.py:1039] (2/4) Epoch 16, batch 2500, loss[loss=0.1878, simple_loss=0.2302, pruned_loss=0.07265, over 18875.00 frames. ], tot_loss[loss=0.1836, simple_loss=0.2578, pruned_loss=0.05475, over 4713529.39 frames. ], batch size: 388, lr: 6.47e-03, grad_scale: 16.0 2023-09-30 01:07:32,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:07:37,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:07:41,211 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=547880.0, ans=0.125 2023-09-30 01:07:44,694 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.913e+02 2.172e+02 2.502e+02 3.550e+02, threshold=4.344e+02, percent-clipped=0.0 2023-09-30 01:07:47,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 01:07:47,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:07:49,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:07:49,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-30 01:07:57,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 01:07:59,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:08:01,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-30 01:08:01,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 01:08:01,346 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-30 01:08:03,348 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=547946.6666666666, ans=0.125 2023-09-30 01:08:04,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:08:04,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:08:05,985 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-30 01:08:06,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:08:07,080 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.38 vs. limit=15.0 2023-09-30 01:08:07,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-30 01:08:07,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:08:09,589 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=548013.3333333334, ans=0.0 2023-09-30 01:08:12,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:08:14,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:08:17,243 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 01:08:17,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-30 01:08:18,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:08:20,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:08:24,330 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:08:28,335 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=548080.0, ans=0.1 2023-09-30 01:08:29,352 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:08:31,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:08:37,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-30 01:08:37,705 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=548146.6666666666, ans=0.2 2023-09-30 01:08:37,801 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=548146.6666666666, ans=0.125 2023-09-30 01:08:40,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-30 01:08:40,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:08:40,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-30 01:08:42,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:08:42,226 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 01:08:43,695 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-30 01:08:43,695 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-30 01:08:43,704 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-30 01:08:48,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:08:50,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-30 01:08:50,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-30 01:08:51,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:08:51,898 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-30 01:08:52,332 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=548146.6666666666, ans=0.125 2023-09-30 01:08:54,998 INFO [train.py:1039] (2/4) Epoch 16, batch 2550, loss[loss=0.1776, simple_loss=0.2544, pruned_loss=0.05044, over 23704.00 frames. ], tot_loss[loss=0.1839, simple_loss=0.258, pruned_loss=0.05491, over 4712062.55 frames. ], batch size: 149, lr: 6.46e-03, grad_scale: 16.0 2023-09-30 01:08:55,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-30 01:08:58,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:09:00,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:09:02,081 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:09:03,820 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:09:05,309 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-30 01:09:06,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-30 01:09:10,003 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-30 01:09:10,192 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:09:12,510 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:09:15,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:09:15,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 01:09:17,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 01:09:17,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:09:18,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:09:20,216 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:09:20,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-30 01:09:21,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-30 01:09:21,674 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:09:21,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-30 01:09:34,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:09:41,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:09:41,824 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:09:41,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:09:43,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 01:09:51,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:09:54,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 01:09:54,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:09:54,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:09:54,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-30 01:09:54,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-30 01:09:58,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:09:58,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:10:03,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:10:03,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-30 01:10:03,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:10:05,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:10:06,788 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-30 01:10:07,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 01:10:10,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:10:16,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:10:18,252 INFO [train.py:1039] (2/4) Epoch 16, batch 2600, loss[loss=0.1919, simple_loss=0.2626, pruned_loss=0.06057, over 23508.00 frames. ], tot_loss[loss=0.1848, simple_loss=0.2587, pruned_loss=0.05543, over 4706775.83 frames. ], batch size: 106, lr: 6.46e-03, grad_scale: 8.0 2023-09-30 01:10:19,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:10:22,048 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten.whitening_limit, batch_count=548546.6666666666, ans=15.0 2023-09-30 01:10:24,924 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-30 01:10:26,541 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-30 01:10:26,578 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:10:26,625 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-30 01:10:28,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-30 01:10:28,173 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-30 01:10:30,038 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=548546.6666666666, ans=0.2 2023-09-30 01:10:31,178 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.861e+02 2.102e+02 2.275e+02 3.590e+02, threshold=4.204e+02, percent-clipped=0.0 2023-09-30 01:10:31,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:10:31,388 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-30 01:10:32,890 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-30 01:10:34,369 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-30 01:10:37,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:10:38,147 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=548613.3333333334, ans=0.125 2023-09-30 01:10:40,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-30 01:10:41,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-30 01:10:44,652 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-30 01:10:44,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-30 01:10:44,928 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=548613.3333333334, ans=0.5 2023-09-30 01:10:48,105 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-30 01:10:48,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-30 01:10:51,824 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=548680.0, ans=0.125 2023-09-30 01:10:54,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:10:54,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:10:54,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:10:54,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-30 01:10:58,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:11:04,513 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-30 01:11:09,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:11:09,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:11:10,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-30 01:11:10,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:11:10,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:11:12,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-30 01:11:16,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-30 01:11:16,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:11:17,035 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=548746.6666666666, ans=0.125 2023-09-30 01:11:19,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:11:22,816 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-30 01:11:22,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:11:22,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:11:27,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:11:27,844 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=548813.3333333334, ans=0.0 2023-09-30 01:11:29,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:11:29,658 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-30 01:11:31,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:11:32,856 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:11:34,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:11:37,550 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=548813.3333333334, ans=0.125 2023-09-30 01:11:40,242 INFO [train.py:1039] (2/4) Epoch 16, batch 2650, loss[loss=0.2044, simple_loss=0.2725, pruned_loss=0.06816, over 23455.00 frames. ], tot_loss[loss=0.1856, simple_loss=0.2594, pruned_loss=0.05584, over 4710270.01 frames. ], batch size: 285, lr: 6.46e-03, grad_scale: 4.0 2023-09-30 01:11:40,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-30 01:11:40,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:11:43,775 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 01:11:48,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-30 01:11:48,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:11:49,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 01:11:49,105 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-30 01:11:51,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:11:54,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:11:55,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 01:11:57,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:12:00,521 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:12:02,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-30 01:12:02,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 01:12:02,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:12:06,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-30 01:12:07,417 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-30 01:12:10,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:12:12,154 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-30 01:12:13,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:12:13,726 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-30 01:12:17,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:12:17,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-30 01:12:17,190 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:12:18,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:12:23,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-30 01:12:25,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-30 01:12:28,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-30 01:12:28,449 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=549013.3333333334, ans=0.0 2023-09-30 01:12:32,794 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-30 01:12:32,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:12:34,160 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:12:34,216 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-30 01:12:35,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:12:35,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:12:37,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:12:39,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:12:40,658 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:12:40,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:12:42,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:12:44,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:12:44,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:12:45,341 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.81 vs. limit=12.0 2023-09-30 01:12:46,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:12:47,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:12:49,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-30 01:12:49,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=549146.6666666666, ans=0.1 2023-09-30 01:12:52,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:12:52,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:12:52,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:12:53,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-30 01:12:57,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:12:59,884 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:13:00,228 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=1.033e-02 2023-09-30 01:13:03,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:13:03,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:13:04,923 INFO [train.py:1039] (2/4) Epoch 16, batch 2700, loss[loss=0.1961, simple_loss=0.2563, pruned_loss=0.06794, over 23717.00 frames. ], tot_loss[loss=0.1868, simple_loss=0.2606, pruned_loss=0.05648, over 4713287.85 frames. ], batch size: 232, lr: 6.46e-03, grad_scale: 8.0 2023-09-30 01:13:06,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-30 01:13:06,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:13:06,956 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=549213.3333333334, ans=0.0 2023-09-30 01:13:08,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:13:08,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-30 01:13:11,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:13:11,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 01:13:11,939 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=549213.3333333334, ans=0.125 2023-09-30 01:13:16,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:13:16,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:13:16,162 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:13:18,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:13:18,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:13:18,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:13:18,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-30 01:13:18,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-30 01:13:19,481 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.941e+02 2.174e+02 2.573e+02 4.504e+02, threshold=4.348e+02, percent-clipped=1.0 2023-09-30 01:13:19,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:13:22,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-30 01:13:22,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 01:13:22,686 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:13:25,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:13:27,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-30 01:13:28,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:13:33,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:13:33,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:13:34,367 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=549280.0, ans=0.0 2023-09-30 01:13:34,798 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.39 vs. limit=6.0 2023-09-30 01:13:40,312 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:13:40,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:13:40,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:13:40,508 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=549346.6666666666, ans=0.125 2023-09-30 01:13:41,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:13:43,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:13:48,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:13:48,054 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:13:48,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:13:51,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:13:51,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:13:59,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:14:01,151 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:14:02,769 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 01:14:02,781 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:14:08,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:14:09,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:14:11,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:14:12,748 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:14:14,270 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:14:14,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:14:16,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:14:18,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:14:18,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:14:19,628 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.92 vs. limit=22.5 2023-09-30 01:14:22,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-30 01:14:22,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:14:25,720 INFO [train.py:1039] (2/4) Epoch 16, batch 2750, loss[loss=0.1731, simple_loss=0.2313, pruned_loss=0.05747, over 23569.00 frames. ], tot_loss[loss=0.1868, simple_loss=0.2607, pruned_loss=0.05645, over 4713360.04 frames. ], batch size: 256, lr: 6.46e-03, grad_scale: 8.0 2023-09-30 01:14:25,868 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:14:27,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-30 01:14:28,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-30 01:14:28,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:14:32,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:14:33,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:14:35,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:14:35,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-30 01:14:36,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:14:40,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:14:40,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 01:14:40,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:14:40,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:14:40,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-30 01:14:42,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:14:42,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:14:44,584 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.max_abs, batch_count=549613.3333333334, ans=10.0 2023-09-30 01:14:47,866 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.50 vs. limit=22.5 2023-09-30 01:14:48,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-30 01:14:50,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:14:50,575 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:14:50,674 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:14:52,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-30 01:14:52,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:14:53,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:14:53,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:14:53,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:14:54,079 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=549613.3333333334, ans=0.125 2023-09-30 01:14:58,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 01:15:00,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 01:15:00,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:15:01,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:15:01,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 01:15:08,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:15:10,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 01:15:10,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:15:17,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:15:17,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:15:18,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 01:15:23,572 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:15:25,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:15:25,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-30 01:15:30,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:15:32,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-30 01:15:37,936 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-30 01:15:38,703 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.04 vs. limit=15.0 2023-09-30 01:15:39,580 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:15:40,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-30 01:15:41,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:15:42,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:15:42,726 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-30 01:15:44,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:15:45,542 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.81 vs. limit=15.0 2023-09-30 01:15:45,603 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.69 vs. limit=15.0 2023-09-30 01:15:47,510 INFO [train.py:1039] (2/4) Epoch 16, batch 2800, loss[loss=0.1841, simple_loss=0.2583, pruned_loss=0.05496, over 23496.00 frames. ], tot_loss[loss=0.1858, simple_loss=0.2592, pruned_loss=0.05621, over 4705403.92 frames. ], batch size: 134, lr: 6.46e-03, grad_scale: 16.0 2023-09-30 01:15:47,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-30 01:15:47,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:15:47,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:15:49,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-30 01:15:49,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:15:51,310 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:15:52,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:15:54,396 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-30 01:15:54,397 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-30 01:15:57,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:16:00,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 01:16:00,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:16:01,933 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.507e+02 1.870e+02 2.056e+02 2.355e+02 4.086e+02, threshold=4.112e+02, percent-clipped=0.0 2023-09-30 01:16:02,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:16:04,380 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-30 01:16:07,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-30 01:16:07,831 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=549946.6666666666, ans=0.1 2023-09-30 01:16:08,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-30 01:16:10,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:16:10,573 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:16:10,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:16:13,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:16:15,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:16:15,323 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-30 01:16:16,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:16:24,477 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=550013.3333333334, ans=0.125 2023-09-30 01:16:26,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:16:28,044 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:16:29,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:16:31,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:16:31,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:16:36,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:16:36,576 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-30 01:16:36,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:16:38,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:16:38,150 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:16:44,299 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=550080.0, ans=10.0 2023-09-30 01:16:45,438 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:16:45,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:16:48,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:16:50,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:16:50,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:16:50,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 01:16:52,586 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 01:16:52,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 01:16:54,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:16:54,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-30 01:16:54,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:16:54,506 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=550146.6666666666, ans=0.04949747468305833 2023-09-30 01:16:56,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:16:56,233 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:16:58,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-30 01:16:59,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:17:01,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:17:01,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:17:04,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-30 01:17:10,298 INFO [train.py:1039] (2/4) Epoch 16, batch 2850, loss[loss=0.1784, simple_loss=0.2632, pruned_loss=0.04681, over 23738.00 frames. ], tot_loss[loss=0.1837, simple_loss=0.2573, pruned_loss=0.05504, over 4702959.86 frames. ], batch size: 85, lr: 6.45e-03, grad_scale: 16.0 2023-09-30 01:17:10,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:17:10,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 01:17:10,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:17:12,115 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:17:15,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:17:15,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:17:17,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:17:19,303 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:17:20,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:17:22,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:17:23,901 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-30 01:17:29,225 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-30 01:17:29,351 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=550280.0, ans=0.125 2023-09-30 01:17:31,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:17:31,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-30 01:17:33,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:17:36,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-30 01:17:37,976 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-30 01:17:39,464 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:17:41,543 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=550280.0, ans=0.125 2023-09-30 01:17:50,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:17:52,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:17:52,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=550346.6666666666, ans=0.125 2023-09-30 01:17:53,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:17:55,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 01:17:55,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 01:17:55,188 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:17:56,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:17:58,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-30 01:17:58,631 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=550413.3333333334, ans=0.0 2023-09-30 01:17:59,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:17:59,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:18:01,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:18:01,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:18:05,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:18:05,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:18:07,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:18:09,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:18:10,967 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:18:12,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:18:14,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:18:15,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:18:17,952 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.76 vs. limit=15.0 2023-09-30 01:18:18,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:18:21,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-30 01:18:22,659 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-30 01:18:24,273 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 01:18:24,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:18:25,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-30 01:18:25,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:18:25,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:18:25,870 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:18:27,265 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:18:27,266 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-30 01:18:27,327 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-30 01:18:27,332 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:18:27,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:18:33,166 INFO [train.py:1039] (2/4) Epoch 16, batch 2900, loss[loss=0.2005, simple_loss=0.2635, pruned_loss=0.06875, over 23446.00 frames. ], tot_loss[loss=0.1835, simple_loss=0.2572, pruned_loss=0.05495, over 4706125.77 frames. ], batch size: 285, lr: 6.45e-03, grad_scale: 16.0 2023-09-30 01:18:33,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-30 01:18:33,568 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=550546.6666666666, ans=0.025 2023-09-30 01:18:35,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:18:35,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:18:37,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-30 01:18:41,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:18:41,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-30 01:18:43,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-30 01:18:45,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:18:46,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:18:47,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:18:48,255 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.629e+02 1.959e+02 2.363e+02 2.831e+02 4.091e+02, threshold=4.726e+02, percent-clipped=0.0 2023-09-30 01:18:48,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:18:51,562 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:18:51,718 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=550613.3333333334, ans=0.0 2023-09-30 01:18:52,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:18:53,204 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=550613.3333333334, ans=0.0 2023-09-30 01:18:56,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-30 01:18:56,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-30 01:18:56,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-30 01:18:57,028 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=550613.3333333334, ans=0.125 2023-09-30 01:18:58,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:19:01,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-30 01:19:03,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-30 01:19:04,058 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.46 vs. limit=15.0 2023-09-30 01:19:04,682 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:19:04,686 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-30 01:19:04,722 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:19:07,842 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:19:07,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-30 01:19:10,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:19:11,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:19:14,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:19:19,888 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:19:21,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-30 01:19:21,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-30 01:19:21,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:19:27,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:19:28,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-30 01:19:29,700 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:19:29,903 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=550746.6666666666, ans=0.125 2023-09-30 01:19:35,864 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:19:36,085 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=550746.6666666666, ans=0.0 2023-09-30 01:19:45,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:19:45,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:19:46,059 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer_ff3.min_abs, batch_count=550813.3333333334, ans=0.2 2023-09-30 01:19:47,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-30 01:19:50,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:19:52,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-30 01:19:52,236 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:19:52,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-30 01:19:55,412 INFO [train.py:1039] (2/4) Epoch 16, batch 2950, loss[loss=0.1648, simple_loss=0.2377, pruned_loss=0.046, over 21448.00 frames. ], tot_loss[loss=0.1836, simple_loss=0.2578, pruned_loss=0.05474, over 4712927.84 frames. ], batch size: 46, lr: 6.45e-03, grad_scale: 16.0 2023-09-30 01:19:59,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:20:00,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-30 01:20:02,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:20:02,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:20:04,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:20:06,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:20:07,835 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-30 01:20:07,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-30 01:20:08,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 01:20:08,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:20:15,549 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:20:19,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:20:20,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:20:20,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:20:24,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:20:24,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:20:27,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:20:27,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:20:27,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:20:29,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-30 01:20:29,614 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=551013.3333333334, ans=0.1 2023-09-30 01:20:34,653 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-30 01:20:34,683 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-30 01:20:36,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 01:20:37,772 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-30 01:20:39,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-30 01:20:39,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:20:41,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:20:41,200 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-30 01:20:41,207 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-30 01:20:44,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-30 01:20:45,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:20:45,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:20:47,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:20:49,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:20:49,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:20:49,179 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-30 01:20:51,356 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:20:51,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-30 01:20:58,219 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:20:59,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:20:59,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-30 01:20:59,856 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:21:01,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-30 01:21:06,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:21:08,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:21:09,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:21:11,435 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 01:21:12,551 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:21:12,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 01:21:14,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:21:14,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:21:16,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-30 01:21:16,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:21:17,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:21:19,000 INFO [train.py:1039] (2/4) Epoch 16, batch 3000, loss[loss=0.1691, simple_loss=0.2465, pruned_loss=0.04584, over 24608.00 frames. ], tot_loss[loss=0.1842, simple_loss=0.2583, pruned_loss=0.05506, over 4714807.67 frames. ], batch size: 60, lr: 6.45e-03, grad_scale: 16.0 2023-09-30 01:21:19,000 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-30 01:21:34,547 INFO [train.py:1071] (2/4) Epoch 16, validation: loss=0.3091, simple_loss=0.2818, pruned_loss=0.1682, over 1125622.00 frames. 2023-09-30 01:21:34,548 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21076MB 2023-09-30 01:21:34,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:21:36,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:21:36,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-30 01:21:37,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:21:39,509 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:21:39,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-30 01:21:43,146 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-30 01:21:44,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-30 01:21:44,884 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=551213.3333333334, ans=0.125 2023-09-30 01:21:47,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:21:47,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:21:47,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-30 01:21:47,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:21:49,607 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.865e+02 2.031e+02 2.277e+02 3.298e+02, threshold=4.063e+02, percent-clipped=0.0 2023-09-30 01:21:55,788 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 01:21:59,270 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=551280.0, ans=0.2 2023-09-30 01:22:05,384 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:22:13,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-30 01:22:14,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:22:16,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:22:16,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:22:18,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:22:20,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:22:20,436 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-30 01:22:23,589 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-30 01:22:25,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:22:25,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 01:22:28,767 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 01:22:30,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:22:31,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:22:31,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:22:33,890 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=551413.3333333334, ans=0.07 2023-09-30 01:22:35,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 01:22:35,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:22:35,097 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:22:38,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:22:40,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-30 01:22:42,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:22:43,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:22:43,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:22:45,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:22:47,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:22:48,925 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-30 01:22:48,970 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-30 01:22:49,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:22:49,087 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-30 01:22:50,608 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:22:52,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-30 01:22:53,789 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-30 01:22:55,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 01:22:55,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-30 01:22:57,406 INFO [train.py:1039] (2/4) Epoch 16, batch 3050, loss[loss=0.1591, simple_loss=0.2462, pruned_loss=0.03595, over 24635.00 frames. ], tot_loss[loss=0.1851, simple_loss=0.2593, pruned_loss=0.05547, over 4722557.04 frames. ], batch size: 68, lr: 6.45e-03, grad_scale: 8.0 2023-09-30 01:22:57,589 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-30 01:22:57,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 01:22:59,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:23:01,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:23:01,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-30 01:23:01,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:23:01,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:23:04,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-30 01:23:05,861 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:23:08,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:23:10,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:23:15,334 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:23:15,601 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=551613.3333333334, ans=0.1 2023-09-30 01:23:19,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-30 01:23:23,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-30 01:23:23,824 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-30 01:23:23,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:23:29,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:23:31,393 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:23:33,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:23:33,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:23:37,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:23:38,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-30 01:23:38,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:23:38,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:23:38,710 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:23:40,156 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:23:41,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:23:45,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:23:45,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-30 01:23:45,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:23:46,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:23:50,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:23:50,101 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 01:23:51,541 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:23:51,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:23:56,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:23:58,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:24:03,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:24:03,511 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=551813.3333333334, ans=0.04949747468305833 2023-09-30 01:24:05,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:24:05,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:24:06,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:24:08,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 01:24:08,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:24:08,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-30 01:24:10,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:24:10,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:24:12,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-30 01:24:12,901 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.86 vs. limit=15.0 2023-09-30 01:24:15,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:24:20,248 INFO [train.py:1039] (2/4) Epoch 16, batch 3100, loss[loss=0.1701, simple_loss=0.2579, pruned_loss=0.04119, over 24606.00 frames. ], tot_loss[loss=0.1848, simple_loss=0.259, pruned_loss=0.05532, over 4715138.05 frames. ], batch size: 71, lr: 6.44e-03, grad_scale: 8.0 2023-09-30 01:24:20,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:24:22,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:24:25,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 01:24:26,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-30 01:24:30,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-30 01:24:31,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-30 01:24:33,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:24:35,111 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:24:36,442 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.437e+02 1.867e+02 2.041e+02 2.309e+02 3.619e+02, threshold=4.081e+02, percent-clipped=0.0 2023-09-30 01:24:36,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:24:38,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-30 01:24:43,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:24:50,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-30 01:24:54,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 01:24:55,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:24:56,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:24:56,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:24:56,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-30 01:24:59,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:24:59,984 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-30 01:24:59,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:25:00,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:25:02,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-30 01:25:03,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:25:06,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:25:07,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-30 01:25:08,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-30 01:25:10,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:25:11,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:25:13,117 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:25:13,135 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:25:13,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:25:15,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-30 01:25:15,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:25:18,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:25:18,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:25:18,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:25:18,657 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 01:25:23,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:25:23,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-30 01:25:25,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:25:27,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-30 01:25:27,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:25:27,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:25:27,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-30 01:25:36,011 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=552146.6666666666, ans=0.0 2023-09-30 01:25:38,887 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=552146.6666666666, ans=0.2 2023-09-30 01:25:40,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-30 01:25:41,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:25:43,536 INFO [train.py:1039] (2/4) Epoch 16, batch 3150, loss[loss=0.1682, simple_loss=0.2493, pruned_loss=0.04357, over 24317.00 frames. ], tot_loss[loss=0.1836, simple_loss=0.2577, pruned_loss=0.05478, over 4708355.35 frames. ], batch size: 61, lr: 6.44e-03, grad_scale: 8.0 2023-09-30 01:25:43,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:25:44,002 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=552213.3333333334, ans=0.0 2023-09-30 01:25:45,341 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:25:45,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:25:46,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-30 01:25:46,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:25:47,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-30 01:25:49,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-30 01:25:50,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:25:52,342 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-30 01:25:56,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-30 01:25:56,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:25:56,954 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-30 01:25:59,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-30 01:25:59,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-30 01:26:00,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-30 01:26:00,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-30 01:26:00,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:26:00,893 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:26:02,445 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:26:06,119 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-30 01:26:06,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:26:06,622 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=552280.0, ans=0.1 2023-09-30 01:26:07,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:26:07,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:26:09,278 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-30 01:26:12,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=552280.0, ans=0.125 2023-09-30 01:26:13,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-30 01:26:15,854 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:26:18,965 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-30 01:26:19,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:26:20,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-30 01:26:23,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-30 01:26:25,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:26:27,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 01:26:27,042 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 01:26:27,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:26:27,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:26:28,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-30 01:26:28,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-30 01:26:28,987 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=552346.6666666666, ans=0.125 2023-09-30 01:26:30,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-30 01:26:30,512 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=552346.6666666666, ans=0.125 2023-09-30 01:26:31,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 01:26:31,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:26:33,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:26:33,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:26:35,431 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-30 01:26:35,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:26:38,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-30 01:26:38,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:26:39,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-30 01:26:42,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-30 01:26:43,789 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:26:43,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:26:45,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-30 01:26:45,479 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 01:26:46,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:26:50,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:26:51,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:26:51,758 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:26:58,395 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 01:26:59,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:27:00,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-30 01:27:07,194 INFO [train.py:1039] (2/4) Epoch 16, batch 3200, loss[loss=0.1946, simple_loss=0.2622, pruned_loss=0.06354, over 23606.00 frames. ], tot_loss[loss=0.183, simple_loss=0.2563, pruned_loss=0.0549, over 4694120.01 frames. ], batch size: 106, lr: 6.44e-03, grad_scale: 16.0 2023-09-30 01:27:07,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:27:07,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-30 01:27:10,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:27:11,912 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:27:11,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-30 01:27:15,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:27:19,053 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-30 01:27:22,586 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=552613.3333333334, ans=0.1 2023-09-30 01:27:23,450 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.941e+02 2.200e+02 2.639e+02 4.791e+02, threshold=4.401e+02, percent-clipped=2.0 2023-09-30 01:27:23,597 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:27:32,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:27:43,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-30 01:27:44,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:27:47,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-30 01:27:48,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 01:27:51,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:27:51,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 01:27:53,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:27:56,524 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-30 01:27:58,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-30 01:28:01,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-30 01:28:03,533 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-30 01:28:05,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:28:09,155 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=552746.6666666666, ans=0.5 2023-09-30 01:28:11,775 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:28:11,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:28:11,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:28:13,406 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-30 01:28:13,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 01:28:18,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:28:20,085 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-30 01:28:20,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-30 01:28:20,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-30 01:28:23,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-30 01:28:25,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:28:27,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-30 01:28:27,214 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-30 01:28:27,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:28:27,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:28:30,168 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-30 01:28:31,902 INFO [train.py:1039] (2/4) Epoch 16, batch 3250, loss[loss=0.1819, simple_loss=0.2501, pruned_loss=0.05683, over 23657.00 frames. ], tot_loss[loss=0.1831, simple_loss=0.2564, pruned_loss=0.05492, over 4686936.82 frames. ], batch size: 149, lr: 6.44e-03, grad_scale: 16.0 2023-09-30 01:28:32,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:28:35,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:28:45,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:28:45,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-30 01:28:47,860 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.75 vs. limit=15.0 2023-09-30 01:28:48,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:28:49,033 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:28:49,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:28:50,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:28:50,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 01:28:53,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:28:53,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:28:55,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:28:55,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:28:55,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:28:55,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:28:55,748 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=552946.6666666666, ans=0.05 2023-09-30 01:28:58,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:29:00,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:29:02,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:29:03,947 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:29:05,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:29:05,486 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:29:05,510 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:29:12,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-30 01:29:12,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:29:12,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:29:13,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:29:15,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-30 01:29:22,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:29:30,390 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:29:30,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:29:30,456 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-30 01:29:30,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:29:30,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 01:29:31,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:29:34,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-30 01:29:35,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-30 01:29:35,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:29:37,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:29:38,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:29:40,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-30 01:29:40,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:29:41,971 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=553146.6666666666, ans=0.05 2023-09-30 01:29:42,043 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=553146.6666666666, ans=0.0 2023-09-30 01:29:44,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:29:44,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:29:46,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-30 01:29:46,929 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:29:49,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:29:49,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-30 01:29:53,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:29:53,020 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-30 01:29:54,958 INFO [train.py:1039] (2/4) Epoch 16, batch 3300, loss[loss=0.1977, simple_loss=0.267, pruned_loss=0.06414, over 23909.00 frames. ], tot_loss[loss=0.184, simple_loss=0.2571, pruned_loss=0.05545, over 4695501.03 frames. ], batch size: 195, lr: 6.44e-03, grad_scale: 16.0 2023-09-30 01:29:55,161 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-30 01:29:56,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-30 01:29:56,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:30:01,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:30:02,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:30:02,827 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:30:05,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 01:30:05,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 01:30:07,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:30:08,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:30:11,919 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.919e+02 2.093e+02 2.361e+02 4.091e+02, threshold=4.187e+02, percent-clipped=0.0 2023-09-30 01:30:12,164 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-30 01:30:13,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:30:13,551 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:30:16,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:30:16,562 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-30 01:30:18,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:30:19,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 01:30:19,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 01:30:19,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:30:19,622 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-30 01:30:19,846 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=553280.0, ans=0.95 2023-09-30 01:30:25,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:30:25,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-30 01:30:28,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:30:28,441 WARNING [train.py:1197] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-30 01:30:30,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-30 01:30:30,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:30:32,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:30:33,672 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-30 01:30:35,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-30 01:30:35,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:30:39,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-30 01:30:39,960 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.86 vs. limit=12.0 2023-09-30 01:30:40,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:30:41,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=553346.6666666666, ans=0.2 2023-09-30 01:30:41,529 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.89 vs. limit=15.0 2023-09-30 01:30:44,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-30 01:30:45,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:30:49,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:30:50,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:30:50,445 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:30:50,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:30:53,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:30:53,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:30:53,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:30:55,196 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-30 01:30:56,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-30 01:30:59,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-30 01:31:00,465 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:31:00,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:31:03,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:31:03,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:31:03,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 01:31:05,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:31:05,040 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-30 01:31:06,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:31:08,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 01:31:10,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-30 01:31:12,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:31:14,103 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:31:15,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:31:15,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:31:15,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:31:19,228 INFO [train.py:1039] (2/4) Epoch 16, batch 3350, loss[loss=0.1843, simple_loss=0.2564, pruned_loss=0.05609, over 23505.00 frames. ], tot_loss[loss=0.1855, simple_loss=0.2586, pruned_loss=0.05618, over 4694677.67 frames. ], batch size: 134, lr: 6.43e-03, grad_scale: 16.0 2023-09-30 01:31:19,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:31:19,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:31:21,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:31:22,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:31:22,919 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=553546.6666666666, ans=0.1 2023-09-30 01:31:24,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:31:27,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:31:28,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:31:30,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:31:31,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:31:33,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-30 01:31:34,748 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-30 01:31:34,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:31:37,223 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=553613.3333333334, ans=0.0 2023-09-30 01:31:38,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-30 01:31:38,489 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-30 01:31:39,943 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 01:31:39,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:31:40,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:31:42,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-30 01:31:42,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:31:42,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:31:45,132 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:31:47,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:31:47,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:31:49,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:31:50,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:31:55,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:31:55,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:31:58,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:31:59,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:32:01,452 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:32:01,467 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:32:03,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:32:06,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-30 01:32:06,074 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 01:32:07,470 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-30 01:32:07,539 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:32:09,090 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-30 01:32:10,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:32:12,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:32:20,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:32:22,169 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-30 01:32:23,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 01:32:25,028 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:32:25,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:32:31,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:32:34,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-30 01:32:34,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 01:32:34,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:32:37,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:32:38,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-30 01:32:38,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:32:38,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-30 01:32:40,373 INFO [train.py:1039] (2/4) Epoch 16, batch 3400, loss[loss=0.2029, simple_loss=0.2824, pruned_loss=0.06169, over 23274.00 frames. ], tot_loss[loss=0.1874, simple_loss=0.2601, pruned_loss=0.0573, over 4682586.48 frames. ], batch size: 93, lr: 6.43e-03, grad_scale: 16.0 2023-09-30 01:32:40,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:32:40,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:32:41,979 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-30 01:32:42,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:32:42,171 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-30 01:32:46,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-30 01:32:46,917 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-30 01:32:46,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:32:52,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:32:53,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 01:32:53,335 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=553880.0, ans=0.0 2023-09-30 01:32:54,476 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:32:56,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:32:58,071 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.461e+02 1.849e+02 2.111e+02 2.348e+02 3.492e+02, threshold=4.221e+02, percent-clipped=0.0 2023-09-30 01:33:01,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:33:04,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-30 01:33:10,349 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:33:10,716 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=553946.6666666666, ans=0.125 2023-09-30 01:33:11,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:33:12,005 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:33:13,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-30 01:33:19,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:33:25,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-30 01:33:32,382 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:33:32,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:33:33,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-30 01:33:33,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:33:35,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:33:36,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:33:36,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:33:40,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:33:40,934 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=554080.0, ans=0.125 2023-09-30 01:33:43,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 01:33:43,685 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:33:44,383 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=11.15 vs. limit=15.0 2023-09-30 01:33:51,027 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:33:52,009 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.95 vs. limit=15.0 2023-09-30 01:33:52,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-30 01:33:56,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 01:34:01,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-30 01:34:03,390 INFO [train.py:1039] (2/4) Epoch 16, batch 3450, loss[loss=0.1889, simple_loss=0.2506, pruned_loss=0.06366, over 23364.00 frames. ], tot_loss[loss=0.1866, simple_loss=0.2595, pruned_loss=0.05687, over 4687903.83 frames. ], batch size: 285, lr: 6.43e-03, grad_scale: 16.0 2023-09-30 01:34:07,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-30 01:34:07,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:34:09,287 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:34:09,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-30 01:34:10,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:34:13,970 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=554213.3333333334, ans=0.125 2023-09-30 01:34:15,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-30 01:34:21,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:34:21,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:34:21,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:34:21,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:34:24,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:34:29,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-30 01:34:36,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-30 01:34:36,066 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 01:34:36,125 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:34:38,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:34:42,103 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.67 vs. limit=15.0 2023-09-30 01:34:44,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-30 01:34:44,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:34:48,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:34:49,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:34:50,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-30 01:34:52,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:34:52,277 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.max_positive, batch_count=554413.3333333334, ans=0.95 2023-09-30 01:34:53,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-30 01:34:53,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:34:55,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:34:59,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:35:02,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-30 01:35:05,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:35:11,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:35:13,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:35:17,536 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:35:22,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:35:22,269 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:35:22,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:35:22,433 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:35:25,298 INFO [train.py:1039] (2/4) Epoch 16, batch 3500, loss[loss=0.1697, simple_loss=0.2102, pruned_loss=0.06457, over 19165.00 frames. ], tot_loss[loss=0.1858, simple_loss=0.2586, pruned_loss=0.05651, over 4693743.79 frames. ], batch size: 389, lr: 6.43e-03, grad_scale: 8.0 2023-09-30 01:35:27,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:35:30,274 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:35:30,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-30 01:35:34,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 01:35:36,528 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-30 01:35:39,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:35:39,578 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-30 01:35:43,268 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.500e+02 1.893e+02 2.078e+02 2.352e+02 3.454e+02, threshold=4.155e+02, percent-clipped=0.0 2023-09-30 01:35:45,657 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:35:47,718 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:35:47,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:35:47,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:35:49,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-30 01:35:49,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:35:51,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:35:51,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-30 01:35:53,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:35:53,109 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-30 01:35:56,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:36:00,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:36:00,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-30 01:36:00,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:36:04,023 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:36:05,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:36:05,771 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:36:08,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:36:08,717 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:36:11,687 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-30 01:36:13,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-30 01:36:13,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-30 01:36:13,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:36:16,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:36:16,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:36:17,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:36:22,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 01:36:22,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:36:23,151 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 01:36:26,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:36:28,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-30 01:36:28,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-30 01:36:28,158 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:36:31,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:36:32,764 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:36:34,373 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:36:37,449 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-30 01:36:37,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:36:40,479 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:36:41,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-30 01:36:43,608 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-30 01:36:45,832 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.01 vs. limit=10.0 2023-09-30 01:36:46,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:36:48,096 INFO [train.py:1039] (2/4) Epoch 16, batch 3550, loss[loss=0.1695, simple_loss=0.2587, pruned_loss=0.04017, over 24286.00 frames. ], tot_loss[loss=0.1843, simple_loss=0.2569, pruned_loss=0.05587, over 4692671.30 frames. ], batch size: 74, lr: 6.43e-03, grad_scale: 8.0 2023-09-30 01:36:48,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:36:48,248 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:36:48,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:36:48,618 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=554880.0, ans=0.125 2023-09-30 01:36:52,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:36:59,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:37:01,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 01:37:01,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:37:03,153 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:37:03,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:37:04,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:37:04,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 01:37:09,448 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:37:09,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:37:10,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:37:10,969 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-30 01:37:12,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 01:37:14,914 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.62 vs. limit=22.5 2023-09-30 01:37:18,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:37:18,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:37:20,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:37:20,141 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:37:21,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:37:21,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-30 01:37:22,387 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:37:22,727 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=555013.3333333334, ans=0.1 2023-09-30 01:37:23,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:37:24,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 01:37:31,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:37:33,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:37:33,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:37:35,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-30 01:37:36,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:37:38,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-30 01:37:39,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:37:41,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:37:41,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:37:41,656 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=555080.0, ans=0.2 2023-09-30 01:37:43,126 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=555080.0, ans=0.07 2023-09-30 01:37:44,596 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-30 01:37:44,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:37:50,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:37:52,424 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-30 01:37:52,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:37:59,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:37:59,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-30 01:38:08,692 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-30 01:38:10,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:38:10,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:38:11,647 INFO [train.py:1039] (2/4) Epoch 16, batch 3600, loss[loss=0.1779, simple_loss=0.2591, pruned_loss=0.04831, over 24644.00 frames. ], tot_loss[loss=0.1844, simple_loss=0.2574, pruned_loss=0.05567, over 4696158.34 frames. ], batch size: 68, lr: 6.42e-03, grad_scale: 16.0 2023-09-30 01:38:11,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:38:13,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:38:14,121 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.06 vs. limit=12.0 2023-09-30 01:38:14,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:38:16,992 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=555213.3333333334, ans=0.0 2023-09-30 01:38:18,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:38:19,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:38:21,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:38:22,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:38:22,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:38:22,767 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-30 01:38:25,907 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 01:38:27,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:38:27,683 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=555280.0, ans=0.0 2023-09-30 01:38:27,711 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=555280.0, ans=0.0 2023-09-30 01:38:28,709 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.483e+02 2.004e+02 2.343e+02 2.780e+02 3.954e+02, threshold=4.687e+02, percent-clipped=0.0 2023-09-30 01:38:31,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:38:35,539 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:38:37,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:38:39,146 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:38:39,189 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-30 01:38:39,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:38:42,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:38:42,489 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=555346.6666666666, ans=0.2 2023-09-30 01:38:43,758 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:38:43,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:38:47,101 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:38:47,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:38:48,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-30 01:38:53,492 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=555346.6666666666, ans=0.125 2023-09-30 01:38:54,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:38:56,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 01:38:56,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-30 01:39:01,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:39:03,062 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=555413.3333333334, ans=0.125 2023-09-30 01:39:06,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:39:09,108 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:39:15,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:39:15,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:39:15,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-30 01:39:17,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-30 01:39:17,560 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-30 01:39:20,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:39:20,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:39:22,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-30 01:39:23,469 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:39:23,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:39:23,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:39:25,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-30 01:39:26,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-30 01:39:28,280 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.min_positive, batch_count=555480.0, ans=0.05 2023-09-30 01:39:28,291 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=555480.0, ans=0.2 2023-09-30 01:39:29,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:39:30,989 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-30 01:39:32,460 INFO [train.py:1039] (2/4) Epoch 16, batch 3650, loss[loss=0.1755, simple_loss=0.2501, pruned_loss=0.05049, over 23345.00 frames. ], tot_loss[loss=0.185, simple_loss=0.2584, pruned_loss=0.05578, over 4704794.48 frames. ], batch size: 119, lr: 6.42e-03, grad_scale: 16.0 2023-09-30 01:39:36,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-30 01:39:38,592 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:39:45,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-30 01:39:46,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-30 01:39:49,787 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:39:49,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:39:49,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 01:39:55,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-30 01:39:55,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:39:57,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-30 01:39:57,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-30 01:39:57,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:39:57,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-30 01:39:59,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 01:40:00,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:40:00,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:40:00,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:40:03,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-30 01:40:06,884 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-30 01:40:06,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:40:08,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-30 01:40:10,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:40:10,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:40:12,671 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=9.19 vs. limit=15.0 2023-09-30 01:40:14,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:40:17,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:40:17,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-30 01:40:19,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:40:20,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:40:20,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:40:25,616 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:40:27,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:40:27,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:40:28,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 01:40:28,894 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:40:30,336 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:40:31,081 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.32 vs. limit=12.0 2023-09-30 01:40:36,428 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-30 01:40:36,585 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=555813.3333333334, ans=0.125 2023-09-30 01:40:39,759 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:40:41,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:40:41,256 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-30 01:40:42,293 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.25 vs. limit=22.5 2023-09-30 01:40:42,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:40:42,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:40:44,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:40:44,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-30 01:40:46,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:40:48,432 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=555813.3333333334, ans=0.0 2023-09-30 01:40:49,741 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 01:40:53,284 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:40:53,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:40:54,843 INFO [train.py:1039] (2/4) Epoch 16, batch 3700, loss[loss=0.1912, simple_loss=0.2666, pruned_loss=0.0579, over 23352.00 frames. ], tot_loss[loss=0.1858, simple_loss=0.2596, pruned_loss=0.05606, over 4709692.62 frames. ], batch size: 93, lr: 6.42e-03, grad_scale: 16.0 2023-09-30 01:40:56,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:40:56,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-30 01:40:56,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:40:58,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 01:40:59,674 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 01:41:01,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 01:41:05,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:41:05,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:41:07,480 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=555880.0, ans=0.125 2023-09-30 01:41:08,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:41:08,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:41:08,956 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 01:41:11,794 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.470e+02 1.928e+02 2.143e+02 2.451e+02 3.411e+02, threshold=4.285e+02, percent-clipped=0.0 2023-09-30 01:41:12,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:41:15,005 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-30 01:41:21,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:41:23,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 01:41:23,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 01:41:23,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-30 01:41:23,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:41:28,365 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=556013.3333333334, ans=0.125 2023-09-30 01:41:30,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:41:31,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-30 01:41:32,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:41:34,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:41:37,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:41:37,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 01:41:39,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 01:41:43,949 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:41:43,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-30 01:41:44,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:41:45,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-30 01:41:48,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:41:48,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:41:51,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:41:53,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-30 01:41:54,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:41:54,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-30 01:41:54,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:41:55,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:41:58,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:42:01,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-30 01:42:02,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-30 01:42:04,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:42:04,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:42:05,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:42:07,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:42:10,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:42:11,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:42:11,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:42:13,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-30 01:42:16,453 INFO [train.py:1039] (2/4) Epoch 16, batch 3750, loss[loss=0.179, simple_loss=0.2742, pruned_loss=0.04188, over 24560.00 frames. ], tot_loss[loss=0.1865, simple_loss=0.2607, pruned_loss=0.05617, over 4723570.79 frames. ], batch size: 71, lr: 6.42e-03, grad_scale: 16.0 2023-09-30 01:42:16,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 01:42:19,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-30 01:42:19,929 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=556213.3333333334, ans=0.0 2023-09-30 01:42:21,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-30 01:42:21,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:42:22,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:42:23,037 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=556213.3333333334, ans=0.0 2023-09-30 01:42:24,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:42:24,963 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.52 vs. limit=15.0 2023-09-30 01:42:25,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:42:26,482 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.70 vs. limit=15.0 2023-09-30 01:42:30,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:42:33,053 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=5.97 vs. limit=15.0 2023-09-30 01:42:36,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:42:36,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 01:42:37,949 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=556280.0, ans=0.125 2023-09-30 01:42:39,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:42:44,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:42:44,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-30 01:42:46,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:42:47,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:42:49,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:42:52,463 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=556346.6666666666, ans=0.0 2023-09-30 01:42:53,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-30 01:42:56,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-30 01:42:59,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:43:00,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:43:00,147 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=556346.6666666666, ans=0.125 2023-09-30 01:43:01,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:43:06,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:43:08,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-30 01:43:13,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-30 01:43:16,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:43:19,283 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=556413.3333333334, ans=0.125 2023-09-30 01:43:20,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:43:21,369 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.70 vs. limit=22.5 2023-09-30 01:43:22,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:43:25,246 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 01:43:28,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 01:43:30,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-30 01:43:31,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 01:43:34,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:43:36,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:43:37,528 INFO [train.py:1039] (2/4) Epoch 16, batch 3800, loss[loss=0.1509, simple_loss=0.2267, pruned_loss=0.03754, over 24285.00 frames. ], tot_loss[loss=0.187, simple_loss=0.2611, pruned_loss=0.05642, over 4729671.25 frames. ], batch size: 56, lr: 6.42e-03, grad_scale: 16.0 2023-09-30 01:43:41,113 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=556546.6666666666, ans=0.125 2023-09-30 01:43:43,900 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:43:49,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:43:49,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 01:43:50,627 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-30 01:43:50,942 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=556546.6666666666, ans=0.2 2023-09-30 01:43:53,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:43:53,709 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:43:55,155 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-30 01:43:56,511 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.836e+02 2.013e+02 2.222e+02 3.108e+02, threshold=4.026e+02, percent-clipped=0.0 2023-09-30 01:43:56,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 01:43:56,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:43:58,188 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:43:58,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:43:59,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:43:59,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:44:01,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-30 01:44:04,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-30 01:44:06,100 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:44:07,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:44:10,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:44:12,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 01:44:12,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-30 01:44:12,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:44:14,449 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=556680.0, ans=0.0 2023-09-30 01:44:17,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:44:17,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:44:19,693 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=556680.0, ans=0.125 2023-09-30 01:44:23,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 01:44:23,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-30 01:44:25,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:44:32,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:44:36,400 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=556746.6666666666, ans=0.125 2023-09-30 01:44:39,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:44:40,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-30 01:44:43,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-30 01:44:45,078 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:44:45,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:44:46,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:44:48,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-30 01:44:53,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-30 01:44:53,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-30 01:44:53,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:44:55,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:45:01,140 INFO [train.py:1039] (2/4) Epoch 16, batch 3850, loss[loss=0.184, simple_loss=0.2622, pruned_loss=0.05294, over 24391.00 frames. ], tot_loss[loss=0.1861, simple_loss=0.2603, pruned_loss=0.05592, over 4731426.64 frames. ], batch size: 77, lr: 6.41e-03, grad_scale: 16.0 2023-09-30 01:45:02,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:45:02,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 01:45:08,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:45:10,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-30 01:45:10,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 01:45:12,040 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:45:15,119 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 01:45:18,145 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:45:19,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-30 01:45:21,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-30 01:45:28,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:45:30,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:45:30,410 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=556946.6666666666, ans=0.2 2023-09-30 01:45:30,438 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=556946.6666666666, ans=0.125 2023-09-30 01:45:34,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:45:35,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 01:45:38,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:45:40,182 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:45:40,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:45:40,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:45:41,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:45:43,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:45:43,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:45:44,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:45:44,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-30 01:45:45,029 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-30 01:45:46,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:45:46,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:45:49,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:45:49,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:45:49,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-30 01:45:52,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-30 01:45:54,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:45:56,209 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-30 01:45:59,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-30 01:46:05,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:46:07,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:46:10,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:46:10,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-30 01:46:12,324 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=557146.6666666666, ans=0.2 2023-09-30 01:46:13,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-30 01:46:15,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:46:16,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:46:19,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 01:46:19,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:46:19,661 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:46:21,131 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:46:21,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:46:21,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-30 01:46:22,638 INFO [train.py:1039] (2/4) Epoch 16, batch 3900, loss[loss=0.1809, simple_loss=0.2505, pruned_loss=0.05568, over 24432.00 frames. ], tot_loss[loss=0.1844, simple_loss=0.2586, pruned_loss=0.05511, over 4730569.82 frames. ], batch size: 58, lr: 6.41e-03, grad_scale: 16.0 2023-09-30 01:46:22,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:46:24,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-30 01:46:25,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:46:25,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:46:27,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:46:27,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:46:27,420 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:46:28,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:46:28,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:46:29,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:46:30,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-30 01:46:30,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:46:31,030 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.68 vs. limit=22.5 2023-09-30 01:46:34,249 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:46:36,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 01:46:38,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:46:38,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:46:42,008 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.908e+02 2.170e+02 2.558e+02 5.090e+02, threshold=4.341e+02, percent-clipped=1.0 2023-09-30 01:46:42,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 01:46:42,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:46:42,348 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-30 01:46:45,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-30 01:46:45,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:46:46,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-30 01:46:46,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:46:48,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-30 01:46:48,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-30 01:46:53,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:46:53,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:46:53,850 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 01:46:55,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:46:58,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:47:01,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:47:03,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:47:03,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:47:05,375 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:47:12,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:47:12,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:47:19,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 01:47:21,145 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:47:27,618 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=557413.3333333334, ans=0.0 2023-09-30 01:47:33,218 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:47:33,664 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=557480.0, ans=0.0 2023-09-30 01:47:34,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:47:34,998 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=557480.0, ans=0.125 2023-09-30 01:47:36,304 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-30 01:47:36,362 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-30 01:47:36,423 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=557480.0, ans=0.125 2023-09-30 01:47:37,751 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:47:40,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-30 01:47:42,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:47:43,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-30 01:47:46,625 INFO [train.py:1039] (2/4) Epoch 16, batch 3950, loss[loss=0.1711, simple_loss=0.2381, pruned_loss=0.05207, over 23345.00 frames. ], tot_loss[loss=0.1843, simple_loss=0.2587, pruned_loss=0.055, over 4736566.59 frames. ], batch size: 120, lr: 6.41e-03, grad_scale: 16.0 2023-09-30 01:47:52,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:47:53,698 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-30 01:47:54,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:47:58,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:48:00,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:48:05,147 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-30 01:48:06,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 01:48:06,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-30 01:48:06,781 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-30 01:48:08,247 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:48:09,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:48:09,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-30 01:48:09,926 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:48:11,708 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-30 01:48:13,836 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.whiten.whitening_limit, batch_count=557613.3333333334, ans=12.0 2023-09-30 01:48:14,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:48:16,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 01:48:16,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 01:48:17,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 01:48:17,239 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=557613.3333333334, ans=0.125 2023-09-30 01:48:18,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:48:20,304 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=557680.0, ans=0.0 2023-09-30 01:48:26,899 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=557680.0, ans=0.05 2023-09-30 01:48:30,017 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=557680.0, ans=0.1 2023-09-30 01:48:31,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:48:31,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:48:35,913 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten.whitening_limit, batch_count=557746.6666666666, ans=22.5 2023-09-30 01:48:36,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-30 01:48:43,011 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-30 01:48:43,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-30 01:48:43,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:48:46,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:48:53,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:48:53,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-30 01:48:53,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:48:53,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:48:55,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-30 01:49:00,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:49:00,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:49:07,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-30 01:49:10,292 INFO [train.py:1039] (2/4) Epoch 16, batch 4000, loss[loss=0.1909, simple_loss=0.274, pruned_loss=0.05392, over 24041.00 frames. ], tot_loss[loss=0.1845, simple_loss=0.2592, pruned_loss=0.05491, over 4741923.62 frames. ], batch size: 80, lr: 6.41e-03, grad_scale: 32.0 2023-09-30 01:49:12,459 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=557880.0, ans=0.0 2023-09-30 01:49:15,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:49:21,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:49:27,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:49:27,436 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=557946.6666666666, ans=0.09899494936611666 2023-09-30 01:49:28,509 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.918e+02 2.123e+02 2.513e+02 3.159e+02, threshold=4.246e+02, percent-clipped=0.0 2023-09-30 01:49:28,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:49:28,706 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:49:28,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-30 01:49:30,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:49:31,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-30 01:49:31,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:49:31,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-30 01:49:33,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:49:37,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:49:37,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:49:37,066 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:49:37,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:49:37,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-30 01:49:40,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:49:40,184 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-30 01:49:40,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:49:42,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:49:43,868 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-30 01:49:45,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 01:49:45,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:49:50,192 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=558013.3333333334, ans=0.0 2023-09-30 01:49:51,787 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-30 01:49:53,278 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:49:55,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:49:57,471 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-30 01:49:59,023 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:49:59,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-30 01:49:59,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:50:00,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:50:02,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:50:03,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:50:03,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-30 01:50:03,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:50:06,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-30 01:50:07,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:50:10,276 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-30 01:50:13,814 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=558080.0, ans=0.1 2023-09-30 01:50:15,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 01:50:15,260 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=558146.6666666666, ans=0.0 2023-09-30 01:50:18,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 01:50:20,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:50:21,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:50:22,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:50:23,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:50:28,757 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:50:31,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-30 01:50:31,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-30 01:50:33,777 INFO [train.py:1039] (2/4) Epoch 16, batch 4050, loss[loss=0.2204, simple_loss=0.2761, pruned_loss=0.08235, over 22792.00 frames. ], tot_loss[loss=0.1848, simple_loss=0.2594, pruned_loss=0.05517, over 4742241.65 frames. ], batch size: 322, lr: 6.41e-03, grad_scale: 8.0 2023-09-30 01:50:35,366 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 01:50:35,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:50:36,902 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:50:37,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-30 01:50:37,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=558213.3333333334, ans=0.125 2023-09-30 01:50:38,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:50:43,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:50:46,701 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:50:48,164 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 01:50:49,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:50:51,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:50:53,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:50:55,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-30 01:50:59,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 01:51:03,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-30 01:51:03,262 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-30 01:51:03,577 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=558280.0, ans=0.2 2023-09-30 01:51:06,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:51:10,176 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=558346.6666666666, ans=0.125 2023-09-30 01:51:11,597 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=558346.6666666666, ans=0.0 2023-09-30 01:51:14,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-30 01:51:15,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:51:19,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:51:22,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:51:23,507 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:51:23,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:51:25,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:51:25,961 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=558413.3333333334, ans=0.125 2023-09-30 01:51:30,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-30 01:51:30,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 01:51:32,317 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:51:33,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-30 01:51:37,290 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=558480.0, ans=0.2 2023-09-30 01:51:40,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:51:46,117 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=558480.0, ans=0.0 2023-09-30 01:51:47,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-30 01:51:48,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:51:48,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 01:51:50,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-30 01:51:50,526 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-30 01:51:50,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:51:54,816 INFO [train.py:1039] (2/4) Epoch 16, batch 4100, loss[loss=0.1888, simple_loss=0.2706, pruned_loss=0.05351, over 24374.00 frames. ], tot_loss[loss=0.185, simple_loss=0.2598, pruned_loss=0.05505, over 4757906.74 frames. ], batch size: 77, lr: 6.41e-03, grad_scale: 8.0 2023-09-30 01:51:54,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:51:55,088 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:51:55,111 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:51:58,880 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=558546.6666666666, ans=0.125 2023-09-30 01:52:03,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-30 01:52:03,479 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=558546.6666666666, ans=0.1 2023-09-30 01:52:06,245 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-30 01:52:07,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-30 01:52:09,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-30 01:52:09,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:52:11,397 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:52:11,451 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:52:11,471 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:52:13,050 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-30 01:52:16,681 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.940e+02 2.145e+02 2.448e+02 4.292e+02, threshold=4.289e+02, percent-clipped=1.0 2023-09-30 01:52:16,882 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:52:18,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:52:18,389 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:52:20,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 01:52:23,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 01:52:23,649 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:52:23,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:52:25,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-30 01:52:25,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:52:25,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:52:25,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:52:25,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:52:27,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-30 01:52:30,230 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:52:33,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-30 01:52:35,068 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:52:36,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:52:36,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-30 01:52:39,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:52:39,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:52:39,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:52:41,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-30 01:52:43,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-30 01:52:43,539 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:52:46,595 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-30 01:52:46,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:52:48,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:52:50,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:52:52,473 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=558746.6666666666, ans=0.09899494936611666 2023-09-30 01:52:56,025 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:52:59,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:53:00,447 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:53:10,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:53:10,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:53:12,163 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=558813.3333333334, ans=0.1 2023-09-30 01:53:13,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:53:16,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:53:18,259 INFO [train.py:1039] (2/4) Epoch 16, batch 4150, loss[loss=0.1762, simple_loss=0.2358, pruned_loss=0.05832, over 23440.00 frames. ], tot_loss[loss=0.185, simple_loss=0.2596, pruned_loss=0.05523, over 4756162.50 frames. ], batch size: 285, lr: 6.40e-03, grad_scale: 8.0 2023-09-30 01:53:20,143 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:53:21,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:53:21,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:53:21,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:53:25,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-30 01:53:25,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:53:25,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-30 01:53:27,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-30 01:53:27,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-30 01:53:27,945 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=558880.0, ans=0.2 2023-09-30 01:53:29,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:53:33,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:53:33,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:53:37,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:53:37,440 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=558946.6666666666, ans=0.0 2023-09-30 01:53:39,254 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:53:39,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-30 01:53:41,800 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.19 vs. limit=15.0 2023-09-30 01:53:42,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 01:53:42,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:53:42,712 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-30 01:53:49,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:53:52,539 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:53:54,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-30 01:53:55,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-30 01:53:55,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:53:57,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-30 01:53:57,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:53:57,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:54:01,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:54:02,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:54:07,400 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 01:54:08,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-30 01:54:10,375 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:54:11,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:54:13,375 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-30 01:54:13,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:54:15,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-30 01:54:15,884 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=559080.0, ans=0.125 2023-09-30 01:54:19,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:54:20,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:54:22,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:54:22,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-30 01:54:22,206 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:54:22,209 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-30 01:54:23,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 01:54:26,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-30 01:54:26,929 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:54:26,936 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 01:54:26,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 01:54:28,513 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-30 01:54:28,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:54:28,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 01:54:30,008 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:54:30,355 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 01:54:32,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:54:32,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-30 01:54:32,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:54:38,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:54:40,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-30 01:54:41,593 INFO [train.py:1039] (2/4) Epoch 16, batch 4200, loss[loss=0.1757, simple_loss=0.2648, pruned_loss=0.04329, over 24459.00 frames. ], tot_loss[loss=0.1846, simple_loss=0.2586, pruned_loss=0.05523, over 4736312.48 frames. ], batch size: 69, lr: 6.40e-03, grad_scale: 8.0 2023-09-30 01:54:41,810 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:54:43,393 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:54:44,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 01:54:46,338 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:54:46,341 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:54:49,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-30 01:54:53,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-30 01:54:53,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:54:54,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:54:58,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:55:02,994 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.901e+02 2.194e+02 2.609e+02 4.040e+02, threshold=4.389e+02, percent-clipped=0.0 2023-09-30 01:55:03,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-30 01:55:03,317 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:55:04,078 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.80 vs. limit=10.0 2023-09-30 01:55:05,219 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:55:05,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-30 01:55:05,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:55:06,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:55:08,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:55:08,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 01:55:09,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:55:11,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-30 01:55:11,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:55:13,249 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=559346.6666666666, ans=10.0 2023-09-30 01:55:16,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-30 01:55:16,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:55:19,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:55:19,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:55:21,157 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=559346.6666666666, ans=0.0 2023-09-30 01:55:22,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:55:22,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-30 01:55:24,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:55:24,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:55:26,059 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=559346.6666666666, ans=0.0 2023-09-30 01:55:27,751 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=559346.6666666666, ans=0.0 2023-09-30 01:55:30,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-30 01:55:32,586 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:55:39,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:55:42,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-30 01:55:45,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:55:50,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 01:55:52,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:55:53,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-30 01:55:57,537 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-30 01:56:01,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:56:01,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:56:03,858 INFO [train.py:1039] (2/4) Epoch 16, batch 4250, loss[loss=0.1934, simple_loss=0.2698, pruned_loss=0.05846, over 24443.00 frames. ], tot_loss[loss=0.1834, simple_loss=0.2571, pruned_loss=0.05485, over 4727393.14 frames. ], batch size: 63, lr: 6.40e-03, grad_scale: 8.0 2023-09-30 01:56:04,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:56:05,930 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=559546.6666666666, ans=0.0 2023-09-30 01:56:09,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-30 01:56:09,493 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-30 01:56:09,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:56:09,831 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=559546.6666666666, ans=0.0 2023-09-30 01:56:13,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:56:14,409 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=559546.6666666666, ans=0.0 2023-09-30 01:56:15,822 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=559546.6666666666, ans=0.1 2023-09-30 01:56:17,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:56:21,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:56:21,856 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:56:23,702 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=559613.3333333334, ans=0.125 2023-09-30 01:56:24,807 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:56:24,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:56:26,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:56:27,933 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:56:28,897 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=559613.3333333334, ans=0.0 2023-09-30 01:56:28,940 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=559613.3333333334, ans=0.125 2023-09-30 01:56:30,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:56:31,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:56:33,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:56:34,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-30 01:56:40,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-30 01:56:40,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:56:42,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:56:42,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:56:43,193 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.85 vs. limit=22.5 2023-09-30 01:56:43,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-30 01:56:43,815 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:56:43,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:56:44,170 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=559680.0, ans=0.0 2023-09-30 01:56:45,470 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=559680.0, ans=0.1 2023-09-30 01:56:48,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-30 01:56:49,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:56:53,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:56:54,177 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=6.76 vs. limit=15.0 2023-09-30 01:56:54,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:56:56,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-30 01:56:56,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 01:56:56,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-30 01:56:57,931 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:57:00,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-30 01:57:02,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:57:02,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:57:04,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-30 01:57:06,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 01:57:06,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-30 01:57:09,576 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=559813.3333333334, ans=0.0 2023-09-30 01:57:10,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:57:14,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:57:16,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:57:19,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:57:21,516 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:57:21,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:57:21,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:57:21,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-30 01:57:24,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:57:26,116 INFO [train.py:1039] (2/4) Epoch 16, batch 4300, loss[loss=0.1753, simple_loss=0.2563, pruned_loss=0.04717, over 23320.00 frames. ], tot_loss[loss=0.1827, simple_loss=0.2568, pruned_loss=0.05435, over 4732490.21 frames. ], batch size: 93, lr: 6.40e-03, grad_scale: 8.0 2023-09-30 01:57:26,458 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=559880.0, ans=0.0 2023-09-30 01:57:29,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:57:30,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:57:33,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:57:42,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:57:43,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-30 01:57:43,635 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:57:45,270 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:57:45,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:57:47,239 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 1.838e+02 2.077e+02 2.423e+02 4.089e+02, threshold=4.153e+02, percent-clipped=0.0 2023-09-30 01:57:47,358 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-30 01:57:49,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 01:57:51,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:57:59,775 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-30 01:57:59,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:57:59,835 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-30 01:58:02,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 01:58:03,130 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:58:07,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:58:07,503 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:58:07,658 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 01:58:09,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:58:09,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:58:09,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-30 01:58:10,909 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-30 01:58:12,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:58:16,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:58:16,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 01:58:17,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:58:17,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:58:17,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-30 01:58:17,106 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-30 01:58:18,596 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-30 01:58:20,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:58:20,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-30 01:58:20,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-30 01:58:22,633 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=560080.0, ans=0.1 2023-09-30 01:58:24,451 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.81 vs. limit=15.0 2023-09-30 01:58:25,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:58:27,415 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-30 01:58:27,523 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:58:29,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:58:29,605 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:58:32,620 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-30 01:58:33,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:58:33,957 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:58:35,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:58:35,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:58:36,990 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:58:38,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:58:40,292 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=560146.6666666666, ans=0.125 2023-09-30 01:58:41,121 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=13.64 vs. limit=22.5 2023-09-30 01:58:41,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:58:41,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:58:43,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:58:45,086 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=560146.6666666666, ans=0.1 2023-09-30 01:58:47,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-30 01:58:49,234 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-30 01:58:50,639 INFO [train.py:1039] (2/4) Epoch 16, batch 4350, loss[loss=0.1814, simple_loss=0.2588, pruned_loss=0.05203, over 24474.00 frames. ], tot_loss[loss=0.1833, simple_loss=0.2577, pruned_loss=0.05448, over 4741712.99 frames. ], batch size: 58, lr: 6.40e-03, grad_scale: 8.0 2023-09-30 01:58:54,452 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:58:56,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:59:00,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:59:00,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:59:06,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 01:59:07,167 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=560280.0, ans=0.125 2023-09-30 01:59:10,142 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:59:13,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:59:13,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:59:16,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:59:20,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:59:22,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-30 01:59:25,648 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=560346.6666666666, ans=0.1 2023-09-30 01:59:27,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-30 01:59:27,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:59:28,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:59:31,269 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2.whitening_limit, batch_count=560346.6666666666, ans=15.0 2023-09-30 01:59:36,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:59:38,490 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=560346.6666666666, ans=0.0 2023-09-30 01:59:39,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-30 01:59:41,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:59:43,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 01:59:48,088 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-30 01:59:51,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:59:51,185 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:59:52,777 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-30 01:59:52,896 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-30 01:59:52,906 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:59:52,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:59:54,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:59:55,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:59:56,070 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:59:56,144 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:59:56,331 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=560480.0, ans=0.1 2023-09-30 01:59:59,320 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-30 01:59:59,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:59:59,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:59:59,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:00:00,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-30 02:00:02,417 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-30 02:00:02,425 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-30 02:00:02,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-30 02:00:07,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:00:07,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:00:07,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:00:09,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:00:11,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-30 02:00:13,459 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-30 02:00:13,708 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=560546.6666666666, ans=0.0 2023-09-30 02:00:14,776 INFO [train.py:1039] (2/4) Epoch 16, batch 4400, loss[loss=0.1675, simple_loss=0.2355, pruned_loss=0.04972, over 23660.00 frames. ], tot_loss[loss=0.1836, simple_loss=0.2583, pruned_loss=0.05445, over 4748172.17 frames. ], batch size: 149, lr: 6.39e-03, grad_scale: 16.0 2023-09-30 02:00:14,840 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:00:19,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:00:19,857 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:00:22,878 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:00:24,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-30 02:00:24,443 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-30 02:00:24,516 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-30 02:00:24,559 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-30 02:00:26,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 02:00:26,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:00:27,832 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-30 02:00:29,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:00:30,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:00:30,881 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-30 02:00:35,157 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.842e+02 2.058e+02 2.254e+02 3.655e+02, threshold=4.116e+02, percent-clipped=0.0 2023-09-30 02:00:35,367 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:00:35,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-30 02:00:36,784 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-30 02:00:38,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-30 02:00:40,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-30 02:00:40,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-30 02:00:40,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:00:41,687 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:00:41,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:00:43,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:00:45,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-30 02:00:45,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-30 02:00:47,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:00:49,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:00:49,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:00:51,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:00:51,939 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=560680.0, ans=0.5 2023-09-30 02:00:53,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:00:53,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-30 02:00:53,228 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-30 02:00:57,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:01:04,051 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:01:07,027 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-30 02:01:10,167 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:01:13,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:01:16,410 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:01:16,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-30 02:01:18,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:01:18,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:01:18,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 02:01:18,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-30 02:01:24,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-30 02:01:28,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-30 02:01:29,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-30 02:01:29,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:01:29,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-30 02:01:31,481 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:01:34,507 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-30 02:01:36,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-30 02:01:37,440 INFO [train.py:1039] (2/4) Epoch 16, batch 4450, loss[loss=0.1899, simple_loss=0.2601, pruned_loss=0.05981, over 23371.00 frames. ], tot_loss[loss=0.1843, simple_loss=0.2591, pruned_loss=0.05471, over 4751193.49 frames. ], batch size: 285, lr: 6.39e-03, grad_scale: 16.0 2023-09-30 02:01:40,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:01:41,447 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=19.98 vs. limit=22.5 2023-09-30 02:01:43,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:01:43,906 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=560880.0, ans=0.2 2023-09-30 02:01:45,228 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:01:51,440 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:01:51,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:01:57,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:02:00,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:02:02,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:02:02,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:02:02,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-30 02:02:02,618 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:02:04,127 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:02:05,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:02:05,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:02:09,213 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 02:02:13,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:02:15,317 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:02:15,497 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:02:15,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:02:15,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=561013.3333333334, ans=0.04949747468305833 2023-09-30 02:02:18,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:02:23,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 02:02:24,635 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-30 02:02:24,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-30 02:02:24,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:02:25,025 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=561080.0, ans=0.125 2023-09-30 02:02:26,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:02:28,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-30 02:02:32,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-30 02:02:37,557 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:02:37,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-30 02:02:37,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:02:37,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:02:37,705 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:02:37,716 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:02:39,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:02:39,712 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=561080.0, ans=0.125 2023-09-30 02:02:43,222 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-30 02:02:44,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-30 02:02:46,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 02:02:47,835 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:02:50,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:02:51,018 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=561146.6666666666, ans=0.0 2023-09-30 02:02:52,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:02:52,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 02:02:55,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-30 02:02:58,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-30 02:03:00,036 INFO [train.py:1039] (2/4) Epoch 16, batch 4500, loss[loss=0.1871, simple_loss=0.2487, pruned_loss=0.06272, over 23444.00 frames. ], tot_loss[loss=0.1859, simple_loss=0.26, pruned_loss=0.05589, over 4720292.25 frames. ], batch size: 134, lr: 6.39e-03, grad_scale: 8.0 2023-09-30 02:03:00,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:03:04,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:03:05,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-30 02:03:05,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-30 02:03:08,162 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.80 vs. limit=8.0 2023-09-30 02:03:08,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:03:14,926 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:03:15,004 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:03:15,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 02:03:17,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:03:17,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:03:18,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:03:19,100 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=561280.0, ans=0.1 2023-09-30 02:03:23,484 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 1.923e+02 2.201e+02 2.757e+02 3.678e+02, threshold=4.403e+02, percent-clipped=0.0 2023-09-30 02:03:27,050 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=561280.0, ans=0.0 2023-09-30 02:03:29,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:03:31,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:03:33,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:03:33,362 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:03:35,574 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:03:39,456 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=561346.6666666666, ans=0.125 2023-09-30 02:03:43,676 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 02:03:47,880 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.31 vs. limit=6.0 2023-09-30 02:03:48,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:03:52,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 02:03:55,796 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:03:57,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-30 02:03:58,615 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:03:58,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:04:01,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:04:01,624 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:04:03,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:04:03,459 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-30 02:04:03,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 02:04:03,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:04:03,673 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=561413.3333333334, ans=0.1 2023-09-30 02:04:08,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:04:08,226 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:04:11,938 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:04:14,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:04:14,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:04:15,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-30 02:04:17,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-30 02:04:17,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-30 02:04:17,893 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=561480.0, ans=0.125 2023-09-30 02:04:20,151 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.35 vs. limit=15.0 2023-09-30 02:04:22,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-30 02:04:24,009 INFO [train.py:1039] (2/4) Epoch 16, batch 4550, loss[loss=0.1815, simple_loss=0.2452, pruned_loss=0.05887, over 23860.00 frames. ], tot_loss[loss=0.1851, simple_loss=0.2589, pruned_loss=0.05563, over 4709383.60 frames. ], batch size: 195, lr: 6.39e-03, grad_scale: 8.0 2023-09-30 02:04:26,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-30 02:04:27,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:04:32,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:04:32,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:04:35,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:04:38,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:04:40,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:04:40,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=561613.3333333334, ans=0.1 2023-09-30 02:04:42,581 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:04:42,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-30 02:04:42,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:04:45,404 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:04:45,471 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:04:45,698 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=561613.3333333334, ans=0.1 2023-09-30 02:04:50,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:04:54,040 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-30 02:04:54,140 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-30 02:04:55,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:04:57,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-30 02:05:00,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-30 02:05:01,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:05:04,294 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=561680.0, ans=0.125 2023-09-30 02:05:05,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-30 02:05:06,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 02:05:11,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:05:11,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:05:11,446 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-30 02:05:13,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-30 02:05:16,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:05:18,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:05:19,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:05:19,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:05:22,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-30 02:05:22,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-30 02:05:22,126 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:05:22,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-30 02:05:25,791 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-30 02:05:25,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:05:27,350 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:05:27,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:05:28,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:05:28,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:05:31,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 02:05:31,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-30 02:05:33,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=561813.3333333334, ans=0.0 2023-09-30 02:05:34,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:05:34,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 02:05:34,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-30 02:05:34,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-30 02:05:37,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-30 02:05:40,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:05:40,277 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:05:43,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:05:43,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:05:43,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-30 02:05:44,981 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=561880.0, ans=0.07 2023-09-30 02:05:46,088 INFO [train.py:1039] (2/4) Epoch 16, batch 4600, loss[loss=0.179, simple_loss=0.2657, pruned_loss=0.04618, over 24628.00 frames. ], tot_loss[loss=0.1838, simple_loss=0.2577, pruned_loss=0.05496, over 4708845.46 frames. ], batch size: 68, lr: 6.39e-03, grad_scale: 8.0 2023-09-30 02:05:46,145 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:05:47,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-30 02:05:49,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:05:51,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:05:55,181 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:05:55,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:05:55,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:05:56,770 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-30 02:05:58,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-30 02:06:02,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:06:02,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:06:02,464 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 02:06:05,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:06:09,848 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.941e+02 2.143e+02 2.418e+02 3.430e+02, threshold=4.285e+02, percent-clipped=0.0 2023-09-30 02:06:12,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-30 02:06:13,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:06:15,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:06:18,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:06:18,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:06:25,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-30 02:06:25,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 02:06:25,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:06:33,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:06:34,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:06:35,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:06:40,307 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-30 02:06:40,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-30 02:06:45,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:06:47,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:06:50,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:06:50,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 02:06:50,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:06:51,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-30 02:06:51,805 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:06:53,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:06:54,708 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:06:54,824 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:06:58,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:06:58,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-30 02:06:58,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-30 02:06:58,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-30 02:06:59,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:07:01,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:07:02,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:07:03,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:07:05,204 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=562146.6666666666, ans=0.1 2023-09-30 02:07:09,713 INFO [train.py:1039] (2/4) Epoch 16, batch 4650, loss[loss=0.1946, simple_loss=0.2636, pruned_loss=0.06277, over 23319.00 frames. ], tot_loss[loss=0.1828, simple_loss=0.2571, pruned_loss=0.0542, over 4721848.82 frames. ], batch size: 119, lr: 6.38e-03, grad_scale: 8.0 2023-09-30 02:07:13,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:07:15,212 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=562213.3333333334, ans=0.09899494936611666 2023-09-30 02:07:16,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:07:16,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:07:16,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:07:16,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:07:16,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:07:20,118 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:07:21,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-30 02:07:26,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:07:27,996 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-30 02:07:29,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:07:29,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-30 02:07:29,481 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:07:30,924 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-30 02:07:30,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-30 02:07:30,981 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:07:33,704 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:07:34,020 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=562280.0, ans=0.0 2023-09-30 02:07:36,717 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 02:07:38,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:07:38,304 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-30 02:07:41,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:07:44,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-30 02:07:46,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:07:46,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:07:48,262 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-30 02:07:48,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:07:48,812 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=562346.6666666666, ans=0.5 2023-09-30 02:07:51,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:07:55,443 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=9.92 vs. limit=22.5 2023-09-30 02:07:56,712 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:08:01,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:08:04,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:08:05,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:08:06,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:08:10,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-30 02:08:11,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-30 02:08:13,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 02:08:13,260 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-30 02:08:14,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:08:18,864 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=11.62 vs. limit=15.0 2023-09-30 02:08:21,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:08:21,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:08:23,192 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-30 02:08:23,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:08:24,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:08:24,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:08:26,193 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:08:28,255 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.23 vs. limit=15.0 2023-09-30 02:08:30,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:08:30,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:08:32,595 INFO [train.py:1039] (2/4) Epoch 16, batch 4700, loss[loss=0.1611, simple_loss=0.2436, pruned_loss=0.03925, over 24449.00 frames. ], tot_loss[loss=0.1832, simple_loss=0.2578, pruned_loss=0.05429, over 4729385.77 frames. ], batch size: 63, lr: 6.38e-03, grad_scale: 8.0 2023-09-30 02:08:32,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:08:35,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:08:35,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:08:35,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 02:08:37,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-30 02:08:38,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-30 02:08:39,007 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-30 02:08:49,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:08:50,659 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:08:50,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:08:52,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:08:53,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 02:08:55,745 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.890e+02 2.173e+02 2.583e+02 3.915e+02, threshold=4.346e+02, percent-clipped=0.0 2023-09-30 02:08:59,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-30 02:08:59,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-30 02:09:02,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:09:02,520 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:09:03,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:09:04,301 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=562680.0, ans=0.1 2023-09-30 02:09:07,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:09:15,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 02:09:16,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 02:09:17,095 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=562680.0, ans=0.125 2023-09-30 02:09:18,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:09:24,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-30 02:09:26,247 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:09:27,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:09:31,481 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.07 vs. limit=15.0 2023-09-30 02:09:34,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-30 02:09:35,819 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:09:39,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:09:39,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-30 02:09:40,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:09:40,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:09:44,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:09:44,388 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:09:44,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-30 02:09:47,318 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-30 02:09:48,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:09:51,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:09:51,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:09:51,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-30 02:09:53,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:09:54,805 INFO [train.py:1039] (2/4) Epoch 16, batch 4750, loss[loss=0.1865, simple_loss=0.2533, pruned_loss=0.05987, over 23642.00 frames. ], tot_loss[loss=0.1836, simple_loss=0.2588, pruned_loss=0.05415, over 4735602.76 frames. ], batch size: 256, lr: 6.38e-03, grad_scale: 8.0 2023-09-30 02:09:55,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-30 02:09:58,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:09:58,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:10:02,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:10:04,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:10:04,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-30 02:10:05,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:10:08,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-30 02:10:10,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:10:10,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:10:10,933 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.97 vs. limit=15.0 2023-09-30 02:10:11,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:10:18,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-30 02:10:23,131 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:10:26,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-30 02:10:27,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:10:29,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:10:29,903 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:10:29,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:10:31,420 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-30 02:10:31,425 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-30 02:10:38,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-30 02:10:39,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:10:42,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:10:44,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:10:44,435 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-30 02:10:44,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:10:47,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:10:49,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:10:50,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-30 02:10:50,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-30 02:10:53,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:10:53,067 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:10:54,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:10:54,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 02:10:56,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-30 02:10:56,447 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=563080.0, ans=0.125 2023-09-30 02:10:59,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-30 02:11:02,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:11:07,841 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:11:07,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-30 02:11:07,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:11:08,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:11:11,399 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:11:12,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:11:12,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 02:11:16,055 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:11:17,295 INFO [train.py:1039] (2/4) Epoch 16, batch 4800, loss[loss=0.1683, simple_loss=0.2515, pruned_loss=0.04253, over 24620.00 frames. ], tot_loss[loss=0.1841, simple_loss=0.2596, pruned_loss=0.05435, over 4743256.33 frames. ], batch size: 65, lr: 6.38e-03, grad_scale: 16.0 2023-09-30 02:11:17,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-30 02:11:17,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-30 02:11:18,988 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-30 02:11:22,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-30 02:11:22,092 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:11:23,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-30 02:11:25,612 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=563213.3333333334, ans=0.125 2023-09-30 02:11:28,735 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:11:28,825 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:11:33,455 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 02:11:34,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:11:35,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:11:36,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-30 02:11:38,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:11:38,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:11:40,585 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.887e+02 2.070e+02 2.375e+02 3.869e+02, threshold=4.141e+02, percent-clipped=0.0 2023-09-30 02:11:42,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-30 02:11:46,241 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:11:47,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:11:47,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:11:48,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:11:48,040 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 02:11:48,239 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=563280.0, ans=0.1 2023-09-30 02:11:49,411 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:11:50,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:11:52,995 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=563346.6666666666, ans=0.07 2023-09-30 02:11:54,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:11:55,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:11:57,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:11:57,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-30 02:11:58,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 02:11:59,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:12:02,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-30 02:12:02,548 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-30 02:12:04,045 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:12:04,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:12:05,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:12:05,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:12:05,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-30 02:12:07,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:12:07,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:12:10,681 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=563413.3333333334, ans=0.1 2023-09-30 02:12:12,613 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:12:14,634 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.84 vs. limit=15.0 2023-09-30 02:12:16,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:12:17,687 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:12:22,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-30 02:12:22,931 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:12:23,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:12:24,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:12:25,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:12:29,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:12:30,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:12:30,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:12:32,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:12:33,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:12:33,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 02:12:37,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:12:37,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:12:37,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:12:40,153 INFO [train.py:1039] (2/4) Epoch 16, batch 4850, loss[loss=0.1807, simple_loss=0.2667, pruned_loss=0.04733, over 24322.00 frames. ], tot_loss[loss=0.1853, simple_loss=0.2602, pruned_loss=0.05518, over 4727982.31 frames. ], batch size: 74, lr: 6.38e-03, grad_scale: 16.0 2023-09-30 02:12:40,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-30 02:12:41,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-30 02:12:41,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:12:41,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:12:43,365 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:12:43,367 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:12:46,431 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:12:53,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-30 02:12:55,415 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:13:00,125 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:13:01,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 02:13:01,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:13:04,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:13:05,014 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=563613.3333333334, ans=0.035 2023-09-30 02:13:06,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:13:06,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:13:06,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-30 02:13:11,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:13:14,399 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-30 02:13:14,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 02:13:16,042 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:13:16,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-30 02:13:19,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:13:19,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:13:24,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:13:24,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-30 02:13:25,386 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.23 vs. limit=10.0 2023-09-30 02:13:25,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-30 02:13:25,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 02:13:28,531 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=563746.6666666666, ans=0.125 2023-09-30 02:13:34,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:13:34,563 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-30 02:13:37,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:13:37,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:13:39,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-30 02:13:40,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-30 02:13:40,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:13:42,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-30 02:13:42,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:13:42,255 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:13:44,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-30 02:13:53,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:13:59,188 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:13:59,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:14:02,880 INFO [train.py:1039] (2/4) Epoch 16, batch 4900, loss[loss=0.179, simple_loss=0.2681, pruned_loss=0.0449, over 24716.00 frames. ], tot_loss[loss=0.1843, simple_loss=0.2588, pruned_loss=0.05483, over 4737928.95 frames. ], batch size: 73, lr: 6.38e-03, grad_scale: 16.0 2023-09-30 02:14:03,333 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=563880.0, ans=0.1 2023-09-30 02:14:05,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-30 02:14:05,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:14:06,338 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=563880.0, ans=0.0 2023-09-30 02:14:10,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:14:11,040 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.46 vs. limit=15.0 2023-09-30 02:14:12,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:14:12,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-30 02:14:15,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-30 02:14:15,550 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=563880.0, ans=10.0 2023-09-30 02:14:20,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-30 02:14:20,998 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=563946.6666666666, ans=0.05 2023-09-30 02:14:22,565 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=563946.6666666666, ans=0.125 2023-09-30 02:14:23,879 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=563946.6666666666, ans=0.0 2023-09-30 02:14:24,977 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.686e+02 1.946e+02 2.133e+02 2.467e+02 3.436e+02, threshold=4.266e+02, percent-clipped=0.0 2023-09-30 02:14:27,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-30 02:14:27,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-30 02:14:28,777 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-30 02:14:28,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:14:28,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:14:28,889 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:14:28,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:14:30,509 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-30 02:14:34,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-30 02:14:36,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 02:14:38,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:14:39,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-30 02:14:39,880 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=564013.3333333334, ans=0.0 2023-09-30 02:14:41,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:14:42,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:14:42,837 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:14:42,851 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-30 02:14:44,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:14:44,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:14:46,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-30 02:14:46,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-30 02:14:49,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-30 02:14:50,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-30 02:14:52,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:14:52,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:14:52,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:14:53,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 02:14:53,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:14:54,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-30 02:14:56,134 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:14:57,902 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=564080.0, ans=0.125 2023-09-30 02:14:59,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-30 02:15:02,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:15:03,163 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=564080.0, ans=0.2 2023-09-30 02:15:05,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-30 02:15:05,914 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.68 vs. limit=15.0 2023-09-30 02:15:06,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:15:06,527 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-30 02:15:08,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-30 02:15:13,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:15:14,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:15:16,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-30 02:15:16,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 02:15:16,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:15:19,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:15:20,786 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=564146.6666666666, ans=0.09899494936611666 2023-09-30 02:15:23,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:15:23,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-30 02:15:24,965 INFO [train.py:1039] (2/4) Epoch 16, batch 4950, loss[loss=0.1829, simple_loss=0.2506, pruned_loss=0.05765, over 23464.00 frames. ], tot_loss[loss=0.1836, simple_loss=0.2581, pruned_loss=0.05454, over 4732696.79 frames. ], batch size: 119, lr: 6.37e-03, grad_scale: 16.0 2023-09-30 02:15:25,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:15:25,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-30 02:15:26,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 02:15:30,422 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:15:30,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 02:15:35,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-30 02:15:35,641 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-30 02:15:35,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-30 02:15:37,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-30 02:15:37,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:15:37,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:15:38,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:15:38,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:15:41,046 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=564280.0, ans=0.2 2023-09-30 02:15:42,380 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:15:42,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:15:44,561 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:15:46,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:15:47,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:15:47,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:15:50,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 02:15:55,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:15:56,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:16:00,395 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:16:00,473 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:16:01,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:16:02,148 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-30 02:16:02,998 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.68 vs. limit=6.0 2023-09-30 02:16:03,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-30 02:16:03,904 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=564346.6666666666, ans=0.125 2023-09-30 02:16:06,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:16:08,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:16:08,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-30 02:16:10,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:16:10,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:16:11,944 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-30 02:16:13,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:16:17,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-30 02:16:17,717 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=564413.3333333334, ans=0.125 2023-09-30 02:16:20,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:16:21,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:16:23,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:16:23,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-30 02:16:24,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:16:26,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 02:16:30,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:16:32,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:16:32,535 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:16:32,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:16:32,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:16:34,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:16:36,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:16:37,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:16:37,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:16:39,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-30 02:16:45,984 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:16:47,452 INFO [train.py:1039] (2/4) Epoch 16, batch 5000, loss[loss=0.2008, simple_loss=0.2657, pruned_loss=0.06796, over 23544.00 frames. ], tot_loss[loss=0.1824, simple_loss=0.2574, pruned_loss=0.05372, over 4741468.93 frames. ], batch size: 149, lr: 6.37e-03, grad_scale: 16.0 2023-09-30 02:16:50,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-30 02:16:50,758 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-30 02:16:56,791 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:16:56,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:16:59,628 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-30 02:16:59,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-30 02:17:02,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:17:04,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-30 02:17:04,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:17:04,512 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=564613.3333333334, ans=0.125 2023-09-30 02:17:05,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 02:17:07,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-30 02:17:07,293 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:17:08,815 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:17:08,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-30 02:17:08,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:17:09,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:17:10,777 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.604e+02 1.922e+02 2.167e+02 2.587e+02 4.159e+02, threshold=4.333e+02, percent-clipped=0.0 2023-09-30 02:17:12,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-30 02:17:12,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-30 02:17:13,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:17:13,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-30 02:17:13,994 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 02:17:14,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:17:15,507 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 02:17:15,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-30 02:17:15,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-30 02:17:17,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-30 02:17:17,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:17:19,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:17:19,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-30 02:17:20,835 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:17:22,362 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:17:22,490 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:17:24,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-30 02:17:26,088 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-30 02:17:26,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:17:27,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:17:31,532 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-30 02:17:36,036 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:17:36,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:17:36,217 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:17:36,714 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten.whitening_limit, batch_count=564746.6666666666, ans=15.0 2023-09-30 02:17:38,054 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=564746.6666666666, ans=0.09899494936611666 2023-09-30 02:17:40,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-30 02:17:40,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:17:40,912 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:17:42,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:17:44,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-30 02:17:46,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:17:49,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:17:49,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:17:54,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-30 02:18:02,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:18:07,568 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=564813.3333333334, ans=0.1 2023-09-30 02:18:09,333 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=564880.0, ans=0.1 2023-09-30 02:18:10,334 INFO [train.py:1039] (2/4) Epoch 16, batch 5050, loss[loss=0.1982, simple_loss=0.2805, pruned_loss=0.05799, over 24369.00 frames. ], tot_loss[loss=0.1833, simple_loss=0.2582, pruned_loss=0.05418, over 4735231.20 frames. ], batch size: 77, lr: 6.37e-03, grad_scale: 8.0 2023-09-30 02:18:10,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:18:12,086 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:18:12,100 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:18:12,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:18:13,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 02:18:13,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:18:13,625 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:18:15,396 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=564880.0, ans=0.0 2023-09-30 02:18:18,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:18:18,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-30 02:18:20,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:18:23,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:18:24,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-30 02:18:24,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-30 02:18:26,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:18:26,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:18:30,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 02:18:31,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 02:18:31,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-30 02:18:41,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-30 02:18:43,043 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-30 02:18:44,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:18:44,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-30 02:18:44,698 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:18:44,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:18:46,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:18:46,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:18:46,341 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-30 02:18:47,729 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-30 02:18:47,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:18:52,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:18:54,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:18:55,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-30 02:18:57,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:19:00,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-30 02:19:01,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:19:01,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:19:04,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:19:04,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:19:05,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:19:07,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:19:08,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:19:09,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:19:09,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:19:10,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-30 02:19:11,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:19:13,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:19:17,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:19:17,146 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-30 02:19:17,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-30 02:19:18,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:19:20,100 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:19:20,149 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-30 02:19:23,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:19:23,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-30 02:19:23,127 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:19:26,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:19:26,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:19:28,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-30 02:19:28,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-30 02:19:31,359 INFO [train.py:1039] (2/4) Epoch 16, batch 5100, loss[loss=0.2065, simple_loss=0.2838, pruned_loss=0.0646, over 23574.00 frames. ], tot_loss[loss=0.184, simple_loss=0.259, pruned_loss=0.0545, over 4733829.71 frames. ], batch size: 85, lr: 6.37e-03, grad_scale: 8.0 2023-09-30 02:19:31,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:19:31,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:19:31,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:19:34,708 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-30 02:19:38,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:19:42,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-30 02:19:42,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-30 02:19:44,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:19:45,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:19:48,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:19:50,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-30 02:19:50,078 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-30 02:19:54,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:19:54,705 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:19:55,996 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.868e+02 2.082e+02 2.336e+02 3.756e+02, threshold=4.164e+02, percent-clipped=0.0 2023-09-30 02:19:57,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:20:01,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-30 02:20:01,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:20:02,131 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=565280.0, ans=0.0 2023-09-30 02:20:02,293 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=565280.0, ans=0.0 2023-09-30 02:20:04,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:20:04,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-30 02:20:08,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:20:08,245 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:20:08,523 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 02:20:09,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-30 02:20:12,027 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-30 02:20:12,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:20:14,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-30 02:20:14,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-30 02:20:14,454 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 02:20:16,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:20:25,763 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:20:27,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-30 02:20:28,043 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=7.95 vs. limit=15.0 2023-09-30 02:20:28,824 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-30 02:20:28,837 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-30 02:20:29,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-30 02:20:29,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:20:32,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-30 02:20:35,878 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-30 02:20:36,276 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=565413.3333333334, ans=0.2 2023-09-30 02:20:37,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 02:20:39,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-30 02:20:39,271 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=565480.0, ans=0.1 2023-09-30 02:20:40,608 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-30 02:20:44,960 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-30 02:20:45,034 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-30 02:20:51,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:20:51,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:20:51,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:20:53,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:20:54,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 02:20:56,280 INFO [train.py:1039] (2/4) Epoch 16, batch 5150, loss[loss=0.1666, simple_loss=0.2509, pruned_loss=0.04112, over 24515.00 frames. ], tot_loss[loss=0.1855, simple_loss=0.2596, pruned_loss=0.05569, over 4715836.51 frames. ], batch size: 66, lr: 6.37e-03, grad_scale: 8.0 2023-09-30 02:20:56,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:20:56,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-30 02:20:56,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-30 02:20:57,885 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-30 02:20:59,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:20:59,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-30 02:20:59,652 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=565546.6666666666, ans=0.125 2023-09-30 02:21:00,920 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:21:01,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 02:21:02,670 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:21:04,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:21:10,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 02:21:10,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-30 02:21:12,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:21:12,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:21:14,383 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=565613.3333333334, ans=0.0 2023-09-30 02:21:15,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-30 02:21:15,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:21:15,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:21:17,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:21:17,065 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:21:17,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-30 02:21:18,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:21:20,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:21:23,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 02:21:24,654 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-30 02:21:27,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:21:32,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-30 02:21:35,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-30 02:21:38,735 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:21:45,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:21:46,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:21:50,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:21:50,086 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:21:53,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-30 02:21:57,021 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=565746.6666666666, ans=0.0 2023-09-30 02:21:58,839 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:22:00,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-30 02:22:00,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:22:02,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:22:04,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:22:06,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-30 02:22:10,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:22:10,986 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 02:22:14,056 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:22:14,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:22:14,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-30 02:22:15,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-30 02:22:15,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:22:15,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:22:19,282 INFO [train.py:1039] (2/4) Epoch 16, batch 5200, loss[loss=0.1828, simple_loss=0.2605, pruned_loss=0.05251, over 24427.00 frames. ], tot_loss[loss=0.1868, simple_loss=0.2609, pruned_loss=0.05633, over 4705965.93 frames. ], batch size: 63, lr: 6.36e-03, grad_scale: 16.0 2023-09-30 02:22:19,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:22:22,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:22:23,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:22:29,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-30 02:22:29,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:22:30,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:22:32,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:22:34,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:22:34,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:22:39,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-30 02:22:41,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 02:22:41,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:22:44,238 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.323e+02 1.894e+02 2.058e+02 2.319e+02 3.515e+02, threshold=4.116e+02, percent-clipped=0.0 2023-09-30 02:22:44,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-30 02:22:45,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-30 02:22:46,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-30 02:22:47,803 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-30 02:22:49,198 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-30 02:22:52,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-30 02:22:53,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:22:53,742 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-30 02:22:53,753 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:22:55,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:22:55,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:22:55,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-30 02:22:57,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:22:58,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:23:02,768 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-30 02:23:02,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-30 02:23:04,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-30 02:23:09,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-30 02:23:11,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 02:23:16,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:23:16,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:23:18,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-30 02:23:19,505 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:23:19,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-30 02:23:19,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:23:19,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 02:23:22,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:23:22,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:23:26,414 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.17 vs. limit=15.0 2023-09-30 02:23:27,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:23:29,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:23:29,261 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:23:34,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:23:35,750 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-30 02:23:37,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:23:37,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:23:38,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:23:40,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-30 02:23:40,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:23:42,261 INFO [train.py:1039] (2/4) Epoch 16, batch 5250, loss[loss=0.1907, simple_loss=0.2641, pruned_loss=0.05864, over 23248.00 frames. ], tot_loss[loss=0.1859, simple_loss=0.2597, pruned_loss=0.05608, over 4705611.47 frames. ], batch size: 105, lr: 6.36e-03, grad_scale: 16.0 2023-09-30 02:23:44,015 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=566213.3333333334, ans=0.2 2023-09-30 02:23:45,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:23:50,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:23:50,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:23:52,014 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:23:52,941 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.12 vs. limit=15.0 2023-09-30 02:23:58,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:23:58,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:24:01,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:24:03,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 02:24:05,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-30 02:24:05,365 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:24:08,914 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:24:33,206 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=566413.3333333334, ans=0.2 2023-09-30 02:24:37,953 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=566413.3333333334, ans=0.125 2023-09-30 02:24:50,406 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=566480.0, ans=0.125 2023-09-30 02:24:57,068 INFO [train.py:1039] (2/4) Epoch 16, batch 5300, loss[loss=0.1888, simple_loss=0.2508, pruned_loss=0.06337, over 23771.00 frames. ], tot_loss[loss=0.1843, simple_loss=0.2573, pruned_loss=0.05569, over 4692822.88 frames. ], batch size: 179, lr: 6.36e-03, grad_scale: 16.0 2023-09-30 02:25:00,276 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=566546.6666666666, ans=0.09899494936611666 2023-09-30 02:25:01,576 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=566546.6666666666, ans=0.125 2023-09-30 02:25:01,646 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=566546.6666666666, ans=0.125 2023-09-30 02:25:08,391 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=566546.6666666666, ans=0.1 2023-09-30 02:25:09,684 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=566613.3333333334, ans=0.125 2023-09-30 02:25:12,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:25:12,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-30 02:25:12,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-30 02:25:12,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:25:12,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:25:12,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:25:12,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:25:12,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:25:12,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:25:13,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:25:13,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-30 02:25:14,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:25:14,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-30 02:25:14,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-30 02:25:14,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-30 02:25:14,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-30 02:25:14,529 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-30 02:25:14,660 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-30 02:25:14,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:25:15,341 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:25:15,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:25:15,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:25:15,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:25:16,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:25:16,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:25:16,297 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:25:16,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:25:16,478 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:25:16,485 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:25:16,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:25:16,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:25:17,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-30 02:25:17,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:25:18,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:25:18,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-30 02:25:18,531 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-30 02:25:18,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:25:18,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:25:18,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-30 02:25:19,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-30 02:25:19,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-30 02:25:19,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:25:19,957 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:25:20,116 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-30 02:25:20,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-30 02:25:20,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-30 02:25:20,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:25:20,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-30 02:25:20,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-30 02:25:20,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-30 02:25:21,029 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-30 02:25:29,927 INFO [train.py:1039] (2/4) Epoch 17, batch 0, loss[loss=0.1899, simple_loss=0.271, pruned_loss=0.05445, over 23972.00 frames. ], tot_loss[loss=0.1899, simple_loss=0.271, pruned_loss=0.05445, over 23972.00 frames. ], batch size: 80, lr: 6.17e-03, grad_scale: 32.0 2023-09-30 02:25:29,927 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-30 02:25:40,609 INFO [zipformer.py:1853] (2/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([1.6742, 1.6593, 3.6325, 3.3754], device='cuda:2') 2023-09-30 02:25:43,981 INFO [train.py:1071] (2/4) Epoch 17, validation: loss=0.3013, simple_loss=0.2697, pruned_loss=0.1665, over 1125622.00 frames. 2023-09-30 02:25:43,982 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21076MB 2023-09-30 02:25:45,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-30 02:25:47,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:25:49,123 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.597e+02 1.973e+02 2.191e+02 2.524e+02 3.767e+02, threshold=4.382e+02, percent-clipped=0.0 2023-09-30 02:25:49,331 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:25:56,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:25:56,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:25:56,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:25:58,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-30 02:26:00,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-30 02:26:01,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:26:02,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:26:05,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:26:05,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:26:06,168 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=566693.3333333334, ans=0.0 2023-09-30 02:26:07,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:26:07,305 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:26:09,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-30 02:26:10,906 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:26:17,966 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=566760.0, ans=0.125 2023-09-30 02:26:19,331 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:26:19,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:26:19,596 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=566760.0, ans=0.0 2023-09-30 02:26:22,267 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-30 02:26:26,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:26:26,053 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 02:26:27,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:26:32,555 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:26:35,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:26:40,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-30 02:26:45,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-30 02:26:45,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:26:45,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:26:46,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:26:47,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:26:48,818 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=566893.3333333334, ans=0.125 2023-09-30 02:26:49,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-30 02:26:52,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:26:53,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:26:58,860 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:27:02,182 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-30 02:27:03,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:27:07,330 INFO [train.py:1039] (2/4) Epoch 17, batch 50, loss[loss=0.167, simple_loss=0.244, pruned_loss=0.04498, over 24457.00 frames. ], tot_loss[loss=0.1847, simple_loss=0.2605, pruned_loss=0.05442, over 1078572.17 frames. ], batch size: 58, lr: 6.17e-03, grad_scale: 16.0 2023-09-30 02:27:07,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:27:10,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:27:10,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-30 02:27:10,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:27:12,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:27:12,544 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=566960.0, ans=0.2 2023-09-30 02:27:13,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:27:15,314 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:27:18,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:27:22,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-30 02:27:22,049 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:27:26,235 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=567026.6666666666, ans=0.1 2023-09-30 02:27:27,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-30 02:27:30,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-30 02:27:31,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-30 02:27:34,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:27:36,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:27:36,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:27:36,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:27:36,806 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=567026.6666666666, ans=0.1 2023-09-30 02:27:38,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-30 02:27:38,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 02:27:38,505 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:27:43,063 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.73 vs. limit=15.0 2023-09-30 02:27:46,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:27:47,079 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:27:48,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 02:27:48,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-30 02:27:52,257 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 02:27:53,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 02:27:53,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-30 02:27:53,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:27:55,588 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=567160.0, ans=0.05 2023-09-30 02:27:56,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-30 02:28:05,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:28:05,147 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:28:05,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:28:06,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:28:06,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-30 02:28:09,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-30 02:28:10,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-30 02:28:10,279 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=567160.0, ans=0.1 2023-09-30 02:28:12,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:28:12,910 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-30 02:28:13,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:28:13,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:28:15,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-30 02:28:16,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-30 02:28:18,119 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-30 02:28:18,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:28:18,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:28:19,099 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.08 vs. limit=22.5 2023-09-30 02:28:19,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-30 02:28:19,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-30 02:28:19,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:28:21,553 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:28:24,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-30 02:28:24,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:28:28,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:28:29,422 INFO [train.py:1039] (2/4) Epoch 17, batch 100, loss[loss=0.1864, simple_loss=0.2579, pruned_loss=0.05742, over 23577.00 frames. ], tot_loss[loss=0.1853, simple_loss=0.2599, pruned_loss=0.0553, over 1881424.90 frames. ], batch size: 256, lr: 6.16e-03, grad_scale: 16.0 2023-09-30 02:28:31,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:28:34,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:28:36,141 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.604e+02 1.907e+02 2.184e+02 2.612e+02 4.946e+02, threshold=4.368e+02, percent-clipped=2.0 2023-09-30 02:28:36,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-30 02:28:36,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:28:39,584 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:28:39,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:28:41,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:28:41,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:28:41,125 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:28:42,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-30 02:28:44,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-30 02:28:44,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:28:45,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:28:45,883 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:28:49,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-30 02:28:51,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:28:51,465 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=567360.0, ans=0.07 2023-09-30 02:28:52,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:28:52,930 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=567360.0, ans=0.125 2023-09-30 02:28:52,965 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=567360.0, ans=0.2 2023-09-30 02:28:54,278 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-30 02:28:55,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 02:28:59,490 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-30 02:28:59,516 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-30 02:28:59,718 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:28:59,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:29:02,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-30 02:29:05,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:29:07,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:29:12,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:29:12,493 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-30 02:29:15,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-30 02:29:20,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:29:20,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:29:22,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:29:27,260 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:29:28,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:29:31,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:29:32,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:29:34,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:29:37,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:29:37,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:29:37,203 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:29:38,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-30 02:29:38,689 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-30 02:29:38,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:29:38,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:29:40,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:29:40,299 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:29:40,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 02:29:41,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 02:29:41,820 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-30 02:29:41,830 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:29:41,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:29:43,394 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:29:43,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:29:45,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:29:46,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:29:49,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:29:49,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:29:51,219 INFO [train.py:1039] (2/4) Epoch 17, batch 150, loss[loss=0.1881, simple_loss=0.2744, pruned_loss=0.0509, over 24467.00 frames. ], tot_loss[loss=0.186, simple_loss=0.26, pruned_loss=0.05598, over 2511351.34 frames. ], batch size: 66, lr: 6.16e-03, grad_scale: 8.0 2023-09-30 02:29:51,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:29:54,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:29:54,449 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=567626.6666666666, ans=0.125 2023-09-30 02:29:55,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:29:57,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:29:58,493 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:30:00,274 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 02:30:00,975 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.26 vs. limit=15.0 2023-09-30 02:30:03,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-30 02:30:03,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-30 02:30:03,443 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-30 02:30:05,201 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:30:06,562 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 02:30:06,853 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=567693.3333333334, ans=0.1 2023-09-30 02:30:08,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:30:09,682 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:30:09,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:30:09,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:30:11,214 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:30:12,769 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-30 02:30:14,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:30:21,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:30:25,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:30:25,807 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-30 02:30:29,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:30:29,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:30:29,489 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:30:33,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:30:36,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:30:36,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:30:38,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:30:38,537 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=567826.6666666666, ans=0.125 2023-09-30 02:30:38,814 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.10 vs. limit=15.0 2023-09-30 02:30:39,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-30 02:30:46,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:30:47,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:30:48,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:30:48,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:30:51,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:30:52,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 02:30:52,911 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=567826.6666666666, ans=0.1 2023-09-30 02:30:55,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:30:57,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:30:59,529 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:31:02,385 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-30 02:31:02,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-30 02:31:02,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:31:03,826 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-30 02:31:04,283 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer_ff2.min_abs, batch_count=567893.3333333334, ans=0.1 2023-09-30 02:31:06,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:31:09,398 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=567893.3333333334, ans=0.2 2023-09-30 02:31:12,017 INFO [train.py:1039] (2/4) Epoch 17, batch 200, loss[loss=0.1853, simple_loss=0.2497, pruned_loss=0.06051, over 23589.00 frames. ], tot_loss[loss=0.1867, simple_loss=0.2612, pruned_loss=0.05615, over 3006500.25 frames. ], batch size: 256, lr: 6.16e-03, grad_scale: 8.0 2023-09-30 02:31:12,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:31:12,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:31:15,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-30 02:31:16,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:31:16,987 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=567960.0, ans=0.0 2023-09-30 02:31:18,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:31:19,248 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer_ff2.min_abs, batch_count=567960.0, ans=0.1 2023-09-30 02:31:20,238 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.492e+02 1.902e+02 2.115e+02 2.489e+02 3.841e+02, threshold=4.230e+02, percent-clipped=0.0 2023-09-30 02:31:22,038 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-30 02:31:23,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-30 02:31:23,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:31:25,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:31:28,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:31:28,301 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:31:28,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:31:35,275 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=568026.6666666666, ans=0.0 2023-09-30 02:31:48,907 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=568093.3333333334, ans=0.125 2023-09-30 02:31:50,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:31:51,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:31:53,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:31:53,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:31:55,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 02:31:55,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 02:31:56,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:31:57,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 02:31:58,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:31:58,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:32:00,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-30 02:32:00,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 02:32:00,189 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:32:02,535 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.16 vs. limit=15.0 2023-09-30 02:32:03,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:32:10,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:32:15,564 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:32:15,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:32:18,606 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=568226.6666666666, ans=0.1 2023-09-30 02:32:22,886 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:32:26,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-30 02:32:26,564 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:32:27,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:32:27,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:32:29,416 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:32:30,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-30 02:32:32,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:32:32,404 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-30 02:32:35,478 INFO [train.py:1039] (2/4) Epoch 17, batch 250, loss[loss=0.2096, simple_loss=0.287, pruned_loss=0.06613, over 23934.00 frames. ], tot_loss[loss=0.1867, simple_loss=0.2609, pruned_loss=0.05626, over 3384583.24 frames. ], batch size: 86, lr: 6.16e-03, grad_scale: 8.0 2023-09-30 02:32:35,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:32:37,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:32:37,462 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=568293.3333333334, ans=0.125 2023-09-30 02:32:39,340 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:32:40,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:32:42,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:32:43,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:32:45,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:32:48,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:33:00,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:33:02,082 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:33:03,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:33:10,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-30 02:33:12,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-30 02:33:14,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:33:14,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:33:15,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 02:33:15,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:33:15,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:33:18,767 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:33:23,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-30 02:33:23,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:33:23,534 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer_ff3.min_abs, batch_count=568493.3333333334, ans=0.2 2023-09-30 02:33:24,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:33:24,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-30 02:33:24,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:33:26,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:33:27,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 02:33:27,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:33:31,577 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:33:31,758 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:33:32,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:33:35,500 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:33:40,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:33:43,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:33:50,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:33:51,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:33:54,992 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-30 02:33:57,777 INFO [train.py:1039] (2/4) Epoch 17, batch 300, loss[loss=0.1892, simple_loss=0.2641, pruned_loss=0.05722, over 23309.00 frames. ], tot_loss[loss=0.1859, simple_loss=0.2598, pruned_loss=0.056, over 3680257.60 frames. ], batch size: 105, lr: 6.16e-03, grad_scale: 8.0 2023-09-30 02:33:57,912 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:33:59,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:34:00,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-30 02:34:00,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-30 02:34:02,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:34:02,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-30 02:34:05,239 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.942e+02 2.251e+02 2.659e+02 4.378e+02, threshold=4.502e+02, percent-clipped=1.0 2023-09-30 02:34:07,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:34:07,850 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:34:10,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:34:11,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-30 02:34:14,444 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:34:14,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 02:34:14,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-30 02:34:14,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:34:14,908 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=568693.3333333334, ans=0.0 2023-09-30 02:34:17,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-30 02:34:22,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:34:24,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-30 02:34:27,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-30 02:34:27,624 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:34:29,405 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=568760.0, ans=0.09899494936611666 2023-09-30 02:34:30,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:34:33,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:34:33,666 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-30 02:34:33,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:34:35,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:34:36,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:34:37,031 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:34:42,655 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-30 02:34:42,673 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-30 02:34:44,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:34:47,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:34:47,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-30 02:34:49,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:34:54,241 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:34:59,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:35:00,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-30 02:35:03,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:35:03,968 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 02:35:06,998 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:35:08,532 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:35:08,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-30 02:35:08,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 02:35:08,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:35:10,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-30 02:35:13,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:35:13,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:35:16,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:35:16,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:35:16,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:35:20,312 INFO [train.py:1039] (2/4) Epoch 17, batch 350, loss[loss=0.1519, simple_loss=0.2288, pruned_loss=0.03748, over 22881.00 frames. ], tot_loss[loss=0.1833, simple_loss=0.2568, pruned_loss=0.05487, over 3904572.31 frames. ], batch size: 50, lr: 6.15e-03, grad_scale: 8.0 2023-09-30 02:35:22,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:35:22,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 02:35:25,628 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:35:31,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:35:35,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:35:35,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:35:38,652 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-30 02:35:41,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:35:41,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-30 02:35:42,018 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=569026.6666666666, ans=0.125 2023-09-30 02:35:43,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:35:43,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-30 02:35:44,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:35:49,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-30 02:35:52,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:35:53,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:35:54,131 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=569093.3333333334, ans=0.125 2023-09-30 02:35:55,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:35:55,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:35:55,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:35:57,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:35:57,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:35:58,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:36:00,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:36:00,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:36:06,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:36:09,021 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-30 02:36:09,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:36:10,518 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:36:11,176 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.39 vs. limit=15.0 2023-09-30 02:36:16,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-30 02:36:16,735 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:36:20,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:36:20,079 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:36:20,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:36:21,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-30 02:36:23,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:36:23,973 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-30 02:36:27,530 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-30 02:36:27,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:36:30,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:36:30,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-30 02:36:32,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:36:32,535 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=569226.6666666666, ans=0.0 2023-09-30 02:36:33,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:36:35,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:36:37,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:36:37,433 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:36:41,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:36:43,983 INFO [train.py:1039] (2/4) Epoch 17, batch 400, loss[loss=0.1907, simple_loss=0.2792, pruned_loss=0.05108, over 24541.00 frames. ], tot_loss[loss=0.1832, simple_loss=0.2572, pruned_loss=0.05456, over 4096327.24 frames. ], batch size: 71, lr: 6.15e-03, grad_scale: 16.0 2023-09-30 02:36:44,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:36:44,508 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=569293.3333333334, ans=0.125 2023-09-30 02:36:44,558 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=569293.3333333334, ans=0.0 2023-09-30 02:36:45,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:36:47,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-30 02:36:47,383 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:36:47,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:36:50,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:36:50,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:36:51,857 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.862e+02 1.986e+02 2.213e+02 4.165e+02, threshold=3.971e+02, percent-clipped=0.0 2023-09-30 02:36:55,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:36:56,005 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=569293.3333333334, ans=0.05 2023-09-30 02:36:56,032 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=569293.3333333334, ans=0.0 2023-09-30 02:36:57,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:36:58,877 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-30 02:37:00,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-30 02:37:00,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:37:02,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-30 02:37:04,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:37:05,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:37:05,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:37:05,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-30 02:37:07,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:37:07,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:37:08,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:37:08,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:37:09,035 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=569360.0, ans=0.125 2023-09-30 02:37:12,428 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-30 02:37:13,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-30 02:37:17,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:37:18,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:37:19,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-30 02:37:22,407 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-30 02:37:25,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:37:29,028 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:37:33,312 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.49 vs. limit=6.0 2023-09-30 02:37:35,362 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-30 02:37:37,802 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=569493.3333333334, ans=0.0 2023-09-30 02:37:38,894 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-30 02:37:40,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-30 02:37:40,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:37:43,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-30 02:37:43,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-30 02:37:47,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:37:48,319 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=23.08 vs. limit=22.5 2023-09-30 02:37:51,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 02:37:52,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:37:54,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:37:55,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-30 02:37:57,457 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-30 02:37:58,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-30 02:37:59,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 02:37:59,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:38:02,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-30 02:38:05,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 02:38:05,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:38:07,150 INFO [train.py:1039] (2/4) Epoch 17, batch 450, loss[loss=0.1911, simple_loss=0.2593, pruned_loss=0.06145, over 23452.00 frames. ], tot_loss[loss=0.183, simple_loss=0.2575, pruned_loss=0.05422, over 4239485.79 frames. ], batch size: 285, lr: 6.15e-03, grad_scale: 8.0 2023-09-30 02:38:07,241 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-30 02:38:07,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-30 02:38:07,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:38:07,759 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=569626.6666666666, ans=0.1 2023-09-30 02:38:09,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:38:09,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:38:09,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-30 02:38:09,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:38:10,324 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=5.49 vs. limit=12.0 2023-09-30 02:38:11,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 02:38:14,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:38:24,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:38:24,751 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:38:26,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-30 02:38:26,745 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=569693.3333333334, ans=0.125 2023-09-30 02:38:28,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-30 02:38:29,990 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 02:38:32,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-30 02:38:32,909 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=569693.3333333334, ans=0.0 2023-09-30 02:38:36,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:38:37,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:38:42,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:38:42,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:38:45,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-30 02:38:45,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-30 02:38:47,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-30 02:38:48,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:38:48,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:38:50,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:38:52,644 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-30 02:38:52,658 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-30 02:38:52,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:38:54,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:38:57,756 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-30 02:39:00,919 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-30 02:39:02,365 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:39:02,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-30 02:39:03,127 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.36 vs. limit=15.0 2023-09-30 02:39:03,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-30 02:39:05,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:39:08,499 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-30 02:39:09,881 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 02:39:11,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-30 02:39:16,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:39:16,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-30 02:39:18,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-30 02:39:18,721 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.83 vs. limit=15.0 2023-09-30 02:39:20,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:39:23,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:39:26,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:39:27,897 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:39:27,930 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-30 02:39:29,816 INFO [train.py:1039] (2/4) Epoch 17, batch 500, loss[loss=0.1732, simple_loss=0.2415, pruned_loss=0.05239, over 23624.00 frames. ], tot_loss[loss=0.1838, simple_loss=0.2584, pruned_loss=0.05463, over 4353064.27 frames. ], batch size: 149, lr: 6.15e-03, grad_scale: 8.0 2023-09-30 02:39:33,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:39:33,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:39:33,798 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:39:35,214 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-30 02:39:36,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-30 02:39:36,792 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:39:39,662 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.531e+02 1.860e+02 2.180e+02 2.486e+02 3.417e+02, threshold=4.360e+02, percent-clipped=0.0 2023-09-30 02:39:39,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 02:39:43,280 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=569960.0, ans=0.1 2023-09-30 02:39:44,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 02:39:46,152 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-30 02:39:46,557 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=570026.6666666666, ans=0.0 2023-09-30 02:39:47,829 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:39:47,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:39:47,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:39:59,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:39:59,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-30 02:40:01,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-30 02:40:01,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:40:01,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-30 02:40:01,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:40:03,953 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=570093.3333333334, ans=0.125 2023-09-30 02:40:05,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:40:05,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:40:05,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:40:05,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:40:07,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-30 02:40:10,565 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-30 02:40:13,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:40:15,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:40:16,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:40:16,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:40:17,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-30 02:40:20,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-30 02:40:24,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:40:25,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:40:31,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:40:34,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:40:36,981 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.59 vs. limit=15.0 2023-09-30 02:40:40,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:40:45,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-30 02:40:45,278 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:40:45,296 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:40:47,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-30 02:40:48,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-30 02:40:48,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:40:53,238 INFO [train.py:1039] (2/4) Epoch 17, batch 550, loss[loss=0.2062, simple_loss=0.273, pruned_loss=0.06969, over 23532.00 frames. ], tot_loss[loss=0.1862, simple_loss=0.26, pruned_loss=0.0562, over 4427194.57 frames. ], batch size: 256, lr: 6.15e-03, grad_scale: 8.0 2023-09-30 02:40:54,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-30 02:40:56,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-30 02:40:56,358 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:40:56,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-30 02:40:57,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:40:57,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:40:59,401 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:41:01,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:41:01,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:41:01,955 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=570293.3333333334, ans=0.125 2023-09-30 02:41:02,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:41:05,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:41:06,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-30 02:41:06,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:41:09,862 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=570360.0, ans=0.125 2023-09-30 02:41:11,124 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:41:11,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:41:13,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:41:14,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:41:20,294 WARNING [train.py:1197] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-30 02:41:20,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-30 02:41:23,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:41:27,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:41:27,988 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:41:29,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-30 02:41:34,590 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:41:34,608 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-30 02:41:36,077 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:41:37,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 02:41:42,595 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:41:42,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 02:41:42,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-30 02:41:44,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:41:45,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-30 02:41:47,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-30 02:41:47,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:41:47,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:41:47,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:41:47,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:41:51,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:41:51,447 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 02:41:54,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-30 02:41:57,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:41:57,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:41:59,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 02:42:00,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:42:02,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:42:02,243 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:42:03,660 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:42:03,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-30 02:42:03,879 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-30 02:42:12,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-30 02:42:14,400 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.41 vs. limit=6.0 2023-09-30 02:42:15,725 INFO [train.py:1039] (2/4) Epoch 17, batch 600, loss[loss=0.1804, simple_loss=0.2502, pruned_loss=0.05534, over 23643.00 frames. ], tot_loss[loss=0.1861, simple_loss=0.2601, pruned_loss=0.05601, over 4498242.29 frames. ], batch size: 149, lr: 6.15e-03, grad_scale: 8.0 2023-09-30 02:42:15,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-30 02:42:16,041 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:42:17,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 02:42:17,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:42:26,649 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.479e+02 1.830e+02 1.971e+02 2.265e+02 3.697e+02, threshold=3.941e+02, percent-clipped=0.0 2023-09-30 02:42:26,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:42:28,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 02:42:30,007 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-30 02:42:31,664 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-30 02:42:33,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:42:36,164 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:42:37,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-30 02:42:39,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:42:41,005 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=570693.3333333334, ans=0.125 2023-09-30 02:42:45,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-30 02:42:47,584 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=570760.0, ans=0.0 2023-09-30 02:42:48,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:42:48,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:42:48,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:42:56,829 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.03 vs. limit=15.0 2023-09-30 02:42:57,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:42:57,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:42:57,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:43:03,354 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=570760.0, ans=0.0 2023-09-30 02:43:04,703 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 02:43:06,667 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.38 vs. limit=15.0 2023-09-30 02:43:09,312 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:43:09,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:43:09,330 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:43:18,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-30 02:43:23,107 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.42 vs. limit=10.0 2023-09-30 02:43:23,135 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.59 vs. limit=15.0 2023-09-30 02:43:24,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-30 02:43:24,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:43:29,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-30 02:43:31,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:43:33,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-30 02:43:33,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:43:34,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 02:43:39,677 INFO [train.py:1039] (2/4) Epoch 17, batch 650, loss[loss=0.1644, simple_loss=0.2435, pruned_loss=0.04263, over 24480.00 frames. ], tot_loss[loss=0.1848, simple_loss=0.2586, pruned_loss=0.05553, over 4530627.42 frames. ], batch size: 63, lr: 6.14e-03, grad_scale: 8.0 2023-09-30 02:43:42,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 02:43:42,919 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-30 02:43:44,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:43:45,070 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=570960.0, ans=0.125 2023-09-30 02:43:46,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:43:49,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:43:51,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-30 02:43:52,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:43:57,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:43:57,758 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:44:01,268 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:44:06,168 WARNING [train.py:1197] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-30 02:44:07,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:44:07,832 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:44:11,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:44:11,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 02:44:14,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:44:14,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:44:14,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 02:44:16,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:44:17,757 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 02:44:17,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:44:19,401 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-30 02:44:19,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:44:19,462 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:44:24,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:44:24,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:44:26,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:44:26,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-30 02:44:27,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-30 02:44:29,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:44:29,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:44:30,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-30 02:44:30,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:44:32,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 02:44:34,391 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-30 02:44:36,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-30 02:44:37,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:44:37,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:44:37,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:44:39,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:44:40,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:44:47,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:44:47,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:44:49,323 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:44:53,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:44:53,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 02:44:53,870 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:45:01,941 INFO [train.py:1039] (2/4) Epoch 17, batch 700, loss[loss=0.1811, simple_loss=0.2662, pruned_loss=0.04797, over 24661.00 frames. ], tot_loss[loss=0.1833, simple_loss=0.2565, pruned_loss=0.055, over 4563586.36 frames. ], batch size: 68, lr: 6.14e-03, grad_scale: 8.0 2023-09-30 02:45:02,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 02:45:02,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:45:02,116 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:45:02,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:45:02,405 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=571293.3333333334, ans=0.125 2023-09-30 02:45:06,789 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-30 02:45:08,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-30 02:45:11,125 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.22 vs. limit=12.0 2023-09-30 02:45:12,194 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.891e+02 2.090e+02 2.521e+02 3.567e+02, threshold=4.179e+02, percent-clipped=0.0 2023-09-30 02:45:12,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-30 02:45:12,684 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 02:45:13,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:45:15,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:45:16,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-30 02:45:20,729 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:45:23,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:45:25,414 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=571360.0, ans=0.125 2023-09-30 02:45:26,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:45:28,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-30 02:45:28,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:45:30,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:45:33,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 02:45:33,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:45:34,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-30 02:45:38,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-30 02:45:38,751 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=571426.6666666666, ans=0.0 2023-09-30 02:45:42,111 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-30 02:45:42,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:45:45,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-30 02:45:45,931 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=571426.6666666666, ans=0.125 2023-09-30 02:45:48,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:45:50,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-30 02:45:55,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:45:55,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:45:55,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-30 02:45:58,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:46:00,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:46:03,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:46:08,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:46:09,264 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.54 vs. limit=15.0 2023-09-30 02:46:09,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-30 02:46:13,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-30 02:46:13,223 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-30 02:46:13,748 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=571560.0, ans=0.1 2023-09-30 02:46:15,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:46:19,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:46:19,372 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:46:22,859 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:46:22,870 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-30 02:46:25,728 INFO [train.py:1039] (2/4) Epoch 17, batch 750, loss[loss=0.1972, simple_loss=0.2611, pruned_loss=0.06663, over 23700.00 frames. ], tot_loss[loss=0.1825, simple_loss=0.2557, pruned_loss=0.05469, over 4591265.35 frames. ], batch size: 164, lr: 6.14e-03, grad_scale: 4.0 2023-09-30 02:46:28,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-30 02:46:28,798 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-30 02:46:28,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-30 02:46:30,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-30 02:46:30,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-30 02:46:31,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:46:32,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-30 02:46:33,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:46:35,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-30 02:46:36,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:46:38,234 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:46:38,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-30 02:46:38,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:46:42,003 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:46:42,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:46:45,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:46:48,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:46:48,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:46:49,642 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-30 02:46:50,070 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=571693.3333333334, ans=0.0 2023-09-30 02:46:51,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:46:51,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:46:53,594 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:46:57,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-30 02:46:59,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-30 02:46:59,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:47:00,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-30 02:47:00,823 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-30 02:47:00,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-30 02:47:00,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:47:00,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 02:47:01,158 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=571760.0, ans=0.1 2023-09-30 02:47:04,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:47:10,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-30 02:47:11,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:47:11,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 02:47:13,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:47:17,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:47:17,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-30 02:47:17,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:47:18,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-30 02:47:20,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:47:23,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:47:24,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-30 02:47:24,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:47:31,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:47:33,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:47:33,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:47:36,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:47:39,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-30 02:47:40,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:47:40,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:47:43,833 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:47:43,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:47:45,781 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=571893.3333333334, ans=0.0 2023-09-30 02:47:47,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:47:48,748 INFO [train.py:1039] (2/4) Epoch 17, batch 800, loss[loss=0.2114, simple_loss=0.281, pruned_loss=0.0709, over 22815.00 frames. ], tot_loss[loss=0.1837, simple_loss=0.2568, pruned_loss=0.05531, over 4614738.95 frames. ], batch size: 322, lr: 6.14e-03, grad_scale: 8.0 2023-09-30 02:47:48,798 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-30 02:47:55,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:47:55,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:47:56,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:47:56,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:47:58,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:48:00,275 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.820e+02 2.048e+02 2.346e+02 3.292e+02, threshold=4.096e+02, percent-clipped=0.0 2023-09-30 02:48:00,381 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:48:02,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:48:05,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:48:05,395 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=572026.6666666666, ans=0.95 2023-09-30 02:48:07,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:48:10,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-30 02:48:12,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:48:13,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:48:15,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:48:15,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:48:16,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-30 02:48:16,687 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:48:18,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-30 02:48:21,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:48:23,495 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.80 vs. limit=12.0 2023-09-30 02:48:24,501 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:48:26,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:48:26,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:48:28,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:48:28,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:48:32,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:48:34,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 02:48:34,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-30 02:48:36,496 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-30 02:48:36,555 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-30 02:48:36,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 02:48:36,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:48:39,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:48:39,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:48:43,720 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=572160.0, ans=0.0 2023-09-30 02:48:44,863 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-30 02:48:45,560 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.11 vs. limit=22.5 2023-09-30 02:48:46,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-30 02:48:48,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:48:51,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:48:54,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:48:57,762 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=572226.6666666666, ans=0.0 2023-09-30 02:48:58,980 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:49:00,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-30 02:49:00,599 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:49:04,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-30 02:49:09,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:49:11,215 INFO [train.py:1039] (2/4) Epoch 17, batch 850, loss[loss=0.1938, simple_loss=0.2729, pruned_loss=0.05728, over 23991.00 frames. ], tot_loss[loss=0.1833, simple_loss=0.2571, pruned_loss=0.05471, over 4649565.21 frames. ], batch size: 86, lr: 6.14e-03, grad_scale: 8.0 2023-09-30 02:49:11,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:49:12,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-30 02:49:12,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:49:14,315 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:49:14,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-30 02:49:14,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:49:14,715 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=572293.3333333334, ans=0.0 2023-09-30 02:49:16,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:49:18,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:49:19,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 02:49:21,086 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:49:23,176 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-30 02:49:24,522 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-30 02:49:24,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-30 02:49:26,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:49:26,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:49:29,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:49:29,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:49:29,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:49:29,916 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.51 vs. limit=6.0 2023-09-30 02:49:35,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:49:35,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:49:37,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-30 02:49:40,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-30 02:49:44,415 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:49:44,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-30 02:49:47,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-30 02:49:49,846 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-30 02:49:51,538 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-30 02:49:51,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:49:51,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:49:51,892 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=572426.6666666666, ans=0.125 2023-09-30 02:49:52,856 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 02:49:54,657 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:49:56,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:49:56,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-30 02:49:56,423 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=572426.6666666666, ans=0.1 2023-09-30 02:49:59,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:50:01,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:50:02,681 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:50:02,726 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-30 02:50:04,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:50:06,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-30 02:50:06,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-30 02:50:11,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:50:11,150 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:50:12,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:50:12,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:50:14,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:50:14,525 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=572493.3333333334, ans=0.125 2023-09-30 02:50:15,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:50:19,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:50:21,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-30 02:50:21,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:50:22,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-30 02:50:24,180 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=572560.0, ans=0.0 2023-09-30 02:50:29,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-30 02:50:31,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:50:32,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-30 02:50:33,898 INFO [train.py:1039] (2/4) Epoch 17, batch 900, loss[loss=0.1737, simple_loss=0.2603, pruned_loss=0.04355, over 24644.00 frames. ], tot_loss[loss=0.1855, simple_loss=0.2592, pruned_loss=0.05592, over 4655015.33 frames. ], batch size: 68, lr: 6.14e-03, grad_scale: 8.0 2023-09-30 02:50:33,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:50:34,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:50:37,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-30 02:50:43,097 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=572626.6666666666, ans=0.125 2023-09-30 02:50:44,493 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:50:45,736 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.608e+02 1.971e+02 2.244e+02 2.720e+02 3.662e+02, threshold=4.487e+02, percent-clipped=0.0 2023-09-30 02:50:47,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:50:49,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-30 02:50:52,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 02:50:52,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-30 02:50:53,696 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-30 02:50:55,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:50:55,824 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:50:55,905 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 02:50:55,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:51:02,442 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=572693.3333333334, ans=0.0 2023-09-30 02:51:05,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:51:05,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:51:07,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 02:51:11,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:51:15,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-30 02:51:18,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:51:20,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-30 02:51:22,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-30 02:51:22,381 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-30 02:51:23,983 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-30 02:51:30,785 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-30 02:51:30,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:51:32,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 02:51:35,717 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=572826.6666666666, ans=0.125 2023-09-30 02:51:37,775 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=572826.6666666666, ans=0.1 2023-09-30 02:51:40,601 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:51:40,619 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:51:42,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-30 02:51:42,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:51:45,400 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-30 02:51:45,588 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=572893.3333333334, ans=0.1 2023-09-30 02:51:46,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:51:46,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:51:49,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:51:49,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:51:53,684 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-30 02:51:55,183 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-30 02:51:55,367 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-30 02:51:55,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-30 02:51:56,796 INFO [train.py:1039] (2/4) Epoch 17, batch 950, loss[loss=0.2153, simple_loss=0.2941, pruned_loss=0.06824, over 24403.00 frames. ], tot_loss[loss=0.1867, simple_loss=0.2602, pruned_loss=0.05661, over 4658360.98 frames. ], batch size: 77, lr: 6.13e-03, grad_scale: 8.0 2023-09-30 02:51:58,480 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:52:02,179 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=572960.0, ans=0.125 2023-09-30 02:52:03,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-30 02:52:04,024 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=572960.0, ans=0.125 2023-09-30 02:52:08,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:52:10,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:52:10,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:52:10,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 02:52:12,579 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-30 02:52:18,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:52:18,506 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:52:18,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:52:18,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:52:20,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-30 02:52:22,133 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-30 02:52:22,978 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.01 vs. limit=22.5 2023-09-30 02:52:23,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:52:25,999 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=573026.6666666666, ans=0.2 2023-09-30 02:52:27,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-30 02:52:28,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:52:33,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:52:33,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:52:34,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:52:34,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-30 02:52:37,118 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 02:52:38,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:52:40,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 02:52:44,827 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:52:44,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:52:48,596 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-30 02:52:51,652 WARNING [train.py:1197] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 02:52:51,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:52:51,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:52:51,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:52:51,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:52:56,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-30 02:52:58,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:53:01,757 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:53:03,249 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:53:03,280 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-30 02:53:03,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:53:03,321 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:53:03,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-30 02:53:08,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:53:10,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:53:13,583 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=573226.6666666666, ans=0.2 2023-09-30 02:53:14,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:53:16,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-30 02:53:16,499 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-30 02:53:20,120 INFO [train.py:1039] (2/4) Epoch 17, batch 1000, loss[loss=0.1805, simple_loss=0.2679, pruned_loss=0.04652, over 24643.00 frames. ], tot_loss[loss=0.1856, simple_loss=0.2588, pruned_loss=0.05625, over 4675602.21 frames. ], batch size: 73, lr: 6.13e-03, grad_scale: 8.0 2023-09-30 02:53:20,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:53:23,661 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-30 02:53:23,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:53:28,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:53:29,185 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=573293.3333333334, ans=0.125 2023-09-30 02:53:30,732 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-30 02:53:30,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-30 02:53:32,103 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.898e+02 2.133e+02 2.497e+02 3.739e+02, threshold=4.265e+02, percent-clipped=0.0 2023-09-30 02:53:36,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:53:36,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:53:37,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:53:42,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-30 02:53:45,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-30 02:53:46,147 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=573360.0, ans=0.125 2023-09-30 02:53:47,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-30 02:53:47,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:53:51,064 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-30 02:53:52,580 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-30 02:53:53,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-30 02:53:54,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:53:55,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:53:59,056 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=573426.6666666666, ans=0.0 2023-09-30 02:54:01,266 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=573426.6666666666, ans=0.125 2023-09-30 02:54:03,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:54:05,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:54:05,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:54:06,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:54:06,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-30 02:54:06,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:54:08,500 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:54:08,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:54:08,673 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-30 02:54:12,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-30 02:54:14,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-30 02:54:17,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-30 02:54:18,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:54:27,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:54:27,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-30 02:54:27,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:54:27,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:54:27,629 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 02:54:28,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-30 02:54:30,504 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:54:30,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-30 02:54:32,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-30 02:54:33,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:54:33,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:54:37,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:54:37,794 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=573560.0, ans=0.0 2023-09-30 02:54:39,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 02:54:39,310 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:54:41,039 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=573560.0, ans=0.0 2023-09-30 02:54:44,344 INFO [train.py:1039] (2/4) Epoch 17, batch 1050, loss[loss=0.1963, simple_loss=0.2782, pruned_loss=0.05716, over 24008.00 frames. ], tot_loss[loss=0.1842, simple_loss=0.2569, pruned_loss=0.05571, over 4685114.44 frames. ], batch size: 86, lr: 6.13e-03, grad_scale: 8.0 2023-09-30 02:54:44,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:54:44,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 02:54:46,899 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=573626.6666666666, ans=0.2 2023-09-30 02:54:48,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 02:54:49,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:54:51,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:54:55,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:54:57,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-30 02:54:59,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:55:01,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:55:01,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-30 02:55:02,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:55:02,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-30 02:55:04,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:55:05,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-30 02:55:08,749 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:55:08,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-30 02:55:08,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-30 02:55:15,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:55:17,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-30 02:55:17,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:55:21,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-30 02:55:21,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-30 02:55:21,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:55:24,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-30 02:55:27,156 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=573760.0, ans=0.2 2023-09-30 02:55:28,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-30 02:55:28,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:55:29,542 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.04 vs. limit=15.0 2023-09-30 02:55:31,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 02:55:33,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-30 02:55:33,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:55:35,409 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:55:39,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:55:44,919 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-30 02:55:45,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-30 02:55:46,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-30 02:55:46,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:55:46,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 02:55:48,124 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-30 02:55:51,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:55:55,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:55:55,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:55:55,621 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.79 vs. limit=15.0 2023-09-30 02:55:56,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:55:56,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:56:00,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:56:00,239 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-30 02:56:00,502 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=573893.3333333334, ans=0.1 2023-09-30 02:56:01,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:56:01,870 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-30 02:56:01,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-30 02:56:03,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:56:06,394 INFO [train.py:1039] (2/4) Epoch 17, batch 1100, loss[loss=0.1854, simple_loss=0.2514, pruned_loss=0.05971, over 23578.00 frames. ], tot_loss[loss=0.1828, simple_loss=0.2562, pruned_loss=0.05466, over 4688943.60 frames. ], batch size: 256, lr: 6.13e-03, grad_scale: 8.0 2023-09-30 02:56:06,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:56:12,178 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=573960.0, ans=0.0 2023-09-30 02:56:13,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:56:18,389 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.918e+02 2.180e+02 2.425e+02 3.491e+02, threshold=4.360e+02, percent-clipped=0.0 2023-09-30 02:56:20,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:56:20,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 02:56:20,383 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:56:21,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-30 02:56:23,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:56:25,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:56:28,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:56:31,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:56:31,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-30 02:56:32,421 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.28 vs. limit=6.0 2023-09-30 02:56:32,524 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.74 vs. limit=6.0 2023-09-30 02:56:35,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 02:56:36,806 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:56:36,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:56:38,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:56:42,980 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:56:48,235 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:56:49,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-30 02:56:51,446 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-30 02:56:51,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:56:53,906 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=574093.3333333334, ans=0.0 2023-09-30 02:56:55,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:56:55,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-30 02:56:55,209 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:56:58,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-30 02:56:58,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:56:58,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:56:58,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:56:59,692 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:56:59,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-30 02:57:00,029 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=574160.0, ans=0.1 2023-09-30 02:57:01,571 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 02:57:06,460 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:57:06,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-30 02:57:09,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:57:14,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 02:57:18,742 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-30 02:57:18,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-30 02:57:18,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:57:20,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:57:22,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:57:22,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-30 02:57:24,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:57:24,217 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:57:27,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-30 02:57:27,590 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:57:27,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-30 02:57:28,978 INFO [train.py:1039] (2/4) Epoch 17, batch 1150, loss[loss=0.1799, simple_loss=0.2592, pruned_loss=0.0503, over 24462.00 frames. ], tot_loss[loss=0.1825, simple_loss=0.2563, pruned_loss=0.05438, over 4686071.76 frames. ], batch size: 63, lr: 6.13e-03, grad_scale: 8.0 2023-09-30 02:57:29,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:57:29,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:57:29,259 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=574293.3333333334, ans=0.125 2023-09-30 02:57:30,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-30 02:57:36,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:57:39,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:57:42,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:57:42,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:57:44,180 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-30 02:57:44,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:57:44,511 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=574360.0, ans=0.0 2023-09-30 02:57:47,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-30 02:57:48,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:57:48,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 02:57:52,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-30 02:57:55,896 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:57:59,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:58:01,000 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:58:01,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-30 02:58:01,105 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-30 02:58:01,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:58:04,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-30 02:58:04,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:58:05,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:58:06,040 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=574426.6666666666, ans=0.125 2023-09-30 02:58:17,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:58:22,595 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:58:24,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-30 02:58:24,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:58:24,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:58:30,910 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-30 02:58:33,277 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.74 vs. limit=15.0 2023-09-30 02:58:33,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:58:42,082 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-30 02:58:42,301 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=574560.0, ans=0.125 2023-09-30 02:58:46,664 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:58:48,146 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-30 02:58:48,197 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-30 02:58:48,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:58:51,675 INFO [train.py:1039] (2/4) Epoch 17, batch 1200, loss[loss=0.1679, simple_loss=0.2387, pruned_loss=0.0486, over 24338.00 frames. ], tot_loss[loss=0.1834, simple_loss=0.2578, pruned_loss=0.05454, over 4696780.72 frames. ], batch size: 56, lr: 6.12e-03, grad_scale: 16.0 2023-09-30 02:58:51,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:58:57,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:58:59,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-30 02:59:00,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:59:00,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:59:01,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:59:03,932 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.927e+02 2.114e+02 2.514e+02 4.321e+02, threshold=4.228e+02, percent-clipped=0.0 2023-09-30 02:59:04,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:59:05,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 02:59:07,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:59:07,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:59:10,624 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=574693.3333333334, ans=0.0 2023-09-30 02:59:11,710 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-30 02:59:13,909 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-30 02:59:17,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 02:59:20,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:59:21,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:59:22,478 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.15 vs. limit=10.0 2023-09-30 02:59:25,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:59:25,297 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-30 02:59:25,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:59:35,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-30 02:59:35,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:59:35,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-30 02:59:37,393 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:59:40,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-30 02:59:43,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-30 02:59:45,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:59:45,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:59:47,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:59:47,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-30 02:59:48,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:59:48,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:59:50,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:59:50,340 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-30 02:59:50,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 02:59:51,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:59:51,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 02:59:52,218 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=574826.6666666666, ans=0.2 2023-09-30 02:59:53,552 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:59:53,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:59:58,822 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-30 03:00:01,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:00:06,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-30 03:00:10,301 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-30 03:00:13,261 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:00:14,777 INFO [train.py:1039] (2/4) Epoch 17, batch 1250, loss[loss=0.2059, simple_loss=0.2693, pruned_loss=0.07122, over 23644.00 frames. ], tot_loss[loss=0.1836, simple_loss=0.2581, pruned_loss=0.0545, over 4712681.54 frames. ], batch size: 179, lr: 6.12e-03, grad_scale: 16.0 2023-09-30 03:00:14,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:00:15,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:00:16,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:00:16,815 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=574960.0, ans=0.125 2023-09-30 03:00:21,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-30 03:00:24,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:00:26,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:00:27,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-30 03:00:30,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:00:32,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 03:00:36,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 03:00:37,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:00:37,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:00:37,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:00:40,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-30 03:00:43,640 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.09 vs. limit=15.0 2023-09-30 03:00:44,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 03:00:46,319 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-30 03:00:46,328 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:00:47,961 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:00:48,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:00:51,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:00:51,469 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=575093.3333333334, ans=0.125 2023-09-30 03:00:52,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-30 03:00:57,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-30 03:00:59,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:01:02,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:01:02,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-30 03:01:02,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:01:02,480 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-30 03:01:03,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:01:03,993 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:01:07,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:01:12,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:01:12,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:01:13,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-30 03:01:15,347 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-30 03:01:15,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-30 03:01:18,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:01:21,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-30 03:01:21,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:01:23,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-30 03:01:23,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:01:24,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-30 03:01:24,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-30 03:01:24,732 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:01:24,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-30 03:01:26,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:01:27,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-30 03:01:31,563 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:01:31,867 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=575226.6666666666, ans=0.0 2023-09-30 03:01:33,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:01:34,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 03:01:37,395 INFO [train.py:1039] (2/4) Epoch 17, batch 1300, loss[loss=0.1897, simple_loss=0.2686, pruned_loss=0.05534, over 24323.00 frames. ], tot_loss[loss=0.1842, simple_loss=0.2588, pruned_loss=0.05483, over 4711043.06 frames. ], batch size: 77, lr: 6.12e-03, grad_scale: 16.0 2023-09-30 03:01:38,971 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-30 03:01:42,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:01:42,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-30 03:01:45,962 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=575293.3333333334, ans=0.1 2023-09-30 03:01:47,477 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=575293.3333333334, ans=0.0 2023-09-30 03:01:48,563 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.803e+02 1.977e+02 2.132e+02 2.913e+02, threshold=3.954e+02, percent-clipped=0.0 2023-09-30 03:01:48,725 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:01:50,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-30 03:01:50,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:01:50,919 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.51 vs. limit=22.5 2023-09-30 03:01:54,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:01:54,350 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:01:54,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-30 03:01:59,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:02:00,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-30 03:02:02,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-30 03:02:05,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 03:02:08,450 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=6.86 vs. limit=12.0 2023-09-30 03:02:09,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:02:10,751 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:02:12,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:02:15,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:02:15,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 03:02:16,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-30 03:02:16,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-30 03:02:22,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:02:22,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 03:02:24,650 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-30 03:02:26,653 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 03:02:26,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:02:29,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:02:31,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-30 03:02:31,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:02:32,828 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-30 03:02:34,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:02:37,760 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:02:37,764 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:02:41,106 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=575493.3333333334, ans=0.0 2023-09-30 03:02:43,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-30 03:02:43,113 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-30 03:02:43,477 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=575560.0, ans=0.125 2023-09-30 03:02:44,564 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-30 03:02:49,259 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:02:49,727 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=575560.0, ans=0.0 2023-09-30 03:02:52,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-30 03:02:52,973 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:03:01,427 INFO [train.py:1039] (2/4) Epoch 17, batch 1350, loss[loss=0.1721, simple_loss=0.2308, pruned_loss=0.05669, over 23439.00 frames. ], tot_loss[loss=0.1828, simple_loss=0.2574, pruned_loss=0.05407, over 4722572.25 frames. ], batch size: 285, lr: 6.12e-03, grad_scale: 16.0 2023-09-30 03:03:01,884 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=575626.6666666666, ans=0.125 2023-09-30 03:03:03,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-30 03:03:07,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:03:09,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:03:11,181 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:03:11,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:03:14,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:03:14,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-30 03:03:20,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-30 03:03:20,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-30 03:03:22,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-30 03:03:23,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:03:26,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-30 03:03:27,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:03:27,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:03:27,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-30 03:03:29,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-30 03:03:31,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-30 03:03:34,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:03:34,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-30 03:03:46,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:03:52,505 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=10.76 vs. limit=10.0 2023-09-30 03:03:53,330 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=575826.6666666666, ans=0.125 2023-09-30 03:03:56,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:03:56,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:03:58,191 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-30 03:04:01,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:04:02,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-30 03:04:02,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-30 03:04:04,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:04:06,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:04:10,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-30 03:04:11,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:04:17,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-30 03:04:19,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-30 03:04:24,889 INFO [train.py:1039] (2/4) Epoch 17, batch 1400, loss[loss=0.161, simple_loss=0.2383, pruned_loss=0.04183, over 24314.00 frames. ], tot_loss[loss=0.1814, simple_loss=0.2555, pruned_loss=0.05363, over 4703458.75 frames. ], batch size: 61, lr: 6.12e-03, grad_scale: 16.0 2023-09-30 03:04:25,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-30 03:04:26,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:04:29,719 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:04:31,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:04:33,798 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-30 03:04:35,683 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-30 03:04:36,890 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.880e+02 2.143e+02 2.482e+02 5.482e+02, threshold=4.285e+02, percent-clipped=2.0 2023-09-30 03:04:48,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:04:50,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:04:52,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:04:52,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-30 03:04:55,772 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:04:57,304 WARNING [train.py:1197] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 03:05:06,954 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:05:08,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:05:11,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-30 03:05:11,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:05:13,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:05:13,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:05:15,161 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:05:15,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:05:17,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:05:17,271 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:05:17,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-30 03:05:17,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:05:17,816 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=576160.0, ans=0.05 2023-09-30 03:05:22,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:05:27,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:05:27,421 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=576160.0, ans=0.125 2023-09-30 03:05:36,421 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-30 03:05:36,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.max_positive, batch_count=576226.6666666666, ans=0.95 2023-09-30 03:05:37,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 03:05:37,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:05:41,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 03:05:43,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:05:44,781 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:05:46,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-30 03:05:48,361 INFO [train.py:1039] (2/4) Epoch 17, batch 1450, loss[loss=0.1827, simple_loss=0.2505, pruned_loss=0.05752, over 23654.00 frames. ], tot_loss[loss=0.1809, simple_loss=0.2548, pruned_loss=0.05351, over 4704915.84 frames. ], batch size: 232, lr: 6.12e-03, grad_scale: 16.0 2023-09-30 03:05:48,638 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:05:48,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:05:50,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-30 03:05:55,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:05:56,865 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 03:05:57,605 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=6.54 vs. limit=15.0 2023-09-30 03:05:58,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:05:58,436 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-30 03:05:59,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 03:06:00,892 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=576293.3333333334, ans=0.125 2023-09-30 03:06:02,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-30 03:06:02,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:06:05,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:06:05,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-30 03:06:05,197 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:06:06,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-30 03:06:06,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 03:06:08,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:06:08,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:06:10,991 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:06:12,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:06:14,787 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=576360.0, ans=0.125 2023-09-30 03:06:17,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:06:17,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:06:19,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:06:21,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:06:22,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:06:22,649 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:06:24,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:06:24,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:06:28,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-30 03:06:31,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:06:35,199 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-30 03:06:35,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:06:36,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:06:37,221 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=576493.3333333334, ans=0.2 2023-09-30 03:06:38,513 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:06:39,384 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.79 vs. limit=15.0 2023-09-30 03:06:41,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-30 03:06:44,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:06:46,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-30 03:06:46,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-30 03:06:47,792 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:06:51,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:06:51,646 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:06:53,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-30 03:06:56,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-30 03:06:56,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-30 03:06:58,390 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:06:58,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 03:06:58,899 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=576560.0, ans=0.125 2023-09-30 03:07:11,605 INFO [train.py:1039] (2/4) Epoch 17, batch 1500, loss[loss=0.2016, simple_loss=0.2687, pruned_loss=0.06724, over 23777.00 frames. ], tot_loss[loss=0.1826, simple_loss=0.2561, pruned_loss=0.0545, over 4688291.53 frames. ], batch size: 179, lr: 6.11e-03, grad_scale: 16.0 2023-09-30 03:07:11,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-30 03:07:11,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:07:11,758 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:07:13,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:07:13,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:07:14,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:07:16,408 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-30 03:07:16,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:07:16,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-30 03:07:18,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:07:19,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:07:21,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:07:21,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:07:22,946 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.695e+02 1.876e+02 2.186e+02 2.555e+02 3.680e+02, threshold=4.372e+02, percent-clipped=0.0 2023-09-30 03:07:26,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:07:26,339 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-30 03:07:27,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-30 03:07:27,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:07:29,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:07:35,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-30 03:07:39,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-30 03:07:42,112 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:07:42,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-30 03:07:45,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-30 03:07:48,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 03:07:49,745 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:07:49,769 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:07:51,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-30 03:07:52,791 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:07:54,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:07:54,415 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-30 03:07:54,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:08:01,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:08:01,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-30 03:08:07,483 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 03:08:09,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 03:08:14,748 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-30 03:08:14,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:08:14,842 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-30 03:08:15,014 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=576826.6666666666, ans=0.0 2023-09-30 03:08:15,024 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=576826.6666666666, ans=0.125 2023-09-30 03:08:17,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:08:19,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:08:19,630 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=576893.3333333334, ans=0.2 2023-09-30 03:08:21,469 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-30 03:08:21,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-30 03:08:24,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-30 03:08:26,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:08:29,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:08:29,574 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=576893.3333333334, ans=0.125 2023-09-30 03:08:30,173 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.08 vs. limit=12.0 2023-09-30 03:08:30,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:08:30,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:08:30,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:08:32,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 03:08:34,391 INFO [train.py:1039] (2/4) Epoch 17, batch 1550, loss[loss=0.1672, simple_loss=0.25, pruned_loss=0.04218, over 24316.00 frames. ], tot_loss[loss=0.1819, simple_loss=0.2561, pruned_loss=0.05389, over 4694701.92 frames. ], batch size: 61, lr: 6.11e-03, grad_scale: 16.0 2023-09-30 03:08:34,559 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-30 03:08:36,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-30 03:08:36,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:08:36,172 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-30 03:08:37,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-30 03:08:39,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:08:40,988 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:08:42,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:08:42,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:08:43,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:08:45,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:08:49,568 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-30 03:08:49,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:08:49,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:08:51,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 03:08:51,575 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=577026.6666666666, ans=0.0 2023-09-30 03:08:53,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-30 03:08:53,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-30 03:08:53,227 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=577026.6666666666, ans=0.0 2023-09-30 03:08:56,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:08:56,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-30 03:08:58,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-30 03:08:58,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-30 03:08:58,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:08:59,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:09:02,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:09:05,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-30 03:09:05,715 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-30 03:09:10,845 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=577093.3333333334, ans=0.125 2023-09-30 03:09:16,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:09:18,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:09:19,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-30 03:09:19,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:09:19,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-30 03:09:25,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 03:09:25,645 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=577160.0, ans=0.125 2023-09-30 03:09:26,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:09:31,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:09:31,469 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer_ff3.min_abs, batch_count=577160.0, ans=0.2 2023-09-30 03:09:34,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:09:34,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:09:34,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-30 03:09:34,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 03:09:36,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:09:38,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:09:38,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-30 03:09:38,165 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-30 03:09:41,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:09:43,830 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.24 vs. limit=15.0 2023-09-30 03:09:47,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-30 03:09:52,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:09:54,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:09:54,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-30 03:09:55,669 INFO [train.py:1039] (2/4) Epoch 17, batch 1600, loss[loss=0.2005, simple_loss=0.266, pruned_loss=0.06754, over 22741.00 frames. ], tot_loss[loss=0.1827, simple_loss=0.257, pruned_loss=0.05417, over 4700420.91 frames. ], batch size: 322, lr: 6.11e-03, grad_scale: 32.0 2023-09-30 03:09:55,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 03:09:55,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:09:55,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:09:56,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:09:57,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:10:01,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:10:02,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-30 03:10:04,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-30 03:10:06,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-30 03:10:06,736 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:10:07,919 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.883e+02 2.101e+02 2.421e+02 4.828e+02, threshold=4.202e+02, percent-clipped=4.0 2023-09-30 03:10:08,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-30 03:10:08,861 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.91 vs. limit=15.0 2023-09-30 03:10:09,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:10:11,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:10:16,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:10:19,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-30 03:10:22,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:10:22,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-30 03:10:24,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:10:24,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-30 03:10:30,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=577426.6666666666, ans=0.125 2023-09-30 03:10:31,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-30 03:10:40,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:10:40,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-30 03:10:41,080 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=577426.6666666666, ans=0.125 2023-09-30 03:10:42,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:10:42,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:10:42,387 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:10:45,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-30 03:10:49,750 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.37 vs. limit=15.0 2023-09-30 03:10:50,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 03:10:53,584 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:10:53,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:10:55,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:10:55,156 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:10:58,140 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:10:58,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:11:01,165 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:11:07,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:11:09,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:11:11,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-30 03:11:11,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:11:12,947 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-30 03:11:18,129 INFO [train.py:1039] (2/4) Epoch 17, batch 1650, loss[loss=0.1596, simple_loss=0.2356, pruned_loss=0.0418, over 24545.00 frames. ], tot_loss[loss=0.1829, simple_loss=0.2576, pruned_loss=0.0541, over 4714011.92 frames. ], batch size: 60, lr: 6.11e-03, grad_scale: 32.0 2023-09-30 03:11:18,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:11:21,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:11:21,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:11:21,937 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-30 03:11:23,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-30 03:11:23,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-30 03:11:23,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-30 03:11:23,705 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=577626.6666666666, ans=0.125 2023-09-30 03:11:27,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:11:29,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:11:29,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:11:29,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-30 03:11:31,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:11:32,935 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-30 03:11:35,980 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:11:35,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:11:35,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:11:36,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:11:37,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-30 03:11:37,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-30 03:11:45,973 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 03:11:46,207 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=577693.3333333334, ans=0.125 2023-09-30 03:11:48,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-30 03:11:57,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-30 03:11:59,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:12:00,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-30 03:12:04,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:12:07,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:12:07,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:12:07,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:12:08,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:12:08,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:12:11,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:12:11,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:12:13,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:12:13,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:12:14,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:12:14,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 03:12:15,190 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=577826.6666666666, ans=0.125 2023-09-30 03:12:19,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:12:19,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-30 03:12:20,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:12:22,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-30 03:12:22,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-30 03:12:22,667 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-30 03:12:22,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:12:24,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:12:24,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:12:24,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:12:24,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-30 03:12:29,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:12:30,989 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:12:31,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:12:32,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-30 03:12:35,838 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=577893.3333333334, ans=0.125 2023-09-30 03:12:38,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:12:38,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:12:39,907 INFO [train.py:1039] (2/4) Epoch 17, batch 1700, loss[loss=0.1659, simple_loss=0.2218, pruned_loss=0.055, over 22679.00 frames. ], tot_loss[loss=0.1828, simple_loss=0.2573, pruned_loss=0.05417, over 4718346.03 frames. ], batch size: 322, lr: 6.11e-03, grad_scale: 32.0 2023-09-30 03:12:39,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-30 03:12:40,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:12:40,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:12:40,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:12:42,171 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.20 vs. limit=15.0 2023-09-30 03:12:43,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:12:43,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:12:44,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-30 03:12:48,494 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 03:12:51,380 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 1.846e+02 2.021e+02 2.222e+02 3.253e+02, threshold=4.041e+02, percent-clipped=0.0 2023-09-30 03:12:56,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:13:00,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:13:07,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-30 03:13:07,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:13:08,533 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:13:08,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:13:10,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-30 03:13:11,933 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:13:12,138 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=578093.3333333334, ans=0.125 2023-09-30 03:13:13,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:13:13,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-30 03:13:14,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-30 03:13:17,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-30 03:13:17,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-30 03:13:20,132 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:13:21,755 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=578093.3333333334, ans=0.125 2023-09-30 03:13:23,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-30 03:13:24,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:13:32,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:13:35,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:13:35,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:13:37,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-30 03:13:37,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-30 03:13:38,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:13:40,281 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:13:40,282 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-30 03:13:41,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:13:41,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:13:41,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:13:41,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:13:43,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:13:43,597 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:13:45,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:13:45,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:13:46,613 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:13:51,785 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 03:13:53,005 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:13:53,155 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-30 03:13:54,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:13:56,453 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:13:58,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-30 03:13:59,592 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=578226.6666666666, ans=0.0 2023-09-30 03:14:02,704 INFO [train.py:1039] (2/4) Epoch 17, batch 1750, loss[loss=0.18, simple_loss=0.2422, pruned_loss=0.05895, over 22812.00 frames. ], tot_loss[loss=0.1819, simple_loss=0.2563, pruned_loss=0.05369, over 4718180.09 frames. ], batch size: 322, lr: 6.11e-03, grad_scale: 32.0 2023-09-30 03:14:03,172 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=578293.3333333334, ans=0.1 2023-09-30 03:14:03,640 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.55 vs. limit=15.0 2023-09-30 03:14:04,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:14:06,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:14:06,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-30 03:14:08,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-30 03:14:09,680 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:14:13,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:14:14,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:14:17,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-30 03:14:20,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:14:24,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-30 03:14:24,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:14:27,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:14:30,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 03:14:30,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-30 03:14:31,908 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:14:31,947 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-30 03:14:42,199 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:14:43,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:14:43,923 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:14:48,357 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:14:48,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:14:50,025 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:14:51,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:14:54,478 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:14:54,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:14:56,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-30 03:14:59,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:15:02,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-30 03:15:04,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:15:04,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:15:04,686 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=578493.3333333334, ans=0.125 2023-09-30 03:15:05,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:15:09,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 03:15:09,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-30 03:15:11,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:15:13,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:15:13,708 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 03:15:17,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:15:20,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:15:21,806 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:15:23,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-30 03:15:23,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:15:24,642 INFO [train.py:1039] (2/4) Epoch 17, batch 1800, loss[loss=0.1671, simple_loss=0.2498, pruned_loss=0.04225, over 24313.00 frames. ], tot_loss[loss=0.1816, simple_loss=0.2561, pruned_loss=0.05355, over 4716449.08 frames. ], batch size: 61, lr: 6.10e-03, grad_scale: 32.0 2023-09-30 03:15:24,938 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=578626.6666666666, ans=0.125 2023-09-30 03:15:26,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-30 03:15:26,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:15:26,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:15:26,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:15:27,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-30 03:15:30,787 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:15:30,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:15:32,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 03:15:34,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:15:36,403 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.427e+02 1.847e+02 2.029e+02 2.247e+02 3.215e+02, threshold=4.058e+02, percent-clipped=0.0 2023-09-30 03:15:36,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 03:15:39,721 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:15:41,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:15:43,326 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=578693.3333333334, ans=0.125 2023-09-30 03:15:44,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:15:44,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:15:45,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:15:48,745 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:15:48,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-30 03:15:50,235 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:15:54,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:15:55,944 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.99 vs. limit=15.0 2023-09-30 03:15:57,936 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-30 03:16:01,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-30 03:16:01,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-30 03:16:01,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:16:02,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:16:02,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:16:04,206 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:16:11,398 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-30 03:16:12,904 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:16:13,875 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=4.37 vs. limit=12.0 2023-09-30 03:16:14,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:16:16,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-30 03:16:17,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-30 03:16:17,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:16:18,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:16:20,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 03:16:24,427 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=578826.6666666666, ans=0.1 2023-09-30 03:16:26,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-30 03:16:31,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:16:31,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-30 03:16:32,748 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:16:32,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:16:32,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:16:32,914 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-30 03:16:33,267 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=578893.3333333334, ans=0.0 2023-09-30 03:16:36,231 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=578893.3333333334, ans=0.125 2023-09-30 03:16:37,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:16:37,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:16:39,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-30 03:16:39,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:16:42,807 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:16:42,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-30 03:16:42,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:16:44,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:16:45,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:16:46,228 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=578960.0, ans=0.0 2023-09-30 03:16:47,412 INFO [train.py:1039] (2/4) Epoch 17, batch 1850, loss[loss=0.187, simple_loss=0.2753, pruned_loss=0.04936, over 24327.00 frames. ], tot_loss[loss=0.1819, simple_loss=0.2564, pruned_loss=0.0537, over 4720692.92 frames. ], batch size: 77, lr: 6.10e-03, grad_scale: 32.0 2023-09-30 03:16:47,618 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:16:47,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:16:50,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:16:52,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:16:52,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=578960.0, ans=0.125 2023-09-30 03:17:01,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:17:01,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-30 03:17:06,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-30 03:17:08,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-30 03:17:12,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:17:12,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-30 03:17:12,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 03:17:22,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:17:24,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-30 03:17:27,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:17:28,464 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=6.10 vs. limit=12.0 2023-09-30 03:17:29,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:17:32,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-30 03:17:34,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:17:34,255 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 03:17:36,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:17:37,251 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.62 vs. limit=15.0 2023-09-30 03:17:38,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:17:41,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:17:45,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-30 03:17:45,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:17:45,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 03:17:45,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:17:48,745 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:17:48,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:17:52,792 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=579226.6666666666, ans=0.0 2023-09-30 03:17:54,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-30 03:17:54,208 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:17:57,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:17:57,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:17:57,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-30 03:17:57,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-30 03:18:00,343 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-30 03:18:00,464 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-30 03:18:02,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 03:18:02,030 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:18:04,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:18:04,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:18:04,162 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-30 03:18:04,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:18:05,389 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:18:07,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-30 03:18:09,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 03:18:10,949 INFO [train.py:1039] (2/4) Epoch 17, batch 1900, loss[loss=0.1675, simple_loss=0.2513, pruned_loss=0.04186, over 24476.00 frames. ], tot_loss[loss=0.1823, simple_loss=0.2567, pruned_loss=0.05394, over 4730452.88 frames. ], batch size: 66, lr: 6.10e-03, grad_scale: 16.0 2023-09-30 03:18:11,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:18:11,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-30 03:18:14,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:18:14,143 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-30 03:18:14,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:18:15,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:18:20,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:18:23,279 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.519e+02 1.816e+02 1.988e+02 2.233e+02 2.900e+02, threshold=3.976e+02, percent-clipped=0.0 2023-09-30 03:18:23,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:18:24,941 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-30 03:18:26,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-30 03:18:28,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:18:30,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:18:30,139 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-30 03:18:30,227 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-30 03:18:32,689 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.99 vs. limit=15.0 2023-09-30 03:18:33,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-30 03:18:35,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:18:35,684 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=579360.0, ans=0.0 2023-09-30 03:18:40,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-30 03:18:42,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-30 03:18:42,425 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=579426.6666666666, ans=0.0 2023-09-30 03:18:52,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-30 03:18:55,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-30 03:18:55,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:18:55,151 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-30 03:18:55,417 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=579426.6666666666, ans=0.125 2023-09-30 03:18:55,450 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=579426.6666666666, ans=0.0 2023-09-30 03:18:56,582 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-30 03:18:56,637 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-30 03:18:56,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-30 03:18:56,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:19:01,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-30 03:19:04,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:19:08,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:19:08,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-30 03:19:11,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:19:14,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-30 03:19:14,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-30 03:19:21,900 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=579560.0, ans=0.5 2023-09-30 03:19:23,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 03:19:23,168 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:19:23,188 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:19:24,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:19:26,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 03:19:26,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-30 03:19:26,569 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=579560.0, ans=0.0 2023-09-30 03:19:27,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:19:28,270 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=579560.0, ans=0.1 2023-09-30 03:19:29,464 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:19:29,466 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:19:32,262 INFO [train.py:1039] (2/4) Epoch 17, batch 1950, loss[loss=0.1706, simple_loss=0.2468, pruned_loss=0.04725, over 24319.00 frames. ], tot_loss[loss=0.183, simple_loss=0.2577, pruned_loss=0.05416, over 4734572.54 frames. ], batch size: 56, lr: 6.10e-03, grad_scale: 16.0 2023-09-30 03:19:32,417 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:19:32,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:19:32,484 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-30 03:19:34,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:19:37,911 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:19:40,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:19:40,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:19:40,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 03:19:42,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-30 03:19:42,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 03:19:44,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:19:45,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:19:48,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:19:48,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:19:51,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:19:52,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:19:55,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:19:55,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 03:19:55,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:19:55,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:20:00,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:20:03,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:20:03,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:20:03,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-30 03:20:03,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-30 03:20:04,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 03:20:05,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:20:06,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:20:06,892 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=579760.0, ans=0.2 2023-09-30 03:20:11,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:20:14,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:20:19,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 03:20:20,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:20:22,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-30 03:20:22,309 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-30 03:20:22,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:20:27,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:20:28,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:20:30,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-30 03:20:38,699 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:20:40,220 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:20:43,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:20:44,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:20:48,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:20:48,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:20:48,662 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-30 03:20:48,670 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 03:20:50,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:20:51,558 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-30 03:20:52,954 INFO [train.py:1039] (2/4) Epoch 17, batch 2000, loss[loss=0.1648, simple_loss=0.2359, pruned_loss=0.0468, over 24441.00 frames. ], tot_loss[loss=0.1835, simple_loss=0.2578, pruned_loss=0.05466, over 4728733.21 frames. ], batch size: 58, lr: 6.10e-03, grad_scale: 32.0 2023-09-30 03:20:54,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:20:58,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-30 03:20:59,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:20:59,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:21:01,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:21:03,476 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:21:07,543 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.864e+02 2.151e+02 2.440e+02 3.319e+02, threshold=4.303e+02, percent-clipped=0.0 2023-09-30 03:21:07,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-30 03:21:07,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-30 03:21:09,530 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=580026.6666666666, ans=0.1 2023-09-30 03:21:12,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:21:13,840 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-30 03:21:13,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 03:21:14,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:21:18,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:21:19,145 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.61 vs. limit=10.0 2023-09-30 03:21:19,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-30 03:21:21,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:21:25,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:21:25,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:21:26,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-30 03:21:26,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 03:21:28,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-30 03:21:28,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:21:33,397 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:21:33,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-30 03:21:33,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:21:33,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:21:36,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:21:36,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-30 03:21:40,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-30 03:21:40,081 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:21:40,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:21:45,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:21:47,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:21:47,120 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:21:48,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:21:48,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:21:50,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:21:50,250 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:21:50,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:21:51,875 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:21:55,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:21:57,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-30 03:22:03,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 03:22:05,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:22:08,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:22:08,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:22:12,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:22:15,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:22:15,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:22:17,175 INFO [train.py:1039] (2/4) Epoch 17, batch 2050, loss[loss=0.1833, simple_loss=0.2587, pruned_loss=0.05393, over 23416.00 frames. ], tot_loss[loss=0.1834, simple_loss=0.2575, pruned_loss=0.05465, over 4720063.55 frames. ], batch size: 105, lr: 6.09e-03, grad_scale: 32.0 2023-09-30 03:22:17,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 03:22:17,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 03:22:18,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:22:20,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:22:22,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:22:23,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:22:28,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:22:31,869 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:22:31,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:22:33,434 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:22:33,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-30 03:22:35,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:22:35,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:22:36,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-30 03:22:37,659 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=580360.0, ans=0.025 2023-09-30 03:22:44,369 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=580360.0, ans=0.2 2023-09-30 03:22:45,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:22:45,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:22:49,398 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-30 03:22:52,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:22:53,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-30 03:22:54,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:22:54,622 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.72 vs. limit=15.0 2023-09-30 03:22:57,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:22:58,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:22:59,661 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.92 vs. limit=15.0 2023-09-30 03:23:00,204 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-30 03:23:00,279 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:23:01,832 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:23:03,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:23:05,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:23:06,324 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.32 vs. limit=15.0 2023-09-30 03:23:06,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:23:08,560 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:23:09,760 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=3.59 vs. limit=12.0 2023-09-30 03:23:12,083 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-30 03:23:13,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:23:18,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:23:22,418 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:23:23,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-30 03:23:24,316 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=580560.0, ans=0.0 2023-09-30 03:23:30,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:23:30,294 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:23:33,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:23:35,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-30 03:23:36,851 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=580560.0, ans=0.125 2023-09-30 03:23:40,050 INFO [train.py:1039] (2/4) Epoch 17, batch 2100, loss[loss=0.1613, simple_loss=0.2384, pruned_loss=0.04208, over 24472.00 frames. ], tot_loss[loss=0.1824, simple_loss=0.2554, pruned_loss=0.05467, over 4697011.79 frames. ], batch size: 58, lr: 6.09e-03, grad_scale: 32.0 2023-09-30 03:23:40,308 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-30 03:23:40,309 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:23:40,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:23:42,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 03:23:42,678 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:23:42,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-30 03:23:43,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-30 03:23:44,078 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:23:47,599 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=580626.6666666666, ans=0.125 2023-09-30 03:23:48,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:23:48,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:23:51,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:23:51,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:23:51,839 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-30 03:23:53,675 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 1.830e+02 2.027e+02 2.301e+02 3.593e+02, threshold=4.054e+02, percent-clipped=0.0 2023-09-30 03:23:53,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:23:53,976 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-30 03:23:53,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-30 03:23:55,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:23:55,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:23:55,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-30 03:23:57,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 03:23:58,381 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.01 vs. limit=6.0 2023-09-30 03:24:04,817 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-30 03:24:04,818 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 03:24:07,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:24:07,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:24:11,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:24:11,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-30 03:24:13,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:24:13,279 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 03:24:13,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-30 03:24:15,012 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:24:15,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-30 03:24:15,081 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-30 03:24:15,143 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-30 03:24:16,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-30 03:24:19,957 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:24:21,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 03:24:23,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 03:24:24,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:24:24,993 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=580760.0, ans=0.0 2023-09-30 03:24:26,956 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:24:26,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-30 03:24:28,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:24:28,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:24:28,642 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=580826.6666666666, ans=0.125 2023-09-30 03:24:29,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:24:29,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-30 03:24:32,113 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-30 03:24:32,390 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=580826.6666666666, ans=0.125 2023-09-30 03:24:33,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-30 03:24:36,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:24:39,859 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:24:39,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-30 03:24:47,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:24:48,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:24:50,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:24:50,071 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:24:50,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-30 03:24:51,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 03:24:51,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:24:53,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-30 03:24:53,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:24:53,373 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:24:54,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-30 03:24:56,544 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-30 03:24:56,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:24:58,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:24:58,201 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:25:00,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:25:00,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:25:03,747 INFO [train.py:1039] (2/4) Epoch 17, batch 2150, loss[loss=0.2076, simple_loss=0.2877, pruned_loss=0.06376, over 24590.00 frames. ], tot_loss[loss=0.1816, simple_loss=0.2541, pruned_loss=0.05457, over 4678368.26 frames. ], batch size: 71, lr: 6.09e-03, grad_scale: 16.0 2023-09-30 03:25:05,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 03:25:07,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:25:08,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:25:08,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:25:08,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:25:10,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:25:13,470 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:25:15,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:25:15,610 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:25:20,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:25:20,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-30 03:25:23,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:25:23,725 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=581026.6666666666, ans=0.2 2023-09-30 03:25:24,929 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:25:25,156 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 03:25:26,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:25:26,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:25:26,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:25:26,861 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=581026.6666666666, ans=0.125 2023-09-30 03:25:27,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-30 03:25:28,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:25:28,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:25:29,537 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:25:31,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-30 03:25:33,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:25:34,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:25:34,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:25:34,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 03:25:35,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:25:38,648 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:25:40,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-30 03:25:41,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:25:41,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-30 03:25:43,041 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-30 03:25:46,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:25:46,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:25:48,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:25:50,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 03:25:50,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:25:52,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:25:52,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-30 03:25:52,477 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=581160.0, ans=0.1 2023-09-30 03:25:52,947 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.26 vs. limit=22.5 2023-09-30 03:25:53,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-30 03:25:53,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:25:55,177 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-30 03:25:55,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:25:55,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:25:56,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-30 03:25:56,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:25:56,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-30 03:25:56,851 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-30 03:25:56,852 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-30 03:25:56,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-30 03:25:59,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:26:01,295 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:26:01,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:26:02,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:26:03,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 03:26:06,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:26:06,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:26:13,700 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=581226.6666666666, ans=0.0 2023-09-30 03:26:16,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:26:16,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-30 03:26:19,623 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:26:24,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:26:25,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:26:26,985 INFO [train.py:1039] (2/4) Epoch 17, batch 2200, loss[loss=0.194, simple_loss=0.26, pruned_loss=0.06396, over 23758.00 frames. ], tot_loss[loss=0.1806, simple_loss=0.2539, pruned_loss=0.05367, over 4700960.97 frames. ], batch size: 179, lr: 6.09e-03, grad_scale: 16.0 2023-09-30 03:26:27,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:26:27,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-30 03:26:30,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:26:30,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:26:30,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-30 03:26:33,575 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=581293.3333333334, ans=0.0 2023-09-30 03:26:36,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-30 03:26:39,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 03:26:41,771 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.818e+02 1.974e+02 2.282e+02 3.535e+02, threshold=3.948e+02, percent-clipped=0.0 2023-09-30 03:26:47,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-30 03:26:50,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:26:51,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:26:51,883 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:26:55,002 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:26:56,425 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-30 03:26:58,777 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.67 vs. limit=15.0 2023-09-30 03:27:01,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-30 03:27:01,696 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:27:03,796 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-30 03:27:05,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-30 03:27:07,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:27:08,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:27:10,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:27:13,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-30 03:27:14,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:27:16,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-30 03:27:19,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:27:19,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-30 03:27:19,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:27:21,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:27:23,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:27:23,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:27:23,470 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:27:26,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-30 03:27:26,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:27:28,202 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 03:27:32,689 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 03:27:32,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:27:35,245 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.31 vs. limit=10.0 2023-09-30 03:27:36,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-30 03:27:37,939 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-30 03:27:40,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 03:27:40,289 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-30 03:27:41,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-30 03:27:41,916 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-30 03:27:44,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:27:44,871 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-30 03:27:46,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:27:47,840 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-30 03:27:49,194 INFO [train.py:1039] (2/4) Epoch 17, batch 2250, loss[loss=0.1986, simple_loss=0.2635, pruned_loss=0.06688, over 23635.00 frames. ], tot_loss[loss=0.1815, simple_loss=0.2554, pruned_loss=0.05378, over 4710767.36 frames. ], batch size: 256, lr: 6.09e-03, grad_scale: 16.0 2023-09-30 03:27:50,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:27:54,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:27:58,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:28:01,131 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:28:04,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:28:04,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:28:04,459 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=581693.3333333334, ans=0.125 2023-09-30 03:28:05,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:28:05,849 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=581693.3333333334, ans=0.125 2023-09-30 03:28:07,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-30 03:28:07,393 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:28:07,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:28:11,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-30 03:28:11,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:28:12,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:28:15,992 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:28:22,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:28:23,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 03:28:23,724 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-30 03:28:25,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-30 03:28:25,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:28:29,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:28:32,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:28:34,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:28:35,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:28:35,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:28:37,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:28:40,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:28:45,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:28:50,342 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-30 03:28:55,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 03:28:55,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-30 03:28:57,863 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.11 vs. limit=10.0 2023-09-30 03:28:58,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:29:03,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 03:29:05,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-30 03:29:05,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-30 03:29:05,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:29:06,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:29:07,995 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=581893.3333333334, ans=0.0 2023-09-30 03:29:09,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-30 03:29:10,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:29:10,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:29:11,895 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.95 vs. limit=15.0 2023-09-30 03:29:12,826 INFO [train.py:1039] (2/4) Epoch 17, batch 2300, loss[loss=0.1987, simple_loss=0.2775, pruned_loss=0.05998, over 24022.00 frames. ], tot_loss[loss=0.1823, simple_loss=0.2564, pruned_loss=0.05414, over 4720733.38 frames. ], batch size: 80, lr: 6.09e-03, grad_scale: 16.0 2023-09-30 03:29:19,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:29:19,673 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:29:21,333 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-30 03:29:22,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:29:27,889 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.892e+02 2.152e+02 2.503e+02 3.822e+02, threshold=4.305e+02, percent-clipped=0.0 2023-09-30 03:29:28,202 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:29:29,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-30 03:29:29,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:29:29,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:29:29,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-30 03:29:31,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:29:32,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:29:34,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:29:38,039 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:29:42,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-30 03:29:46,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:29:51,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 03:29:53,177 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:29:57,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:30:00,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:30:02,556 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=582160.0, ans=0.125 2023-09-30 03:30:04,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:30:04,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 03:30:05,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:30:06,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-30 03:30:11,074 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 03:30:11,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:30:12,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:30:12,575 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:30:12,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:30:14,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 03:30:14,150 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-30 03:30:14,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-30 03:30:14,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:30:14,268 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:30:15,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-30 03:30:20,783 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=582226.6666666666, ans=0.025 2023-09-30 03:30:23,330 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:30:25,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:30:29,148 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:30:29,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:30:30,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-30 03:30:33,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 03:30:33,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:30:33,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 03:30:35,066 INFO [train.py:1039] (2/4) Epoch 17, batch 2350, loss[loss=0.1951, simple_loss=0.2783, pruned_loss=0.05592, over 24401.00 frames. ], tot_loss[loss=0.1829, simple_loss=0.2573, pruned_loss=0.05428, over 4732862.63 frames. ], batch size: 77, lr: 6.08e-03, grad_scale: 16.0 2023-09-30 03:30:35,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-30 03:30:37,788 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=582293.3333333334, ans=0.0 2023-09-30 03:30:40,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:30:41,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-30 03:30:47,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-30 03:30:50,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:30:54,521 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:30:54,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:30:54,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:30:54,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:30:56,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-30 03:30:58,500 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 03:30:59,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:31:08,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-30 03:31:09,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:31:12,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 03:31:12,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:31:15,267 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:31:16,783 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-30 03:31:18,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:31:21,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:31:21,229 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:31:21,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:31:21,916 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.95 vs. limit=15.0 2023-09-30 03:31:24,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:31:26,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-30 03:31:26,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:31:29,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:31:29,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:31:31,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-30 03:31:31,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:31:32,057 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.52 vs. limit=6.0 2023-09-30 03:31:36,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-30 03:31:36,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-30 03:31:42,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-30 03:31:47,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-30 03:31:47,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:31:47,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-30 03:31:49,188 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-30 03:31:49,230 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-30 03:31:51,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-30 03:31:53,271 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=582560.0, ans=0.0 2023-09-30 03:31:54,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:31:57,587 INFO [train.py:1039] (2/4) Epoch 17, batch 2400, loss[loss=0.1765, simple_loss=0.2491, pruned_loss=0.05195, over 24281.00 frames. ], tot_loss[loss=0.1821, simple_loss=0.2569, pruned_loss=0.05366, over 4736691.84 frames. ], batch size: 56, lr: 6.08e-03, grad_scale: 32.0 2023-09-30 03:31:57,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:32:02,817 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:32:04,496 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:32:06,685 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-30 03:32:06,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-30 03:32:13,859 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 1.898e+02 2.064e+02 2.332e+02 3.496e+02, threshold=4.129e+02, percent-clipped=0.0 2023-09-30 03:32:14,070 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 03:32:14,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:32:17,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-30 03:32:18,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:32:18,699 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:32:18,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-30 03:32:25,816 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=582693.3333333334, ans=0.125 2023-09-30 03:32:27,043 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:32:28,656 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-30 03:32:32,051 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=582760.0, ans=0.125 2023-09-30 03:32:35,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-30 03:32:38,621 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-30 03:32:42,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:32:42,625 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=582760.0, ans=0.2 2023-09-30 03:32:43,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:32:48,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:32:48,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-30 03:32:50,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 03:32:56,498 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:32:56,773 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=582826.6666666666, ans=0.125 2023-09-30 03:32:58,420 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=582826.6666666666, ans=0.05 2023-09-30 03:32:58,447 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=582826.6666666666, ans=0.05 2023-09-30 03:32:59,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:33:03,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:33:03,381 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:33:03,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-30 03:33:03,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:33:04,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:33:05,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:33:05,052 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 03:33:05,809 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.08 vs. limit=15.0 2023-09-30 03:33:11,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:33:11,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 03:33:11,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-30 03:33:13,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-30 03:33:16,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:33:16,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:33:16,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-30 03:33:18,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-30 03:33:18,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-30 03:33:18,463 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-30 03:33:19,604 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.99 vs. limit=6.0 2023-09-30 03:33:19,901 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-30 03:33:20,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:33:21,459 INFO [train.py:1039] (2/4) Epoch 17, batch 2450, loss[loss=0.1898, simple_loss=0.2766, pruned_loss=0.05155, over 24466.00 frames. ], tot_loss[loss=0.1811, simple_loss=0.2556, pruned_loss=0.05333, over 4735370.85 frames. ], batch size: 69, lr: 6.08e-03, grad_scale: 32.0 2023-09-30 03:33:21,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:33:23,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:33:23,751 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-30 03:33:25,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:33:25,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-30 03:33:28,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-30 03:33:28,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:33:33,478 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:33:33,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:33:35,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-30 03:33:41,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:33:41,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:33:42,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 03:33:44,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:33:44,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:33:44,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-30 03:33:50,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:33:51,769 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 03:33:53,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:33:56,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-30 03:33:58,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:33:58,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:33:59,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:34:01,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-30 03:34:03,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:34:11,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:34:13,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:34:13,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:34:13,084 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:34:13,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:34:14,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:34:14,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-30 03:34:18,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:34:18,451 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:34:23,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:34:23,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:34:28,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:34:28,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-30 03:34:28,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:34:29,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:34:29,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-30 03:34:31,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:34:33,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:34:33,504 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=583226.6666666666, ans=0.125 2023-09-30 03:34:36,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:34:38,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:34:39,561 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:34:44,639 INFO [train.py:1039] (2/4) Epoch 17, batch 2500, loss[loss=0.163, simple_loss=0.2459, pruned_loss=0.04007, over 24489.00 frames. ], tot_loss[loss=0.1803, simple_loss=0.255, pruned_loss=0.05284, over 4726674.77 frames. ], batch size: 66, lr: 6.08e-03, grad_scale: 8.0 2023-09-30 03:34:44,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-30 03:34:44,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:34:44,999 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=583293.3333333334, ans=0.2 2023-09-30 03:34:50,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:34:51,359 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.93 vs. limit=15.0 2023-09-30 03:35:00,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:35:00,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:35:01,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:35:01,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-30 03:35:03,351 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.839e+02 2.064e+02 2.322e+02 3.484e+02, threshold=4.127e+02, percent-clipped=0.0 2023-09-30 03:35:09,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 03:35:11,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:35:11,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-30 03:35:11,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 03:35:11,736 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-30 03:35:13,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:35:14,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:35:15,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-30 03:35:15,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:35:16,921 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-30 03:35:16,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:35:21,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:35:23,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:35:25,505 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 03:35:27,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-30 03:35:27,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:35:28,021 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=583426.6666666666, ans=0.0 2023-09-30 03:35:28,235 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1.whitening_limit, batch_count=583426.6666666666, ans=10.0 2023-09-30 03:35:30,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:35:30,845 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=583426.6666666666, ans=0.0 2023-09-30 03:35:30,847 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=583426.6666666666, ans=0.09899494936611666 2023-09-30 03:35:34,965 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:35:36,780 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer_na.min_abs, batch_count=583493.3333333334, ans=0.02 2023-09-30 03:35:39,481 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:35:43,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:35:46,504 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=583493.3333333334, ans=0.125 2023-09-30 03:35:47,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-30 03:35:52,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-30 03:35:52,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:35:52,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-30 03:35:54,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:35:54,470 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 03:35:55,975 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-30 03:35:55,976 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-30 03:35:55,985 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-30 03:35:57,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:36:01,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-30 03:36:01,974 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-30 03:36:02,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:36:03,512 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-30 03:36:05,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-30 03:36:08,048 INFO [train.py:1039] (2/4) Epoch 17, batch 2550, loss[loss=0.1963, simple_loss=0.271, pruned_loss=0.06075, over 23419.00 frames. ], tot_loss[loss=0.1807, simple_loss=0.2554, pruned_loss=0.053, over 4725877.65 frames. ], batch size: 106, lr: 6.08e-03, grad_scale: 8.0 2023-09-30 03:36:08,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:36:11,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:36:12,687 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:36:14,377 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:36:15,794 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-30 03:36:15,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-30 03:36:21,016 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-30 03:36:22,603 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:36:24,264 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:36:27,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:36:27,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 03:36:29,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 03:36:29,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:36:29,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:36:33,253 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:36:33,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-30 03:36:33,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-30 03:36:33,349 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:36:33,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-30 03:36:45,593 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.70 vs. limit=12.0 2023-09-30 03:36:46,645 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=583760.0, ans=0.05 2023-09-30 03:36:47,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:36:52,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:36:53,870 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:36:53,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:36:54,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 03:37:00,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:37:03,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 03:37:03,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:37:03,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:37:03,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-30 03:37:05,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-30 03:37:10,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:37:10,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:37:15,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:37:15,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-30 03:37:15,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:37:15,516 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=583893.3333333334, ans=0.1 2023-09-30 03:37:16,053 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.44 vs. limit=15.0 2023-09-30 03:37:16,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:37:18,205 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:37:18,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 03:37:19,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:37:24,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:37:28,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:37:30,515 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.63 vs. limit=6.0 2023-09-30 03:37:31,100 INFO [train.py:1039] (2/4) Epoch 17, batch 2600, loss[loss=0.1951, simple_loss=0.2848, pruned_loss=0.05276, over 24664.00 frames. ], tot_loss[loss=0.1814, simple_loss=0.256, pruned_loss=0.05345, over 4724448.16 frames. ], batch size: 73, lr: 6.08e-03, grad_scale: 8.0 2023-09-30 03:37:31,278 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-30 03:37:36,318 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-30 03:37:36,346 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:37:36,411 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-30 03:37:36,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-30 03:37:36,570 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-30 03:37:41,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:37:41,364 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-30 03:37:43,421 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-30 03:37:44,868 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-30 03:37:46,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:37:48,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-30 03:37:49,474 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 2.083e+02 2.505e+02 2.924e+02 4.278e+02, threshold=5.011e+02, percent-clipped=1.0 2023-09-30 03:37:51,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-30 03:37:52,749 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-30 03:37:52,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-30 03:37:55,940 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-30 03:37:55,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-30 03:38:02,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:38:02,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:38:04,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:38:04,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-30 03:38:07,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:38:12,632 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-30 03:38:18,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:38:18,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:38:18,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-30 03:38:18,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:38:18,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:38:20,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-30 03:38:22,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-30 03:38:23,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:38:25,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:38:26,782 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=584160.0, ans=0.2 2023-09-30 03:38:29,537 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-30 03:38:29,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:38:29,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 03:38:31,590 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=584160.0, ans=0.0 2023-09-30 03:38:34,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:38:36,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:38:36,494 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-30 03:38:37,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:38:38,807 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.00 vs. limit=22.5 2023-09-30 03:38:39,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:38:40,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:38:46,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-30 03:38:47,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:38:50,624 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 03:38:53,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-30 03:38:53,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:38:55,187 INFO [train.py:1039] (2/4) Epoch 17, batch 2650, loss[loss=0.1634, simple_loss=0.2439, pruned_loss=0.04145, over 24656.00 frames. ], tot_loss[loss=0.1824, simple_loss=0.2574, pruned_loss=0.05371, over 4728741.43 frames. ], batch size: 60, lr: 6.07e-03, grad_scale: 8.0 2023-09-30 03:38:55,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 03:38:55,373 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-30 03:38:55,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:38:58,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:38:58,650 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=584293.3333333334, ans=0.125 2023-09-30 03:39:01,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 03:39:01,661 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=584293.3333333334, ans=0.0 2023-09-30 03:39:02,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:39:05,946 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:39:07,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-30 03:39:07,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 03:39:08,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:39:09,656 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=584360.0, ans=0.2 2023-09-30 03:39:11,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-30 03:39:12,912 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-30 03:39:15,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:39:16,863 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=584360.0, ans=0.125 2023-09-30 03:39:18,168 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=584360.0, ans=0.0 2023-09-30 03:39:20,055 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-30 03:39:20,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:39:21,610 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-30 03:39:26,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:39:27,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-30 03:39:27,049 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:39:27,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:39:30,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-30 03:39:30,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-30 03:39:33,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-30 03:39:38,170 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-30 03:39:38,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:39:41,087 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:39:41,130 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-30 03:39:41,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:39:42,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:39:42,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:39:45,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:39:47,334 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:39:49,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-30 03:39:51,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:39:52,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:39:53,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:39:53,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:39:55,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:39:56,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-30 03:39:57,248 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=584493.3333333334, ans=0.125 2023-09-30 03:39:58,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:39:58,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:39:58,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:40:00,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-30 03:40:02,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:40:05,357 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:40:06,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:40:06,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:40:08,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-30 03:40:09,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:40:12,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:40:12,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-30 03:40:16,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:40:17,474 INFO [train.py:1039] (2/4) Epoch 17, batch 2700, loss[loss=0.1815, simple_loss=0.2398, pruned_loss=0.06162, over 22687.00 frames. ], tot_loss[loss=0.1839, simple_loss=0.2585, pruned_loss=0.0546, over 4723978.18 frames. ], batch size: 322, lr: 6.07e-03, grad_scale: 8.0 2023-09-30 03:40:17,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 03:40:19,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:40:20,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:40:20,880 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:40:22,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:40:22,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:40:22,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:40:22,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-30 03:40:22,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-30 03:40:24,377 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:40:26,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-30 03:40:27,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 03:40:27,984 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:40:31,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-30 03:40:32,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-30 03:40:34,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:40:36,034 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.817e+02 2.004e+02 2.215e+02 2.992e+02, threshold=4.008e+02, percent-clipped=0.0 2023-09-30 03:40:40,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:40:40,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:40:45,550 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:40:45,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:40:47,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:40:47,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-30 03:40:50,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:40:53,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:40:53,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-30 03:40:53,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:40:59,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:40:59,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:41:07,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:41:07,187 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_positive, batch_count=584826.6666666666, ans=0.05 2023-09-30 03:41:09,080 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:41:12,128 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:41:12,131 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:41:18,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:41:18,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:41:18,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:41:19,836 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:41:21,403 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:41:22,168 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.08 vs. limit=22.5 2023-09-30 03:41:22,168 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten.whitening_limit, batch_count=584893.3333333334, ans=22.5 2023-09-30 03:41:22,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:41:24,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:41:25,265 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.50 vs. limit=15.0 2023-09-30 03:41:27,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:41:27,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:41:30,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-30 03:41:32,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:41:34,596 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:41:34,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-30 03:41:36,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-30 03:41:37,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:41:39,338 INFO [train.py:1039] (2/4) Epoch 17, batch 2750, loss[loss=0.1843, simple_loss=0.2682, pruned_loss=0.05021, over 24004.00 frames. ], tot_loss[loss=0.1829, simple_loss=0.2579, pruned_loss=0.05396, over 4734722.54 frames. ], batch size: 80, lr: 6.07e-03, grad_scale: 8.0 2023-09-30 03:41:40,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:41:41,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:41:41,220 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=584960.0, ans=0.125 2023-09-30 03:41:45,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:41:45,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-30 03:41:45,747 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=584960.0, ans=0.0 2023-09-30 03:41:47,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:41:50,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:41:50,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 03:41:52,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:41:52,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:41:52,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-30 03:41:52,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:41:52,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:41:57,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-30 03:42:00,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:42:00,069 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:42:00,182 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:42:01,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-30 03:42:01,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:42:03,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:42:03,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:42:05,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:42:10,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 03:42:12,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 03:42:12,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:42:13,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:42:15,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 03:42:15,762 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=585093.3333333334, ans=0.125 2023-09-30 03:42:21,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:42:25,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 03:42:25,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:42:28,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:42:28,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-30 03:42:29,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:42:30,609 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.03 vs. limit=12.0 2023-09-30 03:42:35,858 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:42:35,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:42:35,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-30 03:42:42,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:42:45,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-30 03:42:49,997 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-30 03:42:53,119 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:42:53,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-30 03:42:54,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:42:56,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:42:56,892 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-30 03:42:56,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:42:58,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-30 03:43:00,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:43:00,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:43:00,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-30 03:43:00,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:43:01,824 INFO [train.py:1039] (2/4) Epoch 17, batch 2800, loss[loss=0.1825, simple_loss=0.2573, pruned_loss=0.05383, over 23383.00 frames. ], tot_loss[loss=0.1819, simple_loss=0.2571, pruned_loss=0.05333, over 4734536.25 frames. ], batch size: 106, lr: 6.07e-03, grad_scale: 16.0 2023-09-30 03:43:01,970 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:43:03,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:43:04,921 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-30 03:43:04,922 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-30 03:43:08,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:43:09,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 03:43:11,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:43:12,116 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.14 vs. limit=10.0 2023-09-30 03:43:15,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:43:18,131 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-30 03:43:20,140 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 1.888e+02 2.133e+02 2.598e+02 4.037e+02, threshold=4.266e+02, percent-clipped=1.0 2023-09-30 03:43:20,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-30 03:43:22,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-30 03:43:22,264 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=585360.0, ans=0.2 2023-09-30 03:43:23,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:43:24,881 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:43:24,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:43:28,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:43:29,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:43:29,557 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-30 03:43:30,010 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=585360.0, ans=0.125 2023-09-30 03:43:31,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:43:39,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:43:39,593 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=585426.6666666666, ans=0.0 2023-09-30 03:43:42,353 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:43:45,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:43:45,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:43:46,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:43:52,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:43:52,410 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-30 03:43:52,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:43:54,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:43:54,025 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:43:54,494 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=585493.3333333334, ans=0.1 2023-09-30 03:43:59,185 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:43:59,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:44:04,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:44:06,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:44:07,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:44:07,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 03:44:07,646 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 03:44:09,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 03:44:10,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:44:10,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-30 03:44:10,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:44:12,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:44:12,259 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:44:15,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-30 03:44:15,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:44:15,572 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=585560.0, ans=0.125 2023-09-30 03:44:15,602 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=585560.0, ans=0.07 2023-09-30 03:44:16,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:44:16,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:44:18,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-30 03:44:23,657 INFO [train.py:1039] (2/4) Epoch 17, batch 2850, loss[loss=0.1872, simple_loss=0.2667, pruned_loss=0.05387, over 24386.00 frames. ], tot_loss[loss=0.181, simple_loss=0.2562, pruned_loss=0.05292, over 4740907.86 frames. ], batch size: 77, lr: 6.07e-03, grad_scale: 8.0 2023-09-30 03:44:23,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:44:23,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 03:44:25,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:44:27,588 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=585626.6666666666, ans=0.0 2023-09-30 03:44:28,765 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:44:29,212 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=585626.6666666666, ans=0.0 2023-09-30 03:44:32,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:44:32,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:44:32,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:44:35,599 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:44:36,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:44:38,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-30 03:44:40,692 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-30 03:44:43,243 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.85 vs. limit=15.0 2023-09-30 03:44:45,438 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-30 03:44:45,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:44:45,725 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=585693.3333333334, ans=0.1 2023-09-30 03:44:45,968 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.52 vs. limit=15.0 2023-09-30 03:44:48,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-30 03:44:49,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:44:51,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-30 03:44:52,818 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-30 03:44:54,428 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:44:58,515 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=585760.0, ans=0.125 2023-09-30 03:44:58,548 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=585760.0, ans=0.1 2023-09-30 03:45:07,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:45:08,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:45:10,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:45:11,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 03:45:11,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 03:45:11,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-30 03:45:12,057 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=585826.6666666666, ans=0.125 2023-09-30 03:45:14,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 03:45:14,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-30 03:45:16,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-30 03:45:18,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:45:18,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:45:18,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:45:21,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:45:21,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:45:23,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:45:23,377 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:45:26,315 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:45:26,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:45:27,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:45:30,296 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=585893.3333333334, ans=0.2 2023-09-30 03:45:31,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:45:35,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:45:36,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-30 03:45:38,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-30 03:45:39,827 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 03:45:41,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:45:41,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-30 03:45:41,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:45:42,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:45:42,115 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:45:43,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:45:43,492 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-30 03:45:43,571 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-30 03:45:43,577 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:45:43,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:45:46,530 INFO [train.py:1039] (2/4) Epoch 17, batch 2900, loss[loss=0.2042, simple_loss=0.2877, pruned_loss=0.06034, over 24057.00 frames. ], tot_loss[loss=0.1807, simple_loss=0.2559, pruned_loss=0.05268, over 4743214.92 frames. ], batch size: 80, lr: 6.07e-03, grad_scale: 8.0 2023-09-30 03:45:50,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-30 03:45:50,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:45:50,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:45:53,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-30 03:45:56,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:45:56,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-30 03:45:58,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-30 03:45:59,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:45:59,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-30 03:46:01,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:46:01,769 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=586026.6666666666, ans=0.2 2023-09-30 03:46:03,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:46:06,437 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.447e+02 1.832e+02 2.093e+02 2.427e+02 4.261e+02, threshold=4.186e+02, percent-clipped=0.0 2023-09-30 03:46:06,641 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:46:08,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:46:11,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-30 03:46:11,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-30 03:46:13,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-30 03:46:13,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:46:16,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-30 03:46:18,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-30 03:46:20,323 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.92 vs. limit=15.0 2023-09-30 03:46:21,124 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:46:21,128 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-30 03:46:21,154 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:46:22,186 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.15 vs. limit=10.0 2023-09-30 03:46:24,835 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:46:24,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-30 03:46:26,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:46:28,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:46:32,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:46:35,762 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:46:36,531 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.18 vs. limit=15.0 2023-09-30 03:46:38,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-30 03:46:38,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-30 03:46:38,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:46:42,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:46:44,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-30 03:46:47,214 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:46:50,743 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=586226.6666666666, ans=0.0 2023-09-30 03:46:52,645 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:46:53,016 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=586226.6666666666, ans=0.125 2023-09-30 03:46:54,468 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=586226.6666666666, ans=0.2 2023-09-30 03:46:54,988 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.67 vs. limit=15.0 2023-09-30 03:47:02,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:47:02,271 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-30 03:47:02,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-30 03:47:04,395 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=586226.6666666666, ans=0.0 2023-09-30 03:47:05,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:47:05,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-30 03:47:07,096 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:47:07,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-30 03:47:08,575 INFO [train.py:1039] (2/4) Epoch 17, batch 2950, loss[loss=0.2306, simple_loss=0.2856, pruned_loss=0.0878, over 19707.00 frames. ], tot_loss[loss=0.1822, simple_loss=0.2573, pruned_loss=0.05353, over 4740926.71 frames. ], batch size: 388, lr: 6.06e-03, grad_scale: 8.0 2023-09-30 03:47:15,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:47:15,418 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-30 03:47:17,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:47:17,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:47:19,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:47:20,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:47:20,769 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-30 03:47:22,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-30 03:47:22,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 03:47:22,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:47:30,838 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:47:31,030 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=586360.0, ans=0.125 2023-09-30 03:47:33,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:47:35,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:47:36,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:47:38,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:47:38,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:47:40,454 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=586426.6666666666, ans=0.125 2023-09-30 03:47:41,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:47:43,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:47:43,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:47:44,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-30 03:47:44,907 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=586426.6666666666, ans=0.125 2023-09-30 03:47:50,113 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-30 03:47:50,161 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-30 03:47:51,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:47:53,591 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-30 03:47:55,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-30 03:47:55,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:47:55,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:47:55,854 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-30 03:47:55,861 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-30 03:47:58,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-30 03:47:58,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:48:00,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:48:02,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:48:03,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:48:05,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:48:05,792 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-30 03:48:05,842 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:48:05,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-30 03:48:13,537 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:48:15,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:48:16,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-30 03:48:16,534 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:48:19,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-30 03:48:21,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:48:21,306 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=586560.0, ans=0.125 2023-09-30 03:48:23,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:48:24,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:48:26,255 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:48:26,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 03:48:27,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:48:27,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:48:27,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-30 03:48:30,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-30 03:48:31,974 INFO [train.py:1039] (2/4) Epoch 17, batch 3000, loss[loss=0.1876, simple_loss=0.2756, pruned_loss=0.04981, over 24312.00 frames. ], tot_loss[loss=0.1836, simple_loss=0.2582, pruned_loss=0.05452, over 4730662.71 frames. ], batch size: 74, lr: 6.06e-03, grad_scale: 8.0 2023-09-30 03:48:31,974 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-30 03:48:47,114 INFO [train.py:1071] (2/4) Epoch 17, validation: loss=0.2916, simple_loss=0.2691, pruned_loss=0.1571, over 1125622.00 frames. 2023-09-30 03:48:47,115 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21076MB 2023-09-30 03:48:47,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:48:48,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:48:50,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:48:50,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-30 03:48:51,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:48:53,280 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:48:54,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-30 03:49:01,550 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-30 03:49:01,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-30 03:49:05,872 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:49:05,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 03:49:07,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-30 03:49:07,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:49:10,633 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 1.835e+02 1.995e+02 2.224e+02 3.286e+02, threshold=3.989e+02, percent-clipped=0.0 2023-09-30 03:49:11,091 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=586693.3333333334, ans=0.0 2023-09-30 03:49:15,577 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 03:49:25,501 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:49:31,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-30 03:49:33,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-30 03:49:34,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 03:49:36,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:49:38,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:49:38,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:49:38,733 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-30 03:49:42,421 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-30 03:49:42,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:49:43,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 03:49:47,537 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:49:47,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:49:47,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:49:47,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:49:50,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 03:49:50,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:49:50,846 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:49:53,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:49:57,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-30 03:49:57,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:49:58,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:49:58,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:50:02,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:50:02,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:50:03,748 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-30 03:50:05,116 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-30 03:50:05,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:50:05,363 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=586893.3333333334, ans=0.125 2023-09-30 03:50:06,566 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-30 03:50:06,649 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:50:08,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-30 03:50:13,261 INFO [train.py:1039] (2/4) Epoch 17, batch 3050, loss[loss=0.182, simple_loss=0.2599, pruned_loss=0.05204, over 24476.00 frames. ], tot_loss[loss=0.1846, simple_loss=0.2594, pruned_loss=0.05489, over 4729514.11 frames. ], batch size: 63, lr: 6.06e-03, grad_scale: 8.0 2023-09-30 03:50:13,322 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-30 03:50:13,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 03:50:13,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-30 03:50:15,023 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-30 03:50:15,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 03:50:16,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:50:16,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:50:18,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-30 03:50:18,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:50:18,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:50:21,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-30 03:50:23,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:50:26,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:50:26,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:50:31,246 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:50:34,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-30 03:50:39,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-30 03:50:39,126 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-30 03:50:40,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:50:41,519 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.48 vs. limit=22.5 2023-09-30 03:50:43,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:50:49,693 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:50:49,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:50:50,058 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=587093.3333333334, ans=0.125 2023-09-30 03:50:51,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:50:52,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:50:54,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-30 03:50:54,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:50:54,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:50:54,262 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:50:55,820 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:50:57,750 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=587093.3333333334, ans=0.125 2023-09-30 03:50:58,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:51:00,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:51:00,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-30 03:51:00,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:51:02,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 03:51:05,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:51:06,030 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:51:07,409 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:51:07,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:51:12,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:51:13,395 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:51:20,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:51:21,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:51:21,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:51:21,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:51:23,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 03:51:23,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:51:25,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-30 03:51:27,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:51:27,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:51:27,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-30 03:51:30,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:51:33,739 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:51:35,811 INFO [train.py:1039] (2/4) Epoch 17, batch 3100, loss[loss=0.1987, simple_loss=0.2508, pruned_loss=0.07328, over 20062.00 frames. ], tot_loss[loss=0.1839, simple_loss=0.2588, pruned_loss=0.0545, over 4731188.72 frames. ], batch size: 388, lr: 6.06e-03, grad_scale: 8.0 2023-09-30 03:51:35,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:51:37,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 03:51:40,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-30 03:51:43,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-30 03:51:45,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-30 03:51:45,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 03:51:48,239 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:51:48,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:51:53,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-30 03:51:55,415 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.922e+02 2.326e+02 2.796e+02 3.777e+02, threshold=4.651e+02, percent-clipped=0.0 2023-09-30 03:51:56,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:51:58,500 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.68 vs. limit=15.0 2023-09-30 03:52:02,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-30 03:52:05,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 03:52:07,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:52:08,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:52:08,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:52:10,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-30 03:52:12,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:52:12,259 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-30 03:52:12,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:52:15,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:52:15,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-30 03:52:18,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:52:20,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:52:20,235 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=587426.6666666666, ans=0.0 2023-09-30 03:52:21,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-30 03:52:23,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-30 03:52:25,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:52:25,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:52:28,837 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:52:28,861 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:52:29,769 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=587493.3333333334, ans=0.1 2023-09-30 03:52:30,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:52:32,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-30 03:52:32,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:52:33,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:52:33,883 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:52:33,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:52:33,899 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 03:52:37,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:52:38,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-30 03:52:41,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:52:43,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-30 03:52:43,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:52:43,608 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=587560.0, ans=0.2 2023-09-30 03:52:45,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:52:45,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-30 03:52:56,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-30 03:52:58,043 INFO [train.py:1039] (2/4) Epoch 17, batch 3150, loss[loss=0.178, simple_loss=0.247, pruned_loss=0.05453, over 18358.00 frames. ], tot_loss[loss=0.1831, simple_loss=0.2579, pruned_loss=0.05418, over 4714298.07 frames. ], batch size: 40, lr: 6.06e-03, grad_scale: 8.0 2023-09-30 03:52:58,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:52:59,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:53:03,355 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:53:03,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:53:04,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-30 03:53:05,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:53:06,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-30 03:53:07,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-30 03:53:10,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:53:11,745 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-30 03:53:12,055 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=587626.6666666666, ans=0.0 2023-09-30 03:53:14,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-30 03:53:14,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:53:16,376 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-30 03:53:16,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-30 03:53:16,887 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=587693.3333333334, ans=0.125 2023-09-30 03:53:18,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-30 03:53:18,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-30 03:53:18,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-30 03:53:18,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:53:18,858 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:53:19,246 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=587693.3333333334, ans=0.04949747468305833 2023-09-30 03:53:20,502 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:53:22,055 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-30 03:53:25,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:53:25,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:53:25,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:53:28,003 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-30 03:53:28,472 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=587693.3333333334, ans=0.125 2023-09-30 03:53:31,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-30 03:53:31,310 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:53:33,821 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=587760.0, ans=0.1 2023-09-30 03:53:36,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-30 03:53:36,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:53:38,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-30 03:53:42,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-30 03:53:42,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:53:42,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 03:53:43,705 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 03:53:43,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:53:43,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 03:53:45,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-30 03:53:45,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-30 03:53:45,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-30 03:53:46,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 03:53:47,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:53:49,499 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.69 vs. limit=6.0 2023-09-30 03:53:49,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:53:50,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:53:50,133 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-30 03:53:52,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:53:55,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-30 03:53:55,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:53:55,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-30 03:53:56,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-30 03:53:58,429 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:53:59,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:54:00,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-30 03:54:01,481 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 03:54:02,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:54:04,856 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.22 vs. limit=10.0 2023-09-30 03:54:07,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:54:07,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:54:08,043 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:54:11,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=587893.3333333334, ans=0.125 2023-09-30 03:54:14,879 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:54:16,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:54:18,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-30 03:54:21,679 INFO [train.py:1039] (2/4) Epoch 17, batch 3200, loss[loss=0.1936, simple_loss=0.2815, pruned_loss=0.05283, over 24310.00 frames. ], tot_loss[loss=0.1822, simple_loss=0.2571, pruned_loss=0.05363, over 4722543.82 frames. ], batch size: 74, lr: 6.06e-03, grad_scale: 16.0 2023-09-30 03:54:23,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:54:23,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-30 03:54:26,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:54:28,423 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:54:28,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-30 03:54:31,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:54:32,002 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=587960.0, ans=0.125 2023-09-30 03:54:35,058 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=587960.0, ans=0.125 2023-09-30 03:54:37,722 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-30 03:54:40,745 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.992e+02 2.258e+02 2.770e+02 4.284e+02, threshold=4.516e+02, percent-clipped=0.0 2023-09-30 03:54:40,844 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:54:46,449 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=588026.6666666666, ans=0.09899494936611666 2023-09-30 03:54:49,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-30 03:54:59,172 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=588093.3333333334, ans=0.125 2023-09-30 03:55:00,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-30 03:55:00,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:55:00,839 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=588093.3333333334, ans=0.125 2023-09-30 03:55:04,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-30 03:55:05,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 03:55:08,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:55:08,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 03:55:10,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:55:15,225 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-30 03:55:17,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-30 03:55:20,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-30 03:55:23,464 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-30 03:55:25,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-30 03:55:30,384 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:55:31,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:55:31,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:55:31,877 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-30 03:55:31,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 03:55:34,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:55:37,346 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-30 03:55:37,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-30 03:55:38,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-30 03:55:40,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-30 03:55:42,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:55:43,572 INFO [train.py:1039] (2/4) Epoch 17, batch 3250, loss[loss=0.1667, simple_loss=0.2448, pruned_loss=0.04431, over 18864.00 frames. ], tot_loss[loss=0.1817, simple_loss=0.2568, pruned_loss=0.05325, over 4722582.28 frames. ], batch size: 41, lr: 6.05e-03, grad_scale: 16.0 2023-09-30 03:55:43,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-30 03:55:43,828 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-30 03:55:43,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:55:45,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:55:45,554 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=588293.3333333334, ans=0.125 2023-09-30 03:55:46,750 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-30 03:55:50,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:55:53,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:55:58,234 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.21 vs. limit=15.0 2023-09-30 03:56:01,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:56:01,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-30 03:56:03,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:56:04,769 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:56:04,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:56:04,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:56:04,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 03:56:08,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:56:08,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:56:08,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:56:10,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:56:10,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:56:10,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:56:10,408 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 03:56:13,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:56:14,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:56:16,531 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=588426.6666666666, ans=0.1 2023-09-30 03:56:17,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:56:17,655 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:56:20,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:56:20,562 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:56:20,590 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:56:23,263 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=588426.6666666666, ans=0.125 2023-09-30 03:56:25,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-30 03:56:26,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:56:26,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:56:28,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:56:28,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-30 03:56:33,644 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=588493.3333333334, ans=0.0 2023-09-30 03:56:36,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:56:42,663 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:56:42,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:56:42,716 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-30 03:56:42,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:56:42,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 03:56:42,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:56:47,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-30 03:56:47,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-30 03:56:49,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:56:50,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:56:52,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:56:52,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-30 03:56:52,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:56:52,953 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.91 vs. limit=15.0 2023-09-30 03:56:57,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:56:57,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:56:59,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-30 03:56:59,207 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:57:02,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:57:02,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-30 03:57:04,846 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.13 vs. limit=15.0 2023-09-30 03:57:05,630 INFO [train.py:1039] (2/4) Epoch 17, batch 3300, loss[loss=0.1997, simple_loss=0.2664, pruned_loss=0.06652, over 22753.00 frames. ], tot_loss[loss=0.1824, simple_loss=0.2573, pruned_loss=0.05371, over 4729285.40 frames. ], batch size: 322, lr: 6.05e-03, grad_scale: 16.0 2023-09-30 03:57:05,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:57:05,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-30 03:57:09,477 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-30 03:57:11,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-30 03:57:11,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:57:15,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:57:17,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:57:17,174 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:57:17,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 03:57:17,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=588626.6666666666, ans=0.0 2023-09-30 03:57:18,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 03:57:22,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:57:23,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:57:25,152 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.863e+02 2.079e+02 2.237e+02 3.389e+02, threshold=4.158e+02, percent-clipped=0.0 2023-09-30 03:57:26,853 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-30 03:57:28,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:57:28,911 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:57:29,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:57:30,057 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.17 vs. limit=12.0 2023-09-30 03:57:30,527 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-30 03:57:30,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:57:32,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 03:57:33,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 03:57:33,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:57:33,633 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-30 03:57:38,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:57:38,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-30 03:57:40,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:57:40,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-30 03:57:42,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-30 03:57:42,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:57:43,584 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.30 vs. limit=12.0 2023-09-30 03:57:44,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:57:45,734 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-30 03:57:45,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-30 03:57:47,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:57:50,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-30 03:57:52,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:57:54,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:57:54,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:57:54,656 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=588826.6666666666, ans=0.0 2023-09-30 03:57:57,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:57:58,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:57:58,912 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:57:58,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-30 03:58:01,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:58:02,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:58:02,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:58:05,417 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-30 03:58:06,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-30 03:58:08,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-30 03:58:09,996 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:58:09,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:58:12,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:58:12,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:58:13,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 03:58:15,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:58:15,856 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-30 03:58:15,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:58:16,200 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=588893.3333333334, ans=0.2 2023-09-30 03:58:18,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 03:58:21,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-30 03:58:21,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:58:23,487 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:58:25,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:58:25,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:58:26,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:58:28,073 INFO [train.py:1039] (2/4) Epoch 17, batch 3350, loss[loss=0.1631, simple_loss=0.2418, pruned_loss=0.04222, over 20126.00 frames. ], tot_loss[loss=0.1816, simple_loss=0.2568, pruned_loss=0.05323, over 4722179.78 frames. ], batch size: 43, lr: 6.05e-03, grad_scale: 16.0 2023-09-30 03:58:30,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:58:30,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:58:34,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:58:34,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:58:36,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:58:39,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:58:40,367 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=588960.0, ans=0.125 2023-09-30 03:58:41,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:58:41,874 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=588960.0, ans=0.125 2023-09-30 03:58:43,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:58:43,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:58:44,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-30 03:58:46,320 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-30 03:58:46,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:58:51,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-30 03:58:51,735 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-30 03:58:53,192 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:58:53,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:58:54,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:58:54,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-30 03:58:54,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:58:54,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:58:56,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:58:59,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:58:59,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:59:01,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:59:04,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:59:06,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:59:07,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:59:11,432 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=8.27 vs. limit=12.0 2023-09-30 03:59:12,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:59:14,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:59:14,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:59:14,530 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:59:17,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:59:20,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-30 03:59:20,616 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 03:59:20,671 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-30 03:59:20,737 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:59:24,312 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-30 03:59:25,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:59:27,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:59:35,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:59:35,505 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-30 03:59:36,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 03:59:37,057 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:59:39,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:59:42,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:59:46,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-30 03:59:46,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 03:59:46,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-30 03:59:49,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:59:49,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-30 03:59:50,550 INFO [train.py:1039] (2/4) Epoch 17, batch 3400, loss[loss=0.1879, simple_loss=0.2484, pruned_loss=0.0637, over 23342.00 frames. ], tot_loss[loss=0.183, simple_loss=0.2576, pruned_loss=0.05414, over 4720065.33 frames. ], batch size: 285, lr: 6.05e-03, grad_scale: 16.0 2023-09-30 03:59:50,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:59:50,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-30 03:59:52,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:59:52,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:59:53,665 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-30 03:59:55,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:59:55,249 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-30 03:59:58,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-30 03:59:58,803 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-30 03:59:58,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:00:04,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:00:04,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 04:00:05,785 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:00:07,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-30 04:00:10,139 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 1.964e+02 2.182e+02 2.544e+02 4.408e+02, threshold=4.365e+02, percent-clipped=1.0 2023-09-30 04:00:11,989 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=589360.0, ans=0.1 2023-09-30 04:00:13,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:00:17,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-30 04:00:17,879 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=589360.0, ans=0.125 2023-09-30 04:00:22,300 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-30 04:00:25,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:00:25,293 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:00:26,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-30 04:00:27,102 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=589426.6666666666, ans=0.125 2023-09-30 04:00:32,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:00:35,387 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=589426.6666666666, ans=0.1 2023-09-30 04:00:36,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-30 04:00:43,230 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:00:44,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:00:44,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-30 04:00:46,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:00:46,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:00:46,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:00:47,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:00:48,017 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=589493.3333333334, ans=0.1 2023-09-30 04:00:51,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:00:55,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 04:00:55,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:01:00,051 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:01:02,953 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-30 04:01:08,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 04:01:10,272 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=589560.0, ans=15.0 2023-09-30 04:01:12,417 INFO [train.py:1039] (2/4) Epoch 17, batch 3450, loss[loss=0.1893, simple_loss=0.2721, pruned_loss=0.05326, over 24566.00 frames. ], tot_loss[loss=0.1828, simple_loss=0.2576, pruned_loss=0.05406, over 4730236.14 frames. ], batch size: 71, lr: 6.05e-03, grad_scale: 16.0 2023-09-30 04:01:14,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-30 04:01:14,369 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=589626.6666666666, ans=0.125 2023-09-30 04:01:19,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-30 04:01:19,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:01:20,785 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:01:20,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-30 04:01:22,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:01:27,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-30 04:01:29,459 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=589693.3333333334, ans=0.0 2023-09-30 04:01:32,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:01:33,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:01:33,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:01:33,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:01:33,999 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=589693.3333333334, ans=0.0 2023-09-30 04:01:35,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:01:42,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-30 04:01:48,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-30 04:01:48,193 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 04:01:48,271 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:01:50,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:01:52,335 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=589760.0, ans=0.125 2023-09-30 04:01:56,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-30 04:01:57,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:02:00,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:02:00,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:02:01,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-30 04:02:04,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:02:06,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-30 04:02:06,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:02:07,198 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.40 vs. limit=22.5 2023-09-30 04:02:07,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:02:09,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:02:13,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-30 04:02:17,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:02:21,298 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=589893.3333333334, ans=0.125 2023-09-30 04:02:22,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:02:23,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:02:26,176 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:02:26,498 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=589893.3333333334, ans=0.0 2023-09-30 04:02:31,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:02:31,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:02:33,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:02:33,499 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:02:33,713 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=589960.0, ans=0.125 2023-09-30 04:02:34,855 INFO [train.py:1039] (2/4) Epoch 17, batch 3500, loss[loss=0.1815, simple_loss=0.2513, pruned_loss=0.05583, over 23374.00 frames. ], tot_loss[loss=0.1813, simple_loss=0.2558, pruned_loss=0.05343, over 4733489.43 frames. ], batch size: 119, lr: 6.05e-03, grad_scale: 16.0 2023-09-30 04:02:38,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:02:38,273 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=589960.0, ans=0.125 2023-09-30 04:02:41,149 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:02:42,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-30 04:02:45,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 04:02:47,339 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-30 04:02:52,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:02:52,238 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-30 04:02:53,689 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.944e+02 2.191e+02 2.533e+02 4.328e+02, threshold=4.383e+02, percent-clipped=0.0 2023-09-30 04:02:57,459 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:02:59,047 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:03:00,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:03:00,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:03:00,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-30 04:03:00,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:03:00,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:03:02,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-30 04:03:05,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:03:05,882 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-30 04:03:06,082 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=590093.3333333334, ans=0.125 2023-09-30 04:03:08,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:03:10,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:03:11,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-30 04:03:12,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:03:14,991 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:03:16,629 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=590093.3333333334, ans=0.0 2023-09-30 04:03:16,752 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=590093.3333333334, ans=0.1 2023-09-30 04:03:17,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:03:17,904 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:03:19,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:03:19,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:03:21,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-30 04:03:21,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-30 04:03:21,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-30 04:03:22,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:03:22,998 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=590160.0, ans=0.125 2023-09-30 04:03:25,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:03:26,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:03:26,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:03:26,641 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=590160.0, ans=0.0 2023-09-30 04:03:30,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 04:03:31,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:03:34,949 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=590160.0, ans=0.0 2023-09-30 04:03:36,408 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:03:38,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-30 04:03:38,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-30 04:03:38,498 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:03:39,623 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.40 vs. limit=22.5 2023-09-30 04:03:42,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:03:42,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:03:44,048 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:03:47,185 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-30 04:03:48,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:03:48,866 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:03:50,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-30 04:03:53,326 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-30 04:03:54,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:03:56,840 INFO [train.py:1039] (2/4) Epoch 17, batch 3550, loss[loss=0.1773, simple_loss=0.2636, pruned_loss=0.04552, over 24684.00 frames. ], tot_loss[loss=0.1807, simple_loss=0.2552, pruned_loss=0.05312, over 4732698.14 frames. ], batch size: 68, lr: 6.04e-03, grad_scale: 8.0 2023-09-30 04:03:56,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:03:56,980 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:03:57,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:04:02,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:04:07,560 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.30 vs. limit=15.0 2023-09-30 04:04:09,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:04:13,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 04:04:16,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:04:17,282 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=590360.0, ans=0.09899494936611666 2023-09-30 04:04:18,479 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:04:18,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=590360.0, ans=0.1 2023-09-30 04:04:21,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:04:21,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:04:21,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:04:26,095 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:04:26,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:04:26,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:04:26,257 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-30 04:04:27,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 04:04:33,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:04:33,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:04:34,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:04:34,697 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:04:34,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:04:34,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-30 04:04:34,839 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:04:37,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:04:37,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 04:04:43,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:04:45,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:04:47,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:04:48,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-30 04:04:49,111 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=590493.3333333334, ans=0.0 2023-09-30 04:04:50,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-30 04:04:51,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-30 04:04:52,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:04:54,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:04:56,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:04:58,017 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-30 04:04:58,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:05:04,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:05:04,768 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-30 04:05:06,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:05:11,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:05:12,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-30 04:05:17,758 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-30 04:05:19,788 INFO [train.py:1039] (2/4) Epoch 17, batch 3600, loss[loss=0.1824, simple_loss=0.2693, pruned_loss=0.04775, over 24372.00 frames. ], tot_loss[loss=0.1803, simple_loss=0.2547, pruned_loss=0.05296, over 4723222.47 frames. ], batch size: 77, lr: 6.04e-03, grad_scale: 16.0 2023-09-30 04:05:19,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:05:21,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:05:23,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:05:24,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:05:25,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:05:30,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:05:31,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:05:33,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:05:33,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:05:33,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:05:33,257 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-30 04:05:38,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 04:05:39,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:05:41,049 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.895e+02 2.111e+02 2.493e+02 3.633e+02, threshold=4.223e+02, percent-clipped=0.0 2023-09-30 04:05:42,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:05:44,885 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:05:45,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=590693.3333333334, ans=0.1 2023-09-30 04:05:46,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:05:47,860 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:05:47,902 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-30 04:05:49,306 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:05:52,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:05:53,021 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.28 vs. limit=15.0 2023-09-30 04:05:53,992 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:05:57,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:05:59,494 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:06:00,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:06:02,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-30 04:06:10,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:06:10,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 04:06:11,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-30 04:06:12,353 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=590826.6666666666, ans=0.2 2023-09-30 04:06:15,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:06:20,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:06:23,727 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:06:29,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-30 04:06:29,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:06:29,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-30 04:06:33,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-30 04:06:35,400 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-30 04:06:36,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:06:38,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:06:39,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-30 04:06:39,986 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:06:41,278 INFO [train.py:1039] (2/4) Epoch 17, batch 3650, loss[loss=0.1763, simple_loss=0.2666, pruned_loss=0.04301, over 24636.00 frames. ], tot_loss[loss=0.1807, simple_loss=0.2552, pruned_loss=0.05309, over 4725335.49 frames. ], batch size: 68, lr: 6.04e-03, grad_scale: 16.0 2023-09-30 04:06:41,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:06:41,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:06:41,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-30 04:06:42,217 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=11.96 vs. limit=15.0 2023-09-30 04:06:42,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-30 04:06:46,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:06:46,347 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=590960.0, ans=0.0 2023-09-30 04:06:47,736 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-30 04:06:52,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-30 04:06:55,651 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:06:59,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-30 04:07:00,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-30 04:07:05,482 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:07:05,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-30 04:07:06,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:07:10,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-30 04:07:10,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:07:12,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-30 04:07:13,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:07:13,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:07:13,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-30 04:07:15,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 04:07:16,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:07:16,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:07:18,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:07:19,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-30 04:07:21,369 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-30 04:07:21,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:07:24,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-30 04:07:25,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:07:25,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:07:26,781 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=591093.3333333334, ans=0.1 2023-09-30 04:07:30,650 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=591160.0, ans=0.07 2023-09-30 04:07:31,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 04:07:33,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:07:33,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-30 04:07:34,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-30 04:07:35,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:07:35,174 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=591160.0, ans=0.0 2023-09-30 04:07:38,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:07:41,105 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:07:43,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:07:43,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:07:44,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 04:07:44,896 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:07:46,405 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:07:51,358 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-30 04:07:54,960 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:07:54,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:07:55,318 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=591226.6666666666, ans=0.125 2023-09-30 04:07:56,527 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-30 04:07:56,613 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:07:58,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-30 04:07:58,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:07:59,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-30 04:07:59,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:08:03,221 INFO [train.py:1039] (2/4) Epoch 17, batch 3700, loss[loss=0.1899, simple_loss=0.2715, pruned_loss=0.05415, over 24420.00 frames. ], tot_loss[loss=0.1821, simple_loss=0.2564, pruned_loss=0.05388, over 4718475.50 frames. ], batch size: 77, lr: 6.04e-03, grad_scale: 16.0 2023-09-30 04:08:03,296 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 04:08:04,839 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:08:06,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:08:10,521 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:08:10,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-30 04:08:10,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:08:10,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 04:08:10,729 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 04:08:10,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=591293.3333333334, ans=0.125 2023-09-30 04:08:16,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 04:08:20,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:08:20,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:08:22,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:08:23,657 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 2.066e+02 2.372e+02 2.822e+02 4.453e+02, threshold=4.744e+02, percent-clipped=1.0 2023-09-30 04:08:23,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:08:23,890 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 04:08:24,797 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.06 vs. limit=15.0 2023-09-30 04:08:25,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:08:29,070 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-30 04:08:35,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:08:35,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 04:08:35,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:08:35,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-30 04:08:37,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:08:40,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:08:41,326 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.95 vs. limit=12.0 2023-09-30 04:08:42,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-30 04:08:42,322 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:08:45,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:08:47,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:08:47,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 04:08:49,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 04:08:52,976 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:08:52,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-30 04:08:54,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:08:55,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-30 04:09:02,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:09:02,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:09:05,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:09:06,181 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.17 vs. limit=15.0 2023-09-30 04:09:06,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-30 04:09:08,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:09:08,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-30 04:09:08,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:09:08,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:09:13,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:09:15,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-30 04:09:16,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-30 04:09:16,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:09:16,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:09:17,281 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=591560.0, ans=0.2 2023-09-30 04:09:19,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:09:19,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:09:22,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:09:22,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:09:24,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:09:25,809 INFO [train.py:1039] (2/4) Epoch 17, batch 3750, loss[loss=0.1837, simple_loss=0.2498, pruned_loss=0.05877, over 22691.00 frames. ], tot_loss[loss=0.1834, simple_loss=0.258, pruned_loss=0.05443, over 4721154.34 frames. ], batch size: 322, lr: 6.04e-03, grad_scale: 16.0 2023-09-30 04:09:26,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-30 04:09:28,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 04:09:31,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-30 04:09:31,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-30 04:09:33,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:09:34,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:09:35,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:09:38,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:09:38,876 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=591626.6666666666, ans=0.1 2023-09-30 04:09:41,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:09:46,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:09:46,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 04:09:49,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:09:52,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:09:53,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-30 04:09:55,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:09:56,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:09:56,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:09:59,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-30 04:10:06,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-30 04:10:06,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:10:06,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:10:09,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:10:14,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:10:15,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-30 04:10:21,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-30 04:10:25,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:10:26,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:10:28,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:10:30,726 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=591893.3333333334, ans=0.125 2023-09-30 04:10:30,870 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=591893.3333333334, ans=0.04949747468305833 2023-09-30 04:10:31,965 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 04:10:36,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 04:10:38,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-30 04:10:40,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:10:41,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:10:45,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-30 04:10:46,643 INFO [train.py:1039] (2/4) Epoch 17, batch 3800, loss[loss=0.1999, simple_loss=0.2534, pruned_loss=0.07321, over 22706.00 frames. ], tot_loss[loss=0.1833, simple_loss=0.2576, pruned_loss=0.05445, over 4717442.47 frames. ], batch size: 322, lr: 6.03e-03, grad_scale: 16.0 2023-09-30 04:10:52,934 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:10:57,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:10:58,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 04:11:00,071 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-30 04:11:01,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:11:03,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:11:05,146 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-30 04:11:06,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 04:11:06,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:11:06,868 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:11:08,671 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 1.935e+02 2.193e+02 2.576e+02 3.771e+02, threshold=4.386e+02, percent-clipped=0.0 2023-09-30 04:11:10,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:11:11,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:11:11,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:11:11,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-30 04:11:12,139 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=592026.6666666666, ans=0.125 2023-09-30 04:11:16,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-30 04:11:16,491 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:11:18,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:11:20,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:11:20,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:11:23,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-30 04:11:23,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:11:26,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:11:27,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:11:34,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 04:11:34,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-30 04:11:36,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:11:43,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:11:47,338 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=592160.0, ans=0.07 2023-09-30 04:11:48,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:11:50,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-30 04:11:50,698 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=592160.0, ans=0.125 2023-09-30 04:11:52,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-30 04:11:53,575 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:11:55,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:11:55,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:11:58,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-30 04:11:58,545 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=592226.6666666666, ans=0.0 2023-09-30 04:12:01,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-30 04:12:01,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-30 04:12:01,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:12:03,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:12:08,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:12:10,431 INFO [train.py:1039] (2/4) Epoch 17, batch 3850, loss[loss=0.1768, simple_loss=0.2242, pruned_loss=0.06472, over 19438.00 frames. ], tot_loss[loss=0.1826, simple_loss=0.2563, pruned_loss=0.05441, over 4699188.70 frames. ], batch size: 388, lr: 6.03e-03, grad_scale: 16.0 2023-09-30 04:12:10,548 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 04:12:15,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:12:15,292 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=592293.3333333334, ans=0.015 2023-09-30 04:12:18,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-30 04:12:18,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:12:20,459 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:12:23,534 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 04:12:27,176 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:12:30,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-30 04:12:30,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-30 04:12:34,962 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:12:36,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:12:40,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:12:40,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:12:42,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:12:43,959 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:12:44,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:12:44,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:12:45,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:12:49,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:12:50,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:12:50,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:12:51,862 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.01 vs. limit=22.5 2023-09-30 04:12:52,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-30 04:12:52,386 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-30 04:12:53,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:12:55,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:12:56,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:12:58,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:12:59,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-30 04:13:02,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-30 04:13:03,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:13:05,262 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-30 04:13:07,228 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.59 vs. limit=15.0 2023-09-30 04:13:08,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-30 04:13:11,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:13:13,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:13:18,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:13:18,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-30 04:13:20,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-30 04:13:23,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:13:23,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:13:26,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 04:13:26,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 04:13:27,847 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:13:27,963 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:13:27,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:13:27,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-30 04:13:30,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:13:30,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-30 04:13:30,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:13:30,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:13:31,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:13:31,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:13:31,982 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:13:33,362 INFO [train.py:1039] (2/4) Epoch 17, batch 3900, loss[loss=0.1815, simple_loss=0.2558, pruned_loss=0.05357, over 20283.00 frames. ], tot_loss[loss=0.1812, simple_loss=0.255, pruned_loss=0.05366, over 4704917.20 frames. ], batch size: 44, lr: 6.03e-03, grad_scale: 16.0 2023-09-30 04:13:33,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:13:33,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:13:35,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:13:35,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-30 04:13:35,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:13:38,121 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:13:39,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 04:13:39,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:13:41,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:13:44,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 04:13:44,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:13:48,312 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-30 04:13:49,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-30 04:13:49,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:13:51,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-30 04:13:51,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:13:53,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-30 04:13:54,483 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.932e+02 2.197e+02 2.548e+02 3.814e+02, threshold=4.393e+02, percent-clipped=0.0 2023-09-30 04:13:54,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-30 04:13:59,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:14:03,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:14:03,300 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:14:04,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:14:07,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:14:09,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:14:10,270 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.12 vs. limit=12.0 2023-09-30 04:14:11,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:14:11,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:14:12,641 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:14:19,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:14:19,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:14:21,788 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=592826.6666666666, ans=0.1 2023-09-30 04:14:28,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 04:14:30,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:14:40,533 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:14:43,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:14:45,214 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-30 04:14:45,282 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-30 04:14:45,304 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:14:46,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-30 04:14:48,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:14:49,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-30 04:14:50,294 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=592893.3333333334, ans=0.125 2023-09-30 04:14:53,638 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=592960.0, ans=0.07 2023-09-30 04:14:54,781 INFO [train.py:1039] (2/4) Epoch 17, batch 3950, loss[loss=0.1934, simple_loss=0.2665, pruned_loss=0.06014, over 24044.00 frames. ], tot_loss[loss=0.1811, simple_loss=0.2551, pruned_loss=0.0536, over 4712150.47 frames. ], batch size: 80, lr: 6.03e-03, grad_scale: 16.0 2023-09-30 04:14:55,288 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=592960.0, ans=0.125 2023-09-30 04:14:58,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:14:58,613 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-30 04:15:00,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:15:03,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:15:04,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:15:11,378 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-30 04:15:11,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:15:13,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-30 04:15:13,665 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-30 04:15:15,073 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:15:16,905 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=593026.6666666666, ans=0.2 2023-09-30 04:15:18,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:15:18,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:15:19,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:15:22,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-30 04:15:24,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:15:24,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:15:24,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:15:25,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:15:25,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:15:36,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:15:36,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:15:40,472 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.69 vs. limit=15.0 2023-09-30 04:15:43,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-30 04:15:48,534 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-30 04:15:48,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-30 04:15:48,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:15:48,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:15:56,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:15:56,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-30 04:15:56,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:15:57,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:15:57,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-30 04:16:05,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:16:05,510 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=593226.6666666666, ans=0.125 2023-09-30 04:16:06,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:16:11,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-30 04:16:17,998 INFO [train.py:1039] (2/4) Epoch 17, batch 4000, loss[loss=0.1706, simple_loss=0.2578, pruned_loss=0.04177, over 24292.00 frames. ], tot_loss[loss=0.1818, simple_loss=0.2562, pruned_loss=0.05371, over 4718144.81 frames. ], batch size: 74, lr: 6.03e-03, grad_scale: 32.0 2023-09-30 04:16:19,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:16:22,226 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=593293.3333333334, ans=0.0 2023-09-30 04:16:28,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:16:32,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:16:34,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:16:34,388 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:16:34,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-30 04:16:36,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-30 04:16:38,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-30 04:16:38,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:16:38,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-30 04:16:40,242 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 1.856e+02 2.093e+02 2.279e+02 3.915e+02, threshold=4.185e+02, percent-clipped=0.0 2023-09-30 04:16:40,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:16:43,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:16:43,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:16:43,419 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:16:43,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:16:43,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-30 04:16:46,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:16:47,835 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-30 04:16:49,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:16:51,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:16:53,061 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=593426.6666666666, ans=0.125 2023-09-30 04:16:54,375 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-30 04:16:55,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 04:16:55,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:17:04,087 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-30 04:17:05,599 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:17:08,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:17:10,185 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-30 04:17:10,354 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:17:10,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-30 04:17:12,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:17:12,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:17:15,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:17:16,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:17:16,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-30 04:17:16,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:17:18,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-30 04:17:18,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:17:22,305 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-30 04:17:26,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:17:30,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 04:17:32,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:17:34,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:17:35,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:17:35,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:17:39,922 INFO [train.py:1039] (2/4) Epoch 17, batch 4050, loss[loss=0.1759, simple_loss=0.2625, pruned_loss=0.0446, over 24516.00 frames. ], tot_loss[loss=0.1824, simple_loss=0.2569, pruned_loss=0.05398, over 4727926.24 frames. ], batch size: 66, lr: 6.03e-03, grad_scale: 32.0 2023-09-30 04:17:40,244 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=593626.6666666666, ans=0.1 2023-09-30 04:17:41,629 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:17:44,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-30 04:17:45,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-30 04:17:46,322 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=593626.6666666666, ans=0.2 2023-09-30 04:17:47,020 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.25 vs. limit=22.5 2023-09-30 04:17:47,663 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:17:47,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:17:49,712 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:17:51,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-30 04:17:52,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:17:56,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:17:58,296 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=593693.3333333334, ans=0.125 2023-09-30 04:17:59,490 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:17:59,795 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=593693.3333333334, ans=0.125 2023-09-30 04:18:00,915 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 04:18:02,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:18:03,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:18:07,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:18:09,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-30 04:18:12,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 04:18:13,258 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=593760.0, ans=0.0 2023-09-30 04:18:14,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-30 04:18:14,534 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-30 04:18:17,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:18:22,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-30 04:18:23,469 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.96 vs. limit=22.5 2023-09-30 04:18:25,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:18:27,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:18:32,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:18:32,479 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:18:32,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:18:35,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:18:38,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-30 04:18:38,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 04:18:42,367 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:18:43,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-30 04:18:44,257 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=593893.3333333334, ans=0.125 2023-09-30 04:18:49,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:18:49,351 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=593893.3333333334, ans=0.125 2023-09-30 04:18:55,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-30 04:18:56,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:18:56,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:18:57,513 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.12 vs. limit=15.0 2023-09-30 04:19:00,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-30 04:19:00,136 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-30 04:19:00,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:19:01,687 INFO [train.py:1039] (2/4) Epoch 17, batch 4100, loss[loss=0.1979, simple_loss=0.2768, pruned_loss=0.05955, over 24574.00 frames. ], tot_loss[loss=0.1827, simple_loss=0.2576, pruned_loss=0.05391, over 4725929.11 frames. ], batch size: 71, lr: 6.02e-03, grad_scale: 32.0 2023-09-30 04:19:01,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:19:03,333 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:19:03,358 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:19:05,276 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=593960.0, ans=0.125 2023-09-30 04:19:09,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-30 04:19:09,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-30 04:19:11,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-30 04:19:11,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-30 04:19:11,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:19:13,626 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:19:13,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:19:13,691 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:19:14,014 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=593960.0, ans=0.2 2023-09-30 04:19:15,169 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-30 04:19:20,216 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:19:20,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:19:20,373 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:19:21,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:19:24,773 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.428e+02 1.800e+02 1.954e+02 2.140e+02 3.094e+02, threshold=3.909e+02, percent-clipped=0.0 2023-09-30 04:19:25,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 04:19:26,657 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:19:26,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:19:28,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-30 04:19:30,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:19:30,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:19:30,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:19:32,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:19:32,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-30 04:19:35,440 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:19:35,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-30 04:19:38,742 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:19:40,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:19:40,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-30 04:19:41,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:19:43,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:19:43,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:19:45,065 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=594093.3333333334, ans=0.09899494936611666 2023-09-30 04:19:46,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-30 04:19:49,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:19:51,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:19:53,106 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-30 04:19:55,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:19:55,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-30 04:19:58,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:20:00,309 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=594160.0, ans=0.1 2023-09-30 04:20:01,621 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:20:05,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:20:07,328 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:20:13,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:20:13,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:20:16,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:20:20,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:20:20,329 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=594226.6666666666, ans=0.125 2023-09-30 04:20:23,665 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-30 04:20:25,117 INFO [train.py:1039] (2/4) Epoch 17, batch 4150, loss[loss=0.1856, simple_loss=0.2602, pruned_loss=0.05546, over 23242.00 frames. ], tot_loss[loss=0.1828, simple_loss=0.2575, pruned_loss=0.05401, over 4719294.60 frames. ], batch size: 105, lr: 6.02e-03, grad_scale: 16.0 2023-09-30 04:20:25,177 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:20:25,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:20:25,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:20:29,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-30 04:20:29,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:20:30,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-30 04:20:30,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-30 04:20:30,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-30 04:20:33,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:20:37,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:20:37,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:20:41,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:20:41,408 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:20:42,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-30 04:20:42,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 04:20:44,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:20:46,120 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-30 04:20:50,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:20:50,774 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=594360.0, ans=0.2 2023-09-30 04:20:56,303 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:20:57,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-30 04:20:59,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-30 04:20:59,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:21:01,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-30 04:21:01,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:21:01,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:21:04,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:21:05,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:21:10,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-30 04:21:11,166 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=594426.6666666666, ans=10.0 2023-09-30 04:21:14,022 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-30 04:21:15,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:21:17,612 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-30 04:21:17,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:21:20,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-30 04:21:20,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:21:22,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:21:25,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:21:26,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-30 04:21:26,725 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:21:26,740 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-30 04:21:28,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 04:21:29,044 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=594493.3333333334, ans=0.0 2023-09-30 04:21:30,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-30 04:21:30,486 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:21:30,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 04:21:32,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 04:21:32,840 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=594560.0, ans=0.1 2023-09-30 04:21:34,020 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-30 04:21:34,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:21:34,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 04:21:35,549 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:21:37,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:21:37,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-30 04:21:38,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-30 04:21:41,073 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.39 vs. limit=15.0 2023-09-30 04:21:45,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:21:47,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-30 04:21:48,365 INFO [train.py:1039] (2/4) Epoch 17, batch 4200, loss[loss=0.189, simple_loss=0.2547, pruned_loss=0.06161, over 23766.00 frames. ], tot_loss[loss=0.182, simple_loss=0.2568, pruned_loss=0.05364, over 4716404.11 frames. ], batch size: 212, lr: 6.02e-03, grad_scale: 16.0 2023-09-30 04:21:48,590 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:21:52,185 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:21:53,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:21:55,220 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:21:55,223 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:21:56,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-30 04:22:00,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-30 04:22:00,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:22:03,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:22:08,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:22:12,240 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.960e+02 2.323e+02 2.795e+02 4.279e+02, threshold=4.646e+02, percent-clipped=2.0 2023-09-30 04:22:12,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-30 04:22:15,399 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:22:15,448 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:22:16,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-30 04:22:16,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:22:18,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:22:18,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:22:18,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:22:20,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:22:23,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-30 04:22:23,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:22:27,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-30 04:22:29,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:22:29,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:22:32,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:22:34,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:22:34,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-30 04:22:34,452 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=594760.0, ans=0.0 2023-09-30 04:22:34,852 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.51 vs. limit=15.0 2023-09-30 04:22:35,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:22:35,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:22:40,143 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.77 vs. limit=15.0 2023-09-30 04:22:41,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-30 04:22:44,038 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:22:44,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=594826.6666666666, ans=0.0 2023-09-30 04:22:49,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:22:52,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-30 04:22:54,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:23:00,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 04:23:00,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:23:03,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-30 04:23:04,327 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=594893.3333333334, ans=0.1 2023-09-30 04:23:07,985 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.55 vs. limit=15.0 2023-09-30 04:23:10,833 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-30 04:23:12,284 INFO [train.py:1039] (2/4) Epoch 17, batch 4250, loss[loss=0.1917, simple_loss=0.2606, pruned_loss=0.06143, over 23509.00 frames. ], tot_loss[loss=0.1815, simple_loss=0.256, pruned_loss=0.0535, over 4714896.68 frames. ], batch size: 134, lr: 6.02e-03, grad_scale: 16.0 2023-09-30 04:23:15,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:23:15,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-30 04:23:17,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:23:23,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-30 04:23:25,079 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-30 04:23:25,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:23:28,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:23:32,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:23:36,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:23:36,858 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:23:40,211 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:23:40,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:23:41,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:23:41,916 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:23:43,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:23:46,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:23:48,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:23:50,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-30 04:23:53,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-30 04:23:53,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:23:53,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:23:53,473 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:23:54,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:23:54,948 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:23:55,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:23:58,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-30 04:24:00,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:24:05,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:24:07,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:24:07,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-30 04:24:07,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:24:09,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-30 04:24:11,272 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:24:12,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-30 04:24:14,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:24:14,481 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=595160.0, ans=0.015 2023-09-30 04:24:15,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:24:17,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-30 04:24:19,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 04:24:19,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-30 04:24:25,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:24:28,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:24:30,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:24:30,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:24:31,908 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:24:33,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:24:34,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:24:34,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-30 04:24:35,510 INFO [train.py:1039] (2/4) Epoch 17, batch 4300, loss[loss=0.1787, simple_loss=0.2449, pruned_loss=0.05627, over 23632.00 frames. ], tot_loss[loss=0.1812, simple_loss=0.2557, pruned_loss=0.05339, over 4722664.98 frames. ], batch size: 135, lr: 6.02e-03, grad_scale: 8.0 2023-09-30 04:24:35,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:24:40,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:24:40,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:24:46,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:24:55,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:24:55,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-30 04:24:55,978 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=595360.0, ans=0.125 2023-09-30 04:24:57,031 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:24:57,273 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=595360.0, ans=0.125 2023-09-30 04:24:57,375 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=595360.0, ans=0.125 2023-09-30 04:24:59,972 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.465e+02 1.855e+02 2.115e+02 2.472e+02 4.142e+02, threshold=4.230e+02, percent-clipped=0.0 2023-09-30 04:25:00,090 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-30 04:25:00,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:25:00,153 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-30 04:25:03,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 04:25:04,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:25:06,484 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=595426.6666666666, ans=0.125 2023-09-30 04:25:08,370 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-30 04:25:08,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:25:09,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-30 04:25:11,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 04:25:15,172 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:25:18,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:25:18,370 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:25:19,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:25:21,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:25:23,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:25:23,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-30 04:25:24,922 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-30 04:25:26,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:25:29,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:25:29,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 04:25:29,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:25:29,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:25:30,380 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.81 vs. limit=15.0 2023-09-30 04:25:31,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-30 04:25:31,814 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-30 04:25:33,370 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-30 04:25:34,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:25:34,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-30 04:25:36,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-30 04:25:39,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:25:41,047 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-30 04:25:41,140 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:25:44,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:25:44,694 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:25:46,385 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-30 04:25:47,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:25:47,812 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:25:49,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:25:49,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:25:50,371 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=595560.0, ans=0.2 2023-09-30 04:25:51,431 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:25:53,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:25:53,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:25:54,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:25:54,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:25:55,222 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 04:25:57,830 INFO [train.py:1039] (2/4) Epoch 17, batch 4350, loss[loss=0.171, simple_loss=0.2629, pruned_loss=0.03953, over 24689.00 frames. ], tot_loss[loss=0.182, simple_loss=0.2568, pruned_loss=0.05361, over 4732460.49 frames. ], batch size: 73, lr: 6.02e-03, grad_scale: 8.0 2023-09-30 04:26:00,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-30 04:26:01,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-30 04:26:05,492 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=595626.6666666666, ans=0.125 2023-09-30 04:26:06,580 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:26:09,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:26:11,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:26:11,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:26:16,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:26:16,706 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=595693.3333333334, ans=0.125 2023-09-30 04:26:20,236 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:26:24,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:26:24,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:26:27,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-30 04:26:30,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:26:32,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-30 04:26:37,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-30 04:26:39,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:26:41,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:26:44,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:26:47,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-30 04:26:51,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:26:51,495 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=595826.6666666666, ans=0.5 2023-09-30 04:26:52,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 04:26:56,212 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.10 vs. limit=15.0 2023-09-30 04:26:59,266 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-30 04:27:00,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:27:00,908 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-30 04:27:03,043 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.69 vs. limit=12.0 2023-09-30 04:27:03,793 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-30 04:27:03,915 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-30 04:27:03,924 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:27:03,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:27:05,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:27:05,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:27:07,067 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:27:07,141 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:27:10,222 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-30 04:27:10,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:27:10,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:27:10,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:27:11,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-30 04:27:13,939 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-30 04:27:13,947 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-30 04:27:13,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-30 04:27:15,716 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=595893.3333333334, ans=0.0 2023-09-30 04:27:18,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:27:19,883 INFO [train.py:1039] (2/4) Epoch 17, batch 4400, loss[loss=0.1784, simple_loss=0.2633, pruned_loss=0.04671, over 24542.00 frames. ], tot_loss[loss=0.1827, simple_loss=0.2574, pruned_loss=0.054, over 4736165.84 frames. ], batch size: 71, lr: 6.01e-03, grad_scale: 16.0 2023-09-30 04:27:19,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:27:19,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:27:21,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:27:23,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-30 04:27:24,653 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-30 04:27:24,664 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:27:28,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:27:28,243 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:27:30,149 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:27:31,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-30 04:27:33,744 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-30 04:27:33,812 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-30 04:27:33,856 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-30 04:27:35,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 04:27:35,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:27:38,620 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-30 04:27:40,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:27:41,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:27:41,872 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-30 04:27:44,840 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.492e+02 1.901e+02 2.144e+02 2.567e+02 3.604e+02, threshold=4.289e+02, percent-clipped=0.0 2023-09-30 04:27:45,007 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:27:45,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-30 04:27:46,419 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-30 04:27:48,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-30 04:27:48,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-30 04:27:50,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-30 04:27:50,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:27:52,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:27:52,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:27:52,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:27:54,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-30 04:27:54,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-30 04:27:55,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:27:57,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:27:57,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:27:59,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:28:01,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:28:01,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-30 04:28:01,151 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-30 04:28:05,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:28:13,898 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:28:16,990 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-30 04:28:20,049 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:28:23,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:28:27,399 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:28:27,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-30 04:28:27,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:28:27,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:28:27,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:28:28,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-30 04:28:29,302 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=596226.6666666666, ans=0.2 2023-09-30 04:28:32,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-30 04:28:32,519 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=596226.6666666666, ans=0.0 2023-09-30 04:28:32,598 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=596226.6666666666, ans=0.125 2023-09-30 04:28:32,605 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=596226.6666666666, ans=0.125 2023-09-30 04:28:32,629 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=596226.6666666666, ans=0.2 2023-09-30 04:28:35,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-30 04:28:35,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-30 04:28:36,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:28:37,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-30 04:28:37,789 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=596226.6666666666, ans=0.125 2023-09-30 04:28:38,909 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:28:40,663 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:28:43,412 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.32 vs. limit=22.5 2023-09-30 04:28:44,017 INFO [train.py:1039] (2/4) Epoch 17, batch 4450, loss[loss=0.1515, simple_loss=0.2271, pruned_loss=0.03789, over 24431.00 frames. ], tot_loss[loss=0.1839, simple_loss=0.2581, pruned_loss=0.05484, over 4712164.22 frames. ], batch size: 58, lr: 6.01e-03, grad_scale: 16.0 2023-09-30 04:28:44,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-30 04:28:47,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:28:48,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:28:48,925 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:28:55,599 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:28:57,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:29:00,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:29:03,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:29:04,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:29:04,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:29:07,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-30 04:29:07,396 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:29:07,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:29:07,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:29:07,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:29:10,752 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 04:29:16,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:29:16,371 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:29:16,532 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:29:18,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:29:18,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:29:22,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 04:29:24,443 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-30 04:29:24,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-30 04:29:24,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:29:26,272 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=596426.6666666666, ans=0.125 2023-09-30 04:29:29,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:29:30,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-30 04:29:34,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-30 04:29:40,235 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:29:41,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-30 04:29:41,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:29:41,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:29:41,745 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:29:41,756 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:29:42,039 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=596493.3333333334, ans=0.125 2023-09-30 04:29:43,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:29:47,145 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-30 04:29:47,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-30 04:29:50,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 04:29:51,821 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:29:53,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:29:54,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:29:56,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 04:29:59,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-30 04:30:01,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-30 04:30:02,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:30:05,097 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=596560.0, ans=0.0 2023-09-30 04:30:07,622 INFO [train.py:1039] (2/4) Epoch 17, batch 4500, loss[loss=0.1586, simple_loss=0.2356, pruned_loss=0.04083, over 24298.00 frames. ], tot_loss[loss=0.1847, simple_loss=0.2582, pruned_loss=0.05561, over 4696566.51 frames. ], batch size: 56, lr: 6.01e-03, grad_scale: 16.0 2023-09-30 04:30:07,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:30:09,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-30 04:30:09,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-30 04:30:11,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:30:12,172 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.73 vs. limit=15.0 2023-09-30 04:30:19,897 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:30:19,985 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:30:21,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 04:30:23,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:30:23,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:30:23,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:30:32,266 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.445e+02 1.892e+02 2.101e+02 2.322e+02 3.249e+02, threshold=4.203e+02, percent-clipped=0.0 2023-09-30 04:30:35,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:30:35,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:30:38,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:30:39,726 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.60 vs. limit=15.0 2023-09-30 04:30:40,925 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:30:41,081 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:30:47,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=596760.0, ans=0.125 2023-09-30 04:30:49,168 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 04:30:52,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:30:56,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 04:31:00,082 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:31:01,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-30 04:31:01,645 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:31:03,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:31:04,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:31:06,196 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:31:06,695 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=596826.6666666666, ans=0.1 2023-09-30 04:31:07,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:31:07,875 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-30 04:31:07,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 04:31:07,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:31:13,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:31:14,033 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:31:18,993 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:31:22,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-30 04:31:22,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:31:25,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-30 04:31:25,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-30 04:31:25,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-30 04:31:26,092 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=596893.3333333334, ans=0.125 2023-09-30 04:31:30,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-30 04:31:31,541 INFO [train.py:1039] (2/4) Epoch 17, batch 4550, loss[loss=0.1538, simple_loss=0.2333, pruned_loss=0.03716, over 20306.00 frames. ], tot_loss[loss=0.1843, simple_loss=0.2578, pruned_loss=0.05539, over 4698869.55 frames. ], batch size: 44, lr: 6.01e-03, grad_scale: 16.0 2023-09-30 04:31:34,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-30 04:31:35,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:31:38,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:31:39,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:31:43,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:31:48,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:31:48,361 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=597026.6666666666, ans=0.2 2023-09-30 04:31:49,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:31:51,975 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:31:51,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:31:51,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:31:53,653 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:31:53,724 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:31:57,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:32:00,422 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-30 04:32:01,767 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-30 04:32:03,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:32:04,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-30 04:32:06,190 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.86 vs. limit=10.0 2023-09-30 04:32:07,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-30 04:32:08,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:32:11,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-30 04:32:11,970 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=597093.3333333334, ans=0.0 2023-09-30 04:32:13,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 04:32:14,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:32:16,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:32:16,416 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:32:18,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-30 04:32:21,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:32:24,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:32:25,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:32:26,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:32:26,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-30 04:32:28,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-30 04:32:28,635 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:32:30,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-30 04:32:31,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-30 04:32:31,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:32:35,231 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:32:35,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:32:37,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:32:37,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:32:39,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 04:32:39,360 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=597226.6666666666, ans=0.125 2023-09-30 04:32:39,463 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=597226.6666666666, ans=0.0 2023-09-30 04:32:40,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-30 04:32:42,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:32:43,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 04:32:43,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-30 04:32:43,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:32:43,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-30 04:32:46,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:32:46,998 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:32:50,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:32:50,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:32:50,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-30 04:32:52,981 INFO [train.py:1039] (2/4) Epoch 17, batch 4600, loss[loss=0.1624, simple_loss=0.2356, pruned_loss=0.04461, over 24309.00 frames. ], tot_loss[loss=0.183, simple_loss=0.2563, pruned_loss=0.05489, over 4703550.31 frames. ], batch size: 56, lr: 6.01e-03, grad_scale: 16.0 2023-09-30 04:32:53,073 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:32:54,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-30 04:32:57,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:32:59,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:33:03,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:33:03,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:33:05,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:33:05,485 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-30 04:33:08,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:33:13,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:33:13,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:33:17,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:33:18,330 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.818e+02 2.016e+02 2.179e+02 3.781e+02, threshold=4.032e+02, percent-clipped=0.0 2023-09-30 04:33:23,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-30 04:33:23,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:33:26,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:33:26,476 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=597426.6666666666, ans=0.125 2023-09-30 04:33:29,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:33:30,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:33:36,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-30 04:33:36,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 04:33:38,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:33:42,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:33:44,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-30 04:33:46,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:33:46,634 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=597493.3333333334, ans=0.0 2023-09-30 04:33:48,954 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.80 vs. limit=15.0 2023-09-30 04:33:50,056 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-30 04:33:51,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-30 04:33:53,367 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=597493.3333333334, ans=0.2 2023-09-30 04:33:56,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:33:57,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:33:57,898 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=597560.0, ans=0.0 2023-09-30 04:34:00,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:34:00,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 04:34:00,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:34:00,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-30 04:34:00,834 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:34:00,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:34:03,825 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:34:03,940 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:34:04,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:34:05,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-30 04:34:06,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-30 04:34:06,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-30 04:34:06,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:34:09,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:34:09,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:34:11,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:34:15,485 INFO [train.py:1039] (2/4) Epoch 17, batch 4650, loss[loss=0.1903, simple_loss=0.2564, pruned_loss=0.06212, over 23548.00 frames. ], tot_loss[loss=0.1824, simple_loss=0.2559, pruned_loss=0.0545, over 4710491.64 frames. ], batch size: 135, lr: 6.01e-03, grad_scale: 16.0 2023-09-30 04:34:19,574 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=597626.6666666666, ans=0.125 2023-09-30 04:34:20,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:34:26,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:34:26,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:34:27,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:34:27,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:34:27,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:34:27,904 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:34:32,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-30 04:34:35,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:34:38,827 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-30 04:34:38,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:34:40,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-30 04:34:40,328 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:34:40,418 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-30 04:34:42,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-30 04:34:42,433 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:34:43,927 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:34:45,741 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:34:47,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:34:47,304 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-30 04:34:49,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=597760.0, ans=0.0 2023-09-30 04:34:50,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:34:52,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-30 04:34:56,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:34:56,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:34:56,250 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-30 04:34:59,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:35:00,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:35:05,965 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:35:09,741 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.08 vs. limit=22.5 2023-09-30 04:35:10,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:35:13,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:35:15,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:35:15,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:35:18,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-30 04:35:18,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-30 04:35:20,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 04:35:20,409 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-30 04:35:22,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:35:26,245 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.35 vs. limit=15.0 2023-09-30 04:35:30,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:35:30,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:35:31,036 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-30 04:35:31,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:35:32,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:35:32,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:35:34,199 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:35:35,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:35:35,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:35:36,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:35:39,567 INFO [train.py:1039] (2/4) Epoch 17, batch 4700, loss[loss=0.2009, simple_loss=0.2672, pruned_loss=0.06726, over 23718.00 frames. ], tot_loss[loss=0.1831, simple_loss=0.2567, pruned_loss=0.05471, over 4717947.28 frames. ], batch size: 232, lr: 6.00e-03, grad_scale: 16.0 2023-09-30 04:35:39,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:35:39,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:35:40,570 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.81 vs. limit=6.0 2023-09-30 04:35:41,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 04:35:42,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-30 04:35:44,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-30 04:35:44,334 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-30 04:35:52,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:35:55,142 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:35:55,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:35:56,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:35:59,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 04:36:03,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-30 04:36:03,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-30 04:36:04,678 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.407e+02 1.853e+02 2.008e+02 2.274e+02 4.210e+02, threshold=4.016e+02, percent-clipped=1.0 2023-09-30 04:36:06,525 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:36:08,004 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:36:08,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:36:13,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:36:17,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 04:36:19,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 04:36:22,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:36:31,133 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=598160.0, ans=0.0 2023-09-30 04:36:32,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-30 04:36:33,894 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:36:35,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:36:38,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-30 04:36:38,735 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:36:43,903 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:36:43,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-30 04:36:45,689 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=598226.6666666666, ans=0.1 2023-09-30 04:36:46,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:36:46,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:36:50,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:36:50,557 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:36:51,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-30 04:36:52,036 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-30 04:36:54,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:36:57,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:36:57,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:36:57,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-30 04:36:57,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:36:58,181 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=598226.6666666666, ans=0.125 2023-09-30 04:37:00,899 INFO [train.py:1039] (2/4) Epoch 17, batch 4750, loss[loss=0.185, simple_loss=0.2631, pruned_loss=0.05351, over 24643.00 frames. ], tot_loss[loss=0.1829, simple_loss=0.2568, pruned_loss=0.05449, over 4714220.71 frames. ], batch size: 65, lr: 6.00e-03, grad_scale: 16.0 2023-09-30 04:37:03,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-30 04:37:06,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:37:06,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:37:08,536 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 04:37:12,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:37:12,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:37:15,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-30 04:37:16,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:37:19,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-30 04:37:21,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:37:21,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:37:22,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:37:26,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-30 04:37:28,682 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=598360.0, ans=0.125 2023-09-30 04:37:32,636 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:37:34,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-30 04:37:36,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:37:37,050 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=598426.6666666666, ans=0.025 2023-09-30 04:37:39,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:37:39,668 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:37:39,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:37:39,835 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-30 04:37:39,840 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-30 04:37:46,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-30 04:37:48,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:37:51,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:37:54,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 04:37:54,650 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-30 04:37:54,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:37:57,089 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.19 vs. limit=12.0 2023-09-30 04:37:59,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:38:00,127 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=598493.3333333334, ans=0.125 2023-09-30 04:38:01,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:38:03,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-30 04:38:03,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-30 04:38:03,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:38:03,562 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=598493.3333333334, ans=0.0 2023-09-30 04:38:03,792 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.10 vs. limit=15.0 2023-09-30 04:38:04,527 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:38:04,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:38:06,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 04:38:06,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-30 04:38:09,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-30 04:38:11,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:38:13,409 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=10.92 vs. limit=15.0 2023-09-30 04:38:15,951 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:38:15,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-30 04:38:16,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:38:19,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:38:19,887 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=598560.0, ans=0.2 2023-09-30 04:38:21,142 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-30 04:38:21,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:38:22,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 04:38:24,080 INFO [train.py:1039] (2/4) Epoch 17, batch 4800, loss[loss=0.2047, simple_loss=0.268, pruned_loss=0.07064, over 23942.00 frames. ], tot_loss[loss=0.1831, simple_loss=0.2575, pruned_loss=0.05436, over 4723352.18 frames. ], batch size: 195, lr: 6.00e-03, grad_scale: 32.0 2023-09-30 04:38:26,437 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:38:26,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-30 04:38:26,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-30 04:38:28,185 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-30 04:38:31,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-30 04:38:32,595 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:38:34,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-30 04:38:39,550 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:38:39,636 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:38:44,199 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:38:47,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:38:47,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:38:47,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-30 04:38:49,112 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.366e+02 1.827e+02 2.030e+02 2.375e+02 4.462e+02, threshold=4.061e+02, percent-clipped=1.0 2023-09-30 04:38:49,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:38:49,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:38:51,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:38:54,484 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:38:56,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:38:56,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:38:57,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:38:57,696 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 04:38:57,726 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:38:59,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:39:00,164 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=598760.0, ans=0.125 2023-09-30 04:39:02,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:39:05,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:39:07,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:39:07,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-30 04:39:07,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 04:39:09,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:39:11,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-30 04:39:11,093 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-30 04:39:12,687 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:39:12,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:39:12,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:39:12,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:39:12,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:39:16,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:39:16,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:39:19,729 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:39:23,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:39:23,774 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:39:28,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-30 04:39:29,860 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:39:29,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:39:29,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 04:39:30,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:39:33,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:39:35,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:39:35,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:39:36,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:39:37,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:39:37,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:39:39,345 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.53 vs. limit=15.0 2023-09-30 04:39:42,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:39:43,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:39:43,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:39:45,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-30 04:39:48,819 INFO [train.py:1039] (2/4) Epoch 17, batch 4850, loss[loss=0.1641, simple_loss=0.2365, pruned_loss=0.04586, over 24319.00 frames. ], tot_loss[loss=0.1845, simple_loss=0.2586, pruned_loss=0.05521, over 4698492.15 frames. ], batch size: 56, lr: 6.00e-03, grad_scale: 32.0 2023-09-30 04:39:48,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-30 04:39:48,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:39:48,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:39:49,035 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:39:49,037 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:39:52,142 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:40:01,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-30 04:40:03,440 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:40:09,987 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:40:11,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 04:40:11,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:40:14,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:40:16,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:40:17,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:40:17,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-30 04:40:21,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=599093.3333333334, ans=0.125 2023-09-30 04:40:23,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:40:24,745 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:40:26,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 04:40:26,816 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 04:40:26,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-30 04:40:29,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:40:29,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:40:33,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:40:33,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-30 04:40:33,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-30 04:40:35,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 04:40:42,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:40:43,493 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-30 04:40:45,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:40:45,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:40:48,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-30 04:40:48,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-30 04:40:48,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:40:49,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-30 04:40:49,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:40:50,858 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.93 vs. limit=15.0 2023-09-30 04:40:51,248 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:40:53,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-30 04:41:02,085 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=599226.6666666666, ans=0.125 2023-09-30 04:41:03,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:41:08,681 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:41:08,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:41:11,687 INFO [train.py:1039] (2/4) Epoch 17, batch 4900, loss[loss=0.1545, simple_loss=0.2329, pruned_loss=0.03805, over 24686.00 frames. ], tot_loss[loss=0.1835, simple_loss=0.2582, pruned_loss=0.0544, over 4718771.12 frames. ], batch size: 65, lr: 6.00e-03, grad_scale: 32.0 2023-09-30 04:41:15,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-30 04:41:15,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:41:19,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:41:21,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:41:21,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:41:23,713 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.29 vs. limit=15.0 2023-09-30 04:41:24,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-30 04:41:30,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-30 04:41:32,088 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=599360.0, ans=0.05 2023-09-30 04:41:34,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-30 04:41:36,397 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.673e+02 1.955e+02 2.270e+02 2.616e+02 3.974e+02, threshold=4.540e+02, percent-clipped=0.0 2023-09-30 04:41:36,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-30 04:41:36,595 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-30 04:41:36,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:41:36,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:41:36,703 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:41:36,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:41:38,168 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-30 04:41:38,494 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=599360.0, ans=0.125 2023-09-30 04:41:43,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-30 04:41:43,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 04:41:43,735 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=599426.6666666666, ans=0.04949747468305833 2023-09-30 04:41:44,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-30 04:41:45,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-30 04:41:46,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:41:48,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:41:48,303 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:41:48,320 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-30 04:41:50,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:41:52,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:41:52,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-30 04:41:52,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-30 04:41:56,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-30 04:41:58,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:41:59,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:41:59,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:42:01,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:42:01,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 04:42:01,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:42:03,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-30 04:42:05,354 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:42:07,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-30 04:42:09,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:42:15,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-30 04:42:15,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:42:15,225 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-30 04:42:15,566 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=599493.3333333334, ans=0.0 2023-09-30 04:42:16,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-30 04:42:21,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:42:23,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:42:25,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-30 04:42:25,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 04:42:26,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:42:28,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:42:29,925 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=599560.0, ans=0.05 2023-09-30 04:42:31,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:42:31,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:42:31,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:42:32,649 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-30 04:42:34,753 INFO [train.py:1039] (2/4) Epoch 17, batch 4950, loss[loss=0.1826, simple_loss=0.2701, pruned_loss=0.04758, over 24362.00 frames. ], tot_loss[loss=0.1822, simple_loss=0.2563, pruned_loss=0.05407, over 4706955.43 frames. ], batch size: 74, lr: 6.00e-03, grad_scale: 32.0 2023-09-30 04:42:34,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 04:42:39,922 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:42:39,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 04:42:42,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-30 04:42:44,352 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-30 04:42:44,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-30 04:42:45,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-30 04:42:45,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:42:45,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:42:47,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:42:47,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:42:49,363 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:42:49,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:42:51,008 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:42:52,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:42:55,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:42:55,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:42:59,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 04:42:59,593 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=599693.3333333334, ans=0.0 2023-09-30 04:43:05,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:43:06,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:43:08,368 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:43:08,452 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:43:11,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:43:12,151 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-30 04:43:12,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-30 04:43:15,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:43:18,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:43:18,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:43:18,895 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=599760.0, ans=0.125 2023-09-30 04:43:20,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:43:21,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:43:21,762 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:43:25,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:43:26,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-30 04:43:28,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:43:31,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:43:31,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:43:31,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-30 04:43:33,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:43:35,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:43:36,116 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.58 vs. limit=10.0 2023-09-30 04:43:38,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:43:39,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:43:39,956 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:43:40,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:43:41,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:43:41,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:43:44,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:43:44,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:43:44,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:43:46,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-30 04:43:51,378 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:43:55,624 INFO [train.py:1039] (2/4) Epoch 17, batch 5000, loss[loss=0.1858, simple_loss=0.2555, pruned_loss=0.05809, over 23677.00 frames. ], tot_loss[loss=0.1816, simple_loss=0.2556, pruned_loss=0.05381, over 4715012.77 frames. ], batch size: 256, lr: 5.99e-03, grad_scale: 16.0 2023-09-30 04:43:57,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-30 04:43:57,740 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-30 04:44:05,672 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:44:05,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-30 04:44:07,249 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-30 04:44:07,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-30 04:44:09,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:44:11,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-30 04:44:11,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:44:11,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:44:12,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-30 04:44:14,284 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:44:16,238 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:44:16,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-30 04:44:17,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:44:17,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:44:20,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-30 04:44:20,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-30 04:44:20,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:44:22,122 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 1.867e+02 2.091e+02 2.376e+02 3.821e+02, threshold=4.182e+02, percent-clipped=0.0 2023-09-30 04:44:22,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-30 04:44:22,327 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 04:44:23,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:44:23,821 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 04:44:23,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-30 04:44:23,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-30 04:44:26,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-30 04:44:27,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:44:27,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:44:29,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-30 04:44:29,303 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-30 04:44:32,293 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:44:33,310 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=600093.3333333334, ans=0.125 2023-09-30 04:44:34,364 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:44:34,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-30 04:44:36,137 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-30 04:44:36,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:44:39,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:44:41,266 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=600093.3333333334, ans=0.0 2023-09-30 04:44:42,443 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-30 04:44:44,643 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:44:46,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:44:46,222 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:44:48,047 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=600160.0, ans=0.2 2023-09-30 04:44:49,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-30 04:44:49,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:44:49,746 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=600160.0, ans=0.125 2023-09-30 04:44:51,564 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:44:51,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:44:53,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-30 04:44:53,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:44:56,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:44:58,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:45:03,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-30 04:45:06,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:45:12,200 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.15 vs. limit=15.0 2023-09-30 04:45:18,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:45:19,650 INFO [train.py:1039] (2/4) Epoch 17, batch 5050, loss[loss=0.1839, simple_loss=0.2466, pruned_loss=0.06058, over 23380.00 frames. ], tot_loss[loss=0.1822, simple_loss=0.2561, pruned_loss=0.05418, over 4714089.58 frames. ], batch size: 119, lr: 5.99e-03, grad_scale: 16.0 2023-09-30 04:45:19,752 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:45:19,763 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:45:19,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:45:19,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 04:45:19,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-30 04:45:19,958 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:45:23,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:45:23,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-30 04:45:25,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:45:28,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:45:29,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:45:29,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-30 04:45:30,019 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=600293.3333333334, ans=0.0 2023-09-30 04:45:31,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:45:31,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:45:34,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 04:45:36,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:45:36,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-30 04:45:46,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-30 04:45:46,297 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-30 04:45:48,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:45:48,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-30 04:45:48,701 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:45:50,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:45:50,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:45:51,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:45:51,779 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-30 04:45:51,930 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-30 04:45:53,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:45:54,074 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=11.11 vs. limit=15.0 2023-09-30 04:45:55,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:46:00,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:46:00,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-30 04:46:03,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:46:05,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-30 04:46:07,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:46:07,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:46:07,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:46:08,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:46:11,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:46:13,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:46:15,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:46:15,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:46:15,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:46:15,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-30 04:46:15,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:46:16,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:46:20,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:46:20,257 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-30 04:46:20,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-30 04:46:23,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:46:25,293 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:46:25,341 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-30 04:46:28,630 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=600560.0, ans=0.1 2023-09-30 04:46:29,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:46:29,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-30 04:46:29,948 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:46:34,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:46:35,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:46:35,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-30 04:46:37,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-30 04:46:38,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:46:40,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:46:40,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:46:42,020 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-30 04:46:43,276 INFO [train.py:1039] (2/4) Epoch 17, batch 5100, loss[loss=0.1908, simple_loss=0.2756, pruned_loss=0.05297, over 24632.00 frames. ], tot_loss[loss=0.1819, simple_loss=0.2564, pruned_loss=0.05364, over 4734850.22 frames. ], batch size: 68, lr: 5.99e-03, grad_scale: 16.0 2023-09-30 04:46:44,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:46:48,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-30 04:46:49,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-30 04:46:50,119 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=600626.6666666666, ans=0.0 2023-09-30 04:46:51,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:46:52,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:46:55,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:46:55,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-30 04:46:55,995 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-30 04:46:57,874 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=600693.3333333334, ans=0.0 2023-09-30 04:47:02,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:47:04,308 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:47:08,741 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.763e+02 1.981e+02 2.119e+02 3.450e+02, threshold=3.962e+02, percent-clipped=0.0 2023-09-30 04:47:08,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:47:09,332 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=600693.3333333334, ans=0.5 2023-09-30 04:47:11,432 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=600693.3333333334, ans=0.125 2023-09-30 04:47:12,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-30 04:47:14,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:47:16,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:47:16,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-30 04:47:19,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:47:19,746 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:47:19,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-30 04:47:22,862 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-30 04:47:22,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:47:24,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-30 04:47:24,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-30 04:47:26,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:47:37,869 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:47:39,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-30 04:47:39,523 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-30 04:47:39,536 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-30 04:47:41,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-30 04:47:41,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:47:44,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-30 04:47:44,491 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=600826.6666666666, ans=0.125 2023-09-30 04:47:50,160 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-30 04:47:53,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 04:47:54,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:47:56,467 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=600893.3333333334, ans=0.2 2023-09-30 04:47:56,496 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=600893.3333333334, ans=0.125 2023-09-30 04:47:57,604 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-30 04:47:59,203 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-30 04:47:59,255 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-30 04:48:02,153 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.71 vs. limit=6.0 2023-09-30 04:48:05,691 INFO [train.py:1039] (2/4) Epoch 17, batch 5150, loss[loss=0.2079, simple_loss=0.2804, pruned_loss=0.0677, over 23375.00 frames. ], tot_loss[loss=0.1837, simple_loss=0.2584, pruned_loss=0.05453, over 4720776.84 frames. ], batch size: 93, lr: 5.99e-03, grad_scale: 16.0 2023-09-30 04:48:05,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:48:05,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:48:05,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:48:05,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:48:07,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 04:48:07,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:48:08,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-30 04:48:08,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-30 04:48:09,037 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-30 04:48:09,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:48:09,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-30 04:48:11,205 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:48:12,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 04:48:12,817 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:48:14,338 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:48:16,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=600960.0, ans=0.2 2023-09-30 04:48:20,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 04:48:20,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-30 04:48:22,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:48:22,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:48:26,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-30 04:48:26,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:48:26,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:48:26,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:48:26,298 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:48:27,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-30 04:48:27,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:48:29,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:48:31,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 04:48:32,772 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-30 04:48:34,126 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.54 vs. limit=10.0 2023-09-30 04:48:34,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:48:36,568 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=601026.6666666666, ans=0.0 2023-09-30 04:48:40,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:48:42,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-30 04:48:46,022 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:48:52,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:48:53,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:49:00,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:49:00,419 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:49:03,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-30 04:49:05,327 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=601160.0, ans=0.125 2023-09-30 04:49:06,647 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:49:06,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:49:08,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:49:11,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:49:11,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:49:13,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-30 04:49:18,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:49:20,273 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 04:49:23,301 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:49:23,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:49:23,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-30 04:49:24,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-30 04:49:24,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:49:25,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:49:27,923 INFO [train.py:1039] (2/4) Epoch 17, batch 5200, loss[loss=0.1878, simple_loss=0.2519, pruned_loss=0.0618, over 23534.00 frames. ], tot_loss[loss=0.1842, simple_loss=0.2587, pruned_loss=0.05478, over 4723302.76 frames. ], batch size: 120, lr: 5.99e-03, grad_scale: 32.0 2023-09-30 04:49:29,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:49:31,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:49:35,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:49:37,549 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=601293.3333333334, ans=0.1 2023-09-30 04:49:40,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-30 04:49:42,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:49:44,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:49:45,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:49:47,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:49:47,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:49:47,869 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=601360.0, ans=0.1 2023-09-30 04:49:50,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-30 04:49:53,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 04:49:53,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:49:55,060 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.517e+02 1.818e+02 1.970e+02 2.155e+02 3.146e+02, threshold=3.941e+02, percent-clipped=0.0 2023-09-30 04:49:55,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-30 04:49:57,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-30 04:49:57,753 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=601360.0, ans=0.2 2023-09-30 04:49:59,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:49:59,225 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-30 04:50:00,713 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-30 04:50:03,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-30 04:50:03,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:50:03,804 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-30 04:50:05,866 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:50:06,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:50:07,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:50:08,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-30 04:50:08,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:50:10,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:50:14,308 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-30 04:50:14,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-30 04:50:14,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-30 04:50:20,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-30 04:50:21,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 04:50:27,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-30 04:50:27,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:50:28,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-30 04:50:30,580 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:50:30,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-30 04:50:30,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:50:30,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 04:50:31,333 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.71 vs. limit=22.5 2023-09-30 04:50:33,863 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=601560.0, ans=0.125 2023-09-30 04:50:35,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:50:36,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:50:40,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:50:41,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:50:41,967 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:50:46,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:50:48,718 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-30 04:50:50,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:50:50,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:50:50,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:50:51,667 INFO [train.py:1039] (2/4) Epoch 17, batch 5250, loss[loss=0.1844, simple_loss=0.2574, pruned_loss=0.05577, over 23440.00 frames. ], tot_loss[loss=0.1835, simple_loss=0.258, pruned_loss=0.05451, over 4707800.61 frames. ], batch size: 93, lr: 5.99e-03, grad_scale: 32.0 2023-09-30 04:50:51,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-30 04:50:53,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:50:56,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:50:58,821 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=601626.6666666666, ans=0.04949747468305833 2023-09-30 04:51:00,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:51:00,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:51:01,769 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:51:07,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:51:07,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:51:07,533 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=601693.3333333334, ans=0.0 2023-09-30 04:51:08,950 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=601693.3333333334, ans=0.0 2023-09-30 04:51:11,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:51:13,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 04:51:15,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-30 04:51:15,379 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:51:18,365 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:51:32,700 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.98 vs. limit=22.5 2023-09-30 04:51:48,521 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=601826.6666666666, ans=0.2 2023-09-30 04:52:06,465 INFO [train.py:1039] (2/4) Epoch 17, batch 5300, loss[loss=0.1997, simple_loss=0.2701, pruned_loss=0.06467, over 23375.00 frames. ], tot_loss[loss=0.1822, simple_loss=0.2556, pruned_loss=0.05446, over 4701326.86 frames. ], batch size: 119, lr: 5.98e-03, grad_scale: 16.0 2023-09-30 04:52:10,917 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=601960.0, ans=0.0 2023-09-30 04:52:15,341 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=601960.0, ans=0.0 2023-09-30 04:52:16,542 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=601960.0, ans=0.125 2023-09-30 04:52:20,896 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.75 vs. limit=10.0 2023-09-30 04:52:21,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:52:21,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-30 04:52:21,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-30 04:52:21,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:52:22,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:52:22,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:52:22,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:52:22,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:52:22,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:52:22,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:52:23,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-30 04:52:23,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:52:23,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-30 04:52:23,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-30 04:52:23,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-30 04:52:23,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-30 04:52:24,049 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-30 04:52:24,178 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-30 04:52:24,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:52:24,864 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:52:24,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:52:25,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:52:25,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:52:26,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:52:26,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:52:26,201 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:52:26,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:52:26,380 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:52:26,387 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:52:26,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:52:26,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:52:27,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-30 04:52:27,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:52:27,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:52:27,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-30 04:52:27,970 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-30 04:52:28,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-30 04:52:28,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:52:28,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-30 04:52:28,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-30 04:52:28,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-30 04:52:29,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:52:29,892 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:52:30,047 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-30 04:52:30,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-30 04:52:30,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-30 04:52:30,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:52:30,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-30 04:52:30,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-30 04:52:30,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-30 04:52:30,904 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-30 04:52:39,885 INFO [train.py:1039] (2/4) Epoch 18, batch 0, loss[loss=0.1627, simple_loss=0.2444, pruned_loss=0.04049, over 24305.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2444, pruned_loss=0.04049, over 24305.00 frames. ], batch size: 61, lr: 5.81e-03, grad_scale: 32.0 2023-09-30 04:52:39,885 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-30 04:52:53,311 INFO [train.py:1071] (2/4) Epoch 18, validation: loss=0.3168, simple_loss=0.2865, pruned_loss=0.1735, over 1125622.00 frames. 2023-09-30 04:52:53,312 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21076MB 2023-09-30 04:52:57,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-30 04:52:58,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:52:58,939 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=602040.0, ans=0.1 2023-09-30 04:53:00,163 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:53:00,421 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=602040.0, ans=0.0 2023-09-30 04:53:01,503 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.655e+02 1.872e+02 2.065e+02 2.362e+02 3.138e+02, threshold=4.130e+02, percent-clipped=0.0 2023-09-30 04:53:03,334 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=602040.0, ans=0.015 2023-09-30 04:53:06,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:53:06,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:53:06,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:53:08,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-30 04:53:09,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-30 04:53:11,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:53:13,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:53:16,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:53:16,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:53:16,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:53:16,308 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:53:19,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-30 04:53:19,511 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:53:27,699 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=602173.3333333334, ans=0.0 2023-09-30 04:53:29,625 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:53:29,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:53:31,212 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-30 04:53:34,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-30 04:53:34,334 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:53:36,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:53:37,928 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=602173.3333333334, ans=0.2 2023-09-30 04:53:41,075 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:53:44,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:53:51,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-30 04:53:54,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-30 04:53:54,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:53:54,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:53:57,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:53:59,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:54:01,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-30 04:54:03,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:54:05,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:54:07,605 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=602306.6666666666, ans=0.0 2023-09-30 04:54:10,407 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:54:13,570 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-30 04:54:13,825 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=602373.3333333334, ans=0.0 2023-09-30 04:54:15,520 INFO [train.py:1039] (2/4) Epoch 18, batch 50, loss[loss=0.175, simple_loss=0.2631, pruned_loss=0.04349, over 24567.00 frames. ], tot_loss[loss=0.1797, simple_loss=0.2559, pruned_loss=0.05171, over 1064608.61 frames. ], batch size: 71, lr: 5.81e-03, grad_scale: 32.0 2023-09-30 04:54:17,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:54:20,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:54:20,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:54:21,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-30 04:54:21,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 04:54:21,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:54:23,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:54:24,967 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:54:26,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:54:29,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-30 04:54:29,561 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:54:37,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-30 04:54:37,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-30 04:54:41,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-30 04:54:43,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:54:44,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:54:44,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:54:46,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:54:48,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-30 04:54:48,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 04:54:48,265 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:54:57,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:54:58,882 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:54:58,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:55:00,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-30 04:55:03,417 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:55:03,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:55:03,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-30 04:55:03,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:55:06,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-30 04:55:10,306 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=602573.3333333334, ans=0.125 2023-09-30 04:55:13,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:55:15,050 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:55:15,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:55:16,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:55:18,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-30 04:55:21,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-30 04:55:21,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-30 04:55:21,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:55:21,521 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-30 04:55:24,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:55:24,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:55:26,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-30 04:55:28,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-30 04:55:28,232 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-30 04:55:29,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:55:31,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:55:31,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-30 04:55:31,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-30 04:55:32,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:55:34,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:55:34,588 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=602706.6666666666, ans=0.2 2023-09-30 04:55:35,735 INFO [train.py:1039] (2/4) Epoch 18, batch 100, loss[loss=0.1822, simple_loss=0.2677, pruned_loss=0.04838, over 24428.00 frames. ], tot_loss[loss=0.1825, simple_loss=0.2585, pruned_loss=0.0532, over 1864933.11 frames. ], batch size: 69, lr: 5.81e-03, grad_scale: 16.0 2023-09-30 04:55:35,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-30 04:55:35,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:55:36,190 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-30 04:55:38,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:55:42,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:55:44,968 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.863e+02 2.072e+02 2.465e+02 3.411e+02, threshold=4.144e+02, percent-clipped=0.0 2023-09-30 04:55:45,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:55:47,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-30 04:55:47,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:55:51,390 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:55:51,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:55:51,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:55:51,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:55:51,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:55:52,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-30 04:55:54,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-30 04:55:55,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:55:55,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:55:55,911 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:56:00,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-30 04:56:01,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:56:01,277 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=602773.3333333334, ans=0.125 2023-09-30 04:56:02,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:56:04,011 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-30 04:56:05,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 04:56:10,194 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-30 04:56:10,220 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-30 04:56:11,781 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:56:11,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:56:12,297 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=602840.0, ans=0.0 2023-09-30 04:56:15,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-30 04:56:16,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:56:18,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:56:24,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:56:25,638 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-30 04:56:27,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-30 04:56:30,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-30 04:56:31,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:56:33,613 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=602906.6666666666, ans=0.0 2023-09-30 04:56:35,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:56:38,119 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.80 vs. limit=15.0 2023-09-30 04:56:38,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:56:40,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:56:43,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:56:46,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:56:47,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:56:49,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:56:49,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:56:51,347 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:56:51,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-30 04:56:51,489 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-30 04:56:51,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:56:53,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:56:53,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:56:53,839 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:56:53,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 04:56:55,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 04:56:55,355 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-30 04:56:55,365 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:56:56,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:56:56,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:56:57,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:56:58,450 INFO [train.py:1039] (2/4) Epoch 18, batch 150, loss[loss=0.1904, simple_loss=0.2566, pruned_loss=0.06208, over 23498.00 frames. ], tot_loss[loss=0.181, simple_loss=0.2574, pruned_loss=0.05228, over 2516228.65 frames. ], batch size: 285, lr: 5.81e-03, grad_scale: 16.0 2023-09-30 04:56:58,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:57:02,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:57:05,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:57:05,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:57:05,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:57:08,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:57:08,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:57:10,364 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=603040.0, ans=0.0 2023-09-30 04:57:10,658 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.36 vs. limit=22.5 2023-09-30 04:57:11,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-30 04:57:11,603 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:57:16,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-30 04:57:16,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-30 04:57:16,505 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-30 04:57:19,575 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:57:19,594 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:57:21,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:57:23,184 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:57:23,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:57:24,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:57:24,693 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:57:26,171 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-30 04:57:27,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:57:35,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:57:35,315 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=603173.3333333334, ans=0.04949747468305833 2023-09-30 04:57:37,337 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten.whitening_limit, batch_count=603173.3333333334, ans=15.0 2023-09-30 04:57:39,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:57:39,730 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-30 04:57:40,410 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.02 vs. limit=15.0 2023-09-30 04:57:42,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:57:42,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:57:43,055 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:57:44,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:57:48,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:57:49,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:57:51,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:57:52,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-30 04:57:58,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:57:58,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:57:58,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:57:58,525 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=603240.0, ans=0.0 2023-09-30 04:57:59,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:58:01,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:58:02,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 04:58:06,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:58:07,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:58:09,736 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:58:12,645 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:58:12,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-30 04:58:12,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:58:12,761 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-30 04:58:17,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:58:20,326 INFO [train.py:1039] (2/4) Epoch 18, batch 200, loss[loss=0.2147, simple_loss=0.2723, pruned_loss=0.07857, over 23699.00 frames. ], tot_loss[loss=0.1825, simple_loss=0.2582, pruned_loss=0.05337, over 3010182.58 frames. ], batch size: 232, lr: 5.81e-03, grad_scale: 16.0 2023-09-30 04:58:20,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:58:20,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:58:23,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-30 04:58:23,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:58:23,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:58:26,094 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=603373.3333333334, ans=0.2 2023-09-30 04:58:27,214 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-30 04:58:30,021 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.448e+02 1.857e+02 2.048e+02 2.282e+02 3.617e+02, threshold=4.095e+02, percent-clipped=0.0 2023-09-30 04:58:30,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-30 04:58:31,051 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.20 vs. limit=15.0 2023-09-30 04:58:32,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:58:32,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:58:35,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:58:35,465 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:58:35,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:58:55,253 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=603506.6666666666, ans=0.125 2023-09-30 04:58:56,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:58:57,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:58:58,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:58:59,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:58:59,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 04:58:59,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 04:59:01,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:59:03,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:59:03,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:59:03,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:59:06,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-30 04:59:06,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 04:59:06,912 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:59:11,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:59:21,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:59:24,737 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=603640.0, ans=0.0 2023-09-30 04:59:27,608 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:59:29,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:59:35,373 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:59:38,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-30 04:59:38,953 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:59:38,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:59:38,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:59:39,298 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=603640.0, ans=0.125 2023-09-30 04:59:40,594 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:59:41,950 INFO [train.py:1039] (2/4) Epoch 18, batch 250, loss[loss=0.1782, simple_loss=0.258, pruned_loss=0.04924, over 23147.00 frames. ], tot_loss[loss=0.1834, simple_loss=0.2582, pruned_loss=0.05428, over 3378790.37 frames. ], batch size: 105, lr: 5.81e-03, grad_scale: 16.0 2023-09-30 04:59:42,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-30 04:59:43,146 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.13 vs. limit=15.0 2023-09-30 04:59:44,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:59:44,107 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-30 04:59:45,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:59:47,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:59:47,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:59:47,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:59:52,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:59:52,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:59:53,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:59:56,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:00:05,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:00:09,008 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:00:09,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:00:16,081 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.30 vs. limit=10.0 2023-09-30 05:00:18,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-30 05:00:18,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-30 05:00:20,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:00:20,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:00:22,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 05:00:22,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:00:22,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:00:25,337 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=603840.0, ans=0.0 2023-09-30 05:00:26,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:00:27,281 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.79 vs. limit=22.5 2023-09-30 05:00:29,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-30 05:00:29,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:00:31,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:00:32,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:00:32,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:00:32,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:00:32,971 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=603906.6666666666, ans=0.1 2023-09-30 05:00:35,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 05:00:35,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:00:37,269 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:00:38,837 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:00:38,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:00:42,132 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:00:47,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:00:50,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:00:55,662 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=603973.3333333334, ans=0.1 2023-09-30 05:00:56,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:00:58,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:01:03,427 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-30 05:01:04,802 INFO [train.py:1039] (2/4) Epoch 18, batch 300, loss[loss=0.1771, simple_loss=0.2575, pruned_loss=0.04837, over 24284.00 frames. ], tot_loss[loss=0.1815, simple_loss=0.256, pruned_loss=0.05348, over 3655468.19 frames. ], batch size: 61, lr: 5.80e-03, grad_scale: 16.0 2023-09-30 05:01:04,914 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:01:04,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:01:07,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-30 05:01:07,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-30 05:01:09,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:01:09,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-30 05:01:11,342 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=604040.0, ans=0.125 2023-09-30 05:01:14,091 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.412e+02 1.790e+02 1.961e+02 2.231e+02 3.675e+02, threshold=3.922e+02, percent-clipped=0.0 2023-09-30 05:01:14,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:01:14,538 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=604040.0, ans=0.0 2023-09-30 05:01:15,746 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:01:19,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:01:20,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-30 05:01:20,661 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:01:23,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 05:01:23,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-30 05:01:23,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:01:27,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-30 05:01:34,434 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 05:01:35,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-30 05:01:38,049 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-30 05:01:38,126 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:01:39,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:01:42,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:01:42,818 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-30 05:01:42,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:01:45,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:01:46,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:01:47,421 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:01:50,739 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-30 05:01:50,745 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-30 05:01:52,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:01:55,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:01:55,462 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=604240.0, ans=0.2 2023-09-30 05:01:58,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-30 05:01:59,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:02:03,355 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:02:06,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:02:06,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-30 05:02:10,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:02:11,612 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 05:02:13,251 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:02:14,711 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:02:16,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-30 05:02:16,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 05:02:16,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:02:16,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-30 05:02:19,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:02:20,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:02:22,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:02:22,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:02:23,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:02:25,358 INFO [train.py:1039] (2/4) Epoch 18, batch 350, loss[loss=0.1772, simple_loss=0.2639, pruned_loss=0.04522, over 24311.00 frames. ], tot_loss[loss=0.1804, simple_loss=0.2552, pruned_loss=0.05279, over 3885010.74 frames. ], batch size: 74, lr: 5.80e-03, grad_scale: 16.0 2023-09-30 05:02:28,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:02:28,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 05:02:30,247 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:02:35,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:02:39,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:02:40,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:02:43,033 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-30 05:02:44,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:02:46,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-30 05:02:47,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:02:47,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-30 05:02:49,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:02:51,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-30 05:02:52,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:02:52,879 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=604440.0, ans=0.0 2023-09-30 05:02:54,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:02:55,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:02:57,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:02:57,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:02:57,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:02:57,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:02:57,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-30 05:03:00,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:03:00,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:03:09,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:03:09,074 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:03:10,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:03:10,733 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:03:16,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-30 05:03:16,709 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:03:22,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:03:22,622 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:03:22,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:03:24,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-30 05:03:27,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:03:27,215 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-30 05:03:30,197 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-30 05:03:30,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:03:33,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:03:33,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-30 05:03:36,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:03:39,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:03:41,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:03:43,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:03:43,430 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:03:46,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:03:47,930 INFO [train.py:1039] (2/4) Epoch 18, batch 400, loss[loss=0.1892, simple_loss=0.2596, pruned_loss=0.05933, over 23340.00 frames. ], tot_loss[loss=0.1797, simple_loss=0.2546, pruned_loss=0.05238, over 4073143.74 frames. ], batch size: 106, lr: 5.80e-03, grad_scale: 32.0 2023-09-30 05:03:49,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:03:53,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:03:54,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-30 05:03:54,058 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:03:55,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:03:55,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:03:57,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:03:58,493 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.488e+02 1.749e+02 1.897e+02 2.088e+02 3.470e+02, threshold=3.794e+02, percent-clipped=0.0 2023-09-30 05:04:00,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:04:01,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:04:03,214 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-30 05:04:04,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-30 05:04:04,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:04:06,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-30 05:04:06,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:04:11,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:04:11,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:04:12,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-30 05:04:12,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:04:12,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:04:12,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:04:15,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:04:17,327 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-30 05:04:18,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-30 05:04:23,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:04:24,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:04:25,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-30 05:04:27,730 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-30 05:04:30,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:04:32,328 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:04:38,697 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-30 05:04:40,382 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-30 05:04:41,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-30 05:04:42,084 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=604906.6666666666, ans=0.1 2023-09-30 05:04:43,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:04:46,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:04:46,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-30 05:04:50,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:04:53,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 05:04:55,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:04:56,149 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.17 vs. limit=15.0 2023-09-30 05:04:58,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:04:59,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-30 05:05:02,718 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-30 05:05:02,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-30 05:05:05,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 05:05:05,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:05:07,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-30 05:05:10,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:05:10,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:05:10,554 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-30 05:05:11,906 INFO [train.py:1039] (2/4) Epoch 18, batch 450, loss[loss=0.1695, simple_loss=0.2527, pruned_loss=0.0432, over 24609.00 frames. ], tot_loss[loss=0.1806, simple_loss=0.2556, pruned_loss=0.05283, over 4217600.18 frames. ], batch size: 68, lr: 5.80e-03, grad_scale: 32.0 2023-09-30 05:05:12,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-30 05:05:13,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:05:13,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:05:13,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:05:13,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-30 05:05:15,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:05:16,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 05:05:18,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 05:05:20,459 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.18 vs. limit=15.0 2023-09-30 05:05:28,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:05:29,783 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:05:32,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-30 05:05:33,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-30 05:05:35,856 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=605106.6666666666, ans=0.1 2023-09-30 05:05:37,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-30 05:05:37,162 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=605106.6666666666, ans=0.125 2023-09-30 05:05:40,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:05:40,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:05:43,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:05:44,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:05:46,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-30 05:05:48,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-30 05:05:48,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-30 05:05:49,720 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:05:49,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:05:49,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:05:51,484 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-30 05:05:51,499 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-30 05:05:52,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:05:54,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:05:54,760 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=605173.3333333334, ans=0.125 2023-09-30 05:05:55,105 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.07 vs. limit=22.5 2023-09-30 05:05:56,771 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-30 05:05:57,163 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=605173.3333333334, ans=0.125 2023-09-30 05:06:01,959 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-30 05:06:02,014 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:06:02,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-30 05:06:03,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-30 05:06:05,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:06:08,862 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-30 05:06:08,942 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 05:06:11,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-30 05:06:16,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:06:16,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-30 05:06:18,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-30 05:06:18,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:06:24,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:06:26,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:06:26,958 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=605306.6666666666, ans=0.0 2023-09-30 05:06:28,188 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:06:29,584 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-30 05:06:29,867 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=605306.6666666666, ans=0.125 2023-09-30 05:06:33,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:06:34,640 INFO [train.py:1039] (2/4) Epoch 18, batch 500, loss[loss=0.1812, simple_loss=0.2503, pruned_loss=0.05602, over 23372.00 frames. ], tot_loss[loss=0.181, simple_loss=0.256, pruned_loss=0.05295, over 4336581.94 frames. ], batch size: 285, lr: 5.80e-03, grad_scale: 16.0 2023-09-30 05:06:34,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:06:36,358 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:06:36,373 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-30 05:06:38,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-30 05:06:38,662 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:06:44,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 05:06:47,413 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.482e+02 1.876e+02 2.083e+02 2.292e+02 4.355e+02, threshold=4.166e+02, percent-clipped=1.0 2023-09-30 05:06:48,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 05:06:50,496 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:06:53,646 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:06:53,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:06:55,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:06:55,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=605440.0, ans=0.0 2023-09-30 05:07:03,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:07:03,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-30 05:07:03,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-30 05:07:03,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:07:03,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-30 05:07:04,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 05:07:07,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:07:07,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:07:09,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:07:09,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:07:11,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-30 05:07:14,966 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-30 05:07:18,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:07:20,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:07:21,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:07:21,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:07:21,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-30 05:07:23,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=605573.3333333334, ans=0.125 2023-09-30 05:07:24,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-30 05:07:27,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:07:28,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:07:32,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:07:33,189 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=605573.3333333334, ans=0.125 2023-09-30 05:07:35,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:07:37,729 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=605573.3333333334, ans=0.1 2023-09-30 05:07:42,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:07:42,635 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=605640.0, ans=0.125 2023-09-30 05:07:44,307 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=605640.0, ans=0.0 2023-09-30 05:07:46,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-30 05:07:46,948 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:07:46,966 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:07:50,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-30 05:07:52,774 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-30 05:07:55,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:07:55,144 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=605640.0, ans=0.125 2023-09-30 05:07:58,200 INFO [train.py:1039] (2/4) Epoch 18, batch 550, loss[loss=0.2078, simple_loss=0.2756, pruned_loss=0.06997, over 22761.00 frames. ], tot_loss[loss=0.1826, simple_loss=0.2575, pruned_loss=0.05388, over 4418912.74 frames. ], batch size: 322, lr: 5.80e-03, grad_scale: 16.0 2023-09-30 05:08:01,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-30 05:08:02,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-30 05:08:02,822 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:08:02,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-30 05:08:02,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:08:02,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:08:04,502 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:08:04,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:08:04,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:08:06,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:08:09,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:08:09,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-30 05:08:09,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:08:13,865 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:08:13,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:08:17,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:08:18,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:08:21,052 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.61 vs. limit=15.0 2023-09-30 05:08:27,425 WARNING [train.py:1197] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-30 05:08:28,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-30 05:08:30,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:08:32,811 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=605840.0, ans=0.125 2023-09-30 05:08:36,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:08:37,008 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:08:38,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-30 05:08:41,583 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:08:41,592 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-30 05:08:41,722 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:08:42,100 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=605840.0, ans=0.0 2023-09-30 05:08:43,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 05:08:46,362 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:08:46,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:08:46,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-30 05:08:49,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:08:50,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-30 05:08:52,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-30 05:08:52,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:08:52,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:08:54,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:08:54,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:08:57,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:08:59,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:08:59,677 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=605906.6666666666, ans=0.0 2023-09-30 05:09:02,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:09:04,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:09:06,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 05:09:06,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 05:09:08,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:09:08,778 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-30 05:09:10,286 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:09:11,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-30 05:09:13,110 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-30 05:09:16,922 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=605973.3333333334, ans=0.125 2023-09-30 05:09:18,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-30 05:09:19,457 INFO [train.py:1039] (2/4) Epoch 18, batch 600, loss[loss=0.1573, simple_loss=0.2386, pruned_loss=0.03801, over 24593.00 frames. ], tot_loss[loss=0.1826, simple_loss=0.2582, pruned_loss=0.05355, over 4482345.81 frames. ], batch size: 60, lr: 5.79e-03, grad_scale: 16.0 2023-09-30 05:09:22,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-30 05:09:24,136 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:09:24,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 05:09:24,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:09:30,828 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.862e+02 2.137e+02 2.512e+02 3.782e+02, threshold=4.274e+02, percent-clipped=0.0 2023-09-30 05:09:31,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:09:34,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 05:09:34,737 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-30 05:09:36,304 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:09:39,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:09:41,933 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:09:43,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-30 05:09:43,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:09:49,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-30 05:09:54,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:09:54,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:09:54,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:09:56,730 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.64 vs. limit=22.5 2023-09-30 05:10:02,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:10:02,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:10:02,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:10:03,494 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.89 vs. limit=15.0 2023-09-30 05:10:11,008 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:10:12,876 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:10:12,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:10:12,895 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:10:21,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-30 05:10:25,106 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.44 vs. limit=10.0 2023-09-30 05:10:25,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-30 05:10:25,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:10:30,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-30 05:10:32,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:10:35,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-30 05:10:35,461 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:10:35,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:10:42,002 INFO [train.py:1039] (2/4) Epoch 18, batch 650, loss[loss=0.1782, simple_loss=0.2609, pruned_loss=0.04774, over 24489.00 frames. ], tot_loss[loss=0.1813, simple_loss=0.2569, pruned_loss=0.05283, over 4538621.41 frames. ], batch size: 66, lr: 5.79e-03, grad_scale: 16.0 2023-09-30 05:10:43,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 05:10:44,412 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.55 vs. limit=15.0 2023-09-30 05:10:45,108 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-30 05:10:47,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:10:48,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:10:51,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:10:54,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-30 05:10:55,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:10:55,930 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=606373.3333333334, ans=0.0 2023-09-30 05:11:00,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:11:00,384 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:11:02,820 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.78 vs. limit=15.0 2023-09-30 05:11:03,520 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:11:08,719 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=606440.0, ans=0.0 2023-09-30 05:11:10,417 WARNING [train.py:1197] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-30 05:11:12,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:11:12,096 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:11:13,969 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=606506.6666666666, ans=0.125 2023-09-30 05:11:14,565 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.96 vs. limit=15.0 2023-09-30 05:11:15,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:11:15,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 05:11:18,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:11:18,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:11:18,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 05:11:20,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:11:22,889 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:11:24,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 05:11:25,659 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-30 05:11:25,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:11:25,716 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:11:28,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:11:28,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:11:29,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:11:30,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:11:30,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-30 05:11:32,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:11:32,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:11:33,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-30 05:11:33,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:11:36,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 05:11:38,281 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-30 05:11:40,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-30 05:11:40,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:11:40,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:11:40,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:11:41,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:11:43,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:11:43,854 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=606573.3333333334, ans=0.1 2023-09-30 05:11:49,086 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:11:50,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:11:50,624 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:11:54,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:11:54,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 05:11:54,944 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:12:04,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 05:12:04,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:12:04,142 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:12:04,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:12:05,541 INFO [train.py:1039] (2/4) Epoch 18, batch 700, loss[loss=0.186, simple_loss=0.2618, pruned_loss=0.05516, over 23453.00 frames. ], tot_loss[loss=0.18, simple_loss=0.2548, pruned_loss=0.05259, over 4559506.25 frames. ], batch size: 93, lr: 5.79e-03, grad_scale: 16.0 2023-09-30 05:12:06,688 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.90 vs. limit=22.5 2023-09-30 05:12:10,233 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-30 05:12:10,534 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=606706.6666666666, ans=0.125 2023-09-30 05:12:11,090 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.53 vs. limit=15.0 2023-09-30 05:12:11,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-30 05:12:14,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-30 05:12:16,480 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.793e+02 1.992e+02 2.237e+02 3.434e+02, threshold=3.985e+02, percent-clipped=0.0 2023-09-30 05:12:16,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:12:16,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:12:20,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-30 05:12:25,243 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:12:26,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:12:30,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:12:30,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-30 05:12:31,064 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=606773.3333333334, ans=0.125 2023-09-30 05:12:32,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:12:35,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:12:37,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 05:12:37,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:12:37,462 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=606840.0, ans=0.1 2023-09-30 05:12:38,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-30 05:12:41,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-30 05:12:44,935 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=606840.0, ans=0.0 2023-09-30 05:12:46,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-30 05:12:46,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:12:47,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:12:50,247 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=606840.0, ans=0.125 2023-09-30 05:12:53,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:12:53,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-30 05:12:59,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:12:59,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 05:13:01,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-30 05:13:02,037 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.44 vs. limit=22.5 2023-09-30 05:13:03,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:13:04,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:13:08,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:13:14,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:13:14,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-30 05:13:15,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-30 05:13:16,013 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-30 05:13:17,638 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=606973.3333333334, ans=0.0 2023-09-30 05:13:19,236 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=606973.3333333334, ans=0.125 2023-09-30 05:13:20,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:13:22,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:13:22,835 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=606973.3333333334, ans=0.125 2023-09-30 05:13:23,894 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:13:24,558 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.04 vs. limit=6.0 2023-09-30 05:13:26,157 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:13:26,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-30 05:13:27,523 INFO [train.py:1039] (2/4) Epoch 18, batch 750, loss[loss=0.1728, simple_loss=0.2549, pruned_loss=0.04542, over 24458.00 frames. ], tot_loss[loss=0.1791, simple_loss=0.254, pruned_loss=0.05206, over 4597376.72 frames. ], batch size: 66, lr: 5.79e-03, grad_scale: 16.0 2023-09-30 05:13:30,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-30 05:13:30,825 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-30 05:13:32,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-30 05:13:32,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-30 05:13:33,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-30 05:13:34,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:13:35,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-30 05:13:36,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:13:36,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-30 05:13:39,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:13:42,024 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:13:42,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-30 05:13:42,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:13:46,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:13:48,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 05:13:49,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:13:52,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:13:52,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:13:54,204 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-30 05:13:54,555 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=607106.6666666666, ans=0.125 2023-09-30 05:13:55,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-30 05:13:55,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:13:57,916 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:14:01,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-30 05:14:01,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-30 05:14:02,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:14:02,866 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=607173.3333333334, ans=0.125 2023-09-30 05:14:04,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-30 05:14:04,714 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-30 05:14:04,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-30 05:14:04,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:14:06,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 05:14:07,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:14:13,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-30 05:14:13,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:14:15,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 05:14:17,076 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=607240.0, ans=0.0 2023-09-30 05:14:18,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:14:19,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:14:19,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-30 05:14:19,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:14:20,624 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.34 vs. limit=22.5 2023-09-30 05:14:21,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-30 05:14:22,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:14:25,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:14:27,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-30 05:14:27,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:14:34,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:14:34,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:14:35,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:14:38,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 05:14:38,446 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=607306.6666666666, ans=0.0 2023-09-30 05:14:42,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-30 05:14:42,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:14:42,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:14:46,328 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:14:46,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:14:49,887 INFO [train.py:1039] (2/4) Epoch 18, batch 800, loss[loss=0.1866, simple_loss=0.2691, pruned_loss=0.0521, over 23706.00 frames. ], tot_loss[loss=0.1805, simple_loss=0.2559, pruned_loss=0.05259, over 4622783.07 frames. ], batch size: 85, lr: 5.79e-03, grad_scale: 32.0 2023-09-30 05:14:50,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:14:50,087 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-30 05:14:57,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:14:57,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:14:59,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:14:59,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:14:59,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:14:59,607 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:15:00,864 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.976e+02 2.273e+02 2.818e+02 4.949e+02, threshold=4.546e+02, percent-clipped=4.0 2023-09-30 05:15:01,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:15:06,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:15:06,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 05:15:10,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-30 05:15:11,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:15:13,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:15:13,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:15:13,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:15:14,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-30 05:15:14,711 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:15:14,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-30 05:15:19,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:15:22,643 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:15:24,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:15:24,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:15:27,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:15:28,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:15:30,589 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=607506.6666666666, ans=0.04949747468305833 2023-09-30 05:15:31,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:15:31,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:15:33,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-30 05:15:34,987 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-30 05:15:35,043 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-30 05:15:35,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 05:15:35,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:15:37,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:15:38,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:15:43,182 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-30 05:15:43,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-30 05:15:46,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-30 05:15:48,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 05:15:52,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:15:56,434 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:15:56,796 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=607640.0, ans=0.125 2023-09-30 05:15:58,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-30 05:15:58,649 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:15:59,034 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=607640.0, ans=0.125 2023-09-30 05:16:03,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-30 05:16:04,964 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=607640.0, ans=0.125 2023-09-30 05:16:10,360 INFO [train.py:1039] (2/4) Epoch 18, batch 850, loss[loss=0.1813, simple_loss=0.2633, pruned_loss=0.04967, over 23995.00 frames. ], tot_loss[loss=0.1809, simple_loss=0.2558, pruned_loss=0.053, over 4652248.16 frames. ], batch size: 80, lr: 5.79e-03, grad_scale: 16.0 2023-09-30 05:16:10,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 05:16:13,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:16:13,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-30 05:16:13,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:16:15,643 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:16:17,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-30 05:16:18,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:16:18,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:16:20,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:16:21,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 05:16:22,556 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:16:24,054 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-30 05:16:24,104 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-30 05:16:24,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-30 05:16:24,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=607706.6666666666, ans=0.125 2023-09-30 05:16:25,087 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=15.19 vs. limit=22.5 2023-09-30 05:16:27,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 05:16:27,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:16:28,266 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=607773.3333333334, ans=0.2 2023-09-30 05:16:29,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:16:29,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:16:30,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 05:16:34,475 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=607773.3333333334, ans=0.05 2023-09-30 05:16:34,910 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.62 vs. limit=12.0 2023-09-30 05:16:36,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:16:36,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:16:36,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-30 05:16:40,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-30 05:16:42,666 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.33 vs. limit=6.0 2023-09-30 05:16:43,638 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:16:45,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-30 05:16:50,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-30 05:16:50,486 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-30 05:16:51,172 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.93 vs. limit=6.0 2023-09-30 05:16:54,208 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-30 05:16:54,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:16:54,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:16:54,249 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 05:16:55,921 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:16:57,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:16:58,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-30 05:17:00,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:17:01,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:17:03,987 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:17:04,031 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-30 05:17:05,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:17:07,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-30 05:17:08,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-30 05:17:13,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:17:13,653 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:17:15,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:17:15,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:17:15,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:17:18,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:17:21,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:17:21,832 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=607973.3333333334, ans=0.125 2023-09-30 05:17:23,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-30 05:17:23,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:17:23,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-30 05:17:33,407 INFO [train.py:1039] (2/4) Epoch 18, batch 900, loss[loss=0.2006, simple_loss=0.263, pruned_loss=0.06904, over 23547.00 frames. ], tot_loss[loss=0.1818, simple_loss=0.2569, pruned_loss=0.05336, over 4682832.36 frames. ], batch size: 256, lr: 5.78e-03, grad_scale: 16.0 2023-09-30 05:17:33,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-30 05:17:34,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:17:36,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-30 05:17:36,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:17:36,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:17:38,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-30 05:17:43,671 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer_na.min_abs, batch_count=608040.0, ans=0.02 2023-09-30 05:17:46,197 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.416e+02 1.851e+02 2.083e+02 2.531e+02 4.017e+02, threshold=4.166e+02, percent-clipped=0.0 2023-09-30 05:17:46,299 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:17:47,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:17:49,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-30 05:17:52,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:17:52,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-30 05:17:53,102 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-30 05:17:54,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:17:54,724 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:17:56,151 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 05:17:56,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:18:06,583 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=608173.3333333334, ans=0.1 2023-09-30 05:18:08,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:18:08,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:18:08,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 05:18:11,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:18:16,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-30 05:18:16,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:18:18,159 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=608173.3333333334, ans=0.0 2023-09-30 05:18:19,910 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=608173.3333333334, ans=0.125 2023-09-30 05:18:21,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-30 05:18:22,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-30 05:18:24,609 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-30 05:18:26,117 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-30 05:18:30,935 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-30 05:18:30,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:18:33,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 05:18:39,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:18:39,848 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:18:41,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-30 05:18:41,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:18:43,014 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-30 05:18:46,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:18:46,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:18:48,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:18:48,492 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=608306.6666666666, ans=0.0 2023-09-30 05:18:49,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:18:54,336 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-30 05:18:54,398 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-30 05:18:56,313 INFO [train.py:1039] (2/4) Epoch 18, batch 950, loss[loss=0.195, simple_loss=0.2584, pruned_loss=0.06574, over 23775.00 frames. ], tot_loss[loss=0.1831, simple_loss=0.258, pruned_loss=0.05409, over 4683768.15 frames. ], batch size: 164, lr: 5.78e-03, grad_scale: 16.0 2023-09-30 05:18:57,333 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.81 vs. limit=6.0 2023-09-30 05:18:58,063 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-30 05:18:58,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-30 05:18:59,662 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:19:02,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-30 05:19:02,924 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=608373.3333333334, ans=0.1 2023-09-30 05:19:08,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:19:11,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:19:11,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:19:12,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 05:19:15,099 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-30 05:19:20,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:19:20,374 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:19:21,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:19:21,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:19:21,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-30 05:19:23,424 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-30 05:19:25,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:19:26,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-30 05:19:26,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:19:33,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:19:33,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:19:33,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:19:34,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-30 05:19:36,485 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 05:19:38,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:19:38,329 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_na.min_abs, batch_count=608506.6666666666, ans=0.02 2023-09-30 05:19:39,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:19:44,892 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:19:44,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:19:48,440 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-30 05:19:51,439 WARNING [train.py:1197] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 05:19:51,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 05:19:51,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:19:51,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:19:51,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:19:55,542 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=608573.3333333334, ans=0.125 2023-09-30 05:19:56,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-30 05:19:58,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:20:01,408 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:20:01,490 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:20:01,531 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-30 05:20:01,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:20:01,570 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:20:03,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-30 05:20:08,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:20:08,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:20:15,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:20:15,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-30 05:20:15,430 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-30 05:20:19,793 INFO [train.py:1039] (2/4) Epoch 18, batch 1000, loss[loss=0.1781, simple_loss=0.2518, pruned_loss=0.05215, over 23236.00 frames. ], tot_loss[loss=0.1824, simple_loss=0.2568, pruned_loss=0.05394, over 4691255.75 frames. ], batch size: 105, lr: 5.78e-03, grad_scale: 16.0 2023-09-30 05:20:19,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:20:22,103 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=608706.6666666666, ans=0.2 2023-09-30 05:20:24,454 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=608706.6666666666, ans=0.125 2023-09-30 05:20:25,544 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-30 05:20:25,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:20:30,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:20:33,033 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 2.036e+02 2.242e+02 3.072e+02 5.531e+02, threshold=4.484e+02, percent-clipped=11.0 2023-09-30 05:20:33,175 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-30 05:20:33,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-30 05:20:36,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:20:36,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:20:38,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:20:43,029 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-30 05:20:46,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-30 05:20:47,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-30 05:20:48,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:20:50,010 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-30 05:20:51,073 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.66 vs. limit=5.0 2023-09-30 05:20:51,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-30 05:20:51,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-30 05:20:53,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:20:55,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:21:03,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:21:03,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:21:05,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:21:06,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:21:06,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-30 05:21:06,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:21:06,900 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:21:08,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:21:08,426 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-30 05:21:12,600 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten.whitening_limit, batch_count=608906.6666666666, ans=15.0 2023-09-30 05:21:13,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-30 05:21:13,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-30 05:21:15,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-30 05:21:17,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:21:25,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:21:26,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:21:26,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:21:28,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:21:29,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-30 05:21:31,693 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:21:33,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-30 05:21:33,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-30 05:21:36,823 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:21:36,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:21:38,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:21:38,951 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=608973.3333333334, ans=0.05 2023-09-30 05:21:41,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 05:21:42,894 INFO [train.py:1039] (2/4) Epoch 18, batch 1050, loss[loss=0.1864, simple_loss=0.2655, pruned_loss=0.05369, over 23375.00 frames. ], tot_loss[loss=0.1811, simple_loss=0.2556, pruned_loss=0.05329, over 4698291.63 frames. ], batch size: 93, lr: 5.78e-03, grad_scale: 16.0 2023-09-30 05:21:43,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:21:46,517 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=609040.0, ans=0.125 2023-09-30 05:21:47,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:21:47,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:21:49,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 05:21:51,278 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:21:53,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:21:56,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 05:21:58,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:21:58,731 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.05 vs. limit=15.0 2023-09-30 05:22:01,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:22:01,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:22:01,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:22:02,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:22:02,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-30 05:22:04,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:22:05,345 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=609106.6666666666, ans=0.125 2023-09-30 05:22:06,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-30 05:22:09,487 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:22:09,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-30 05:22:09,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-30 05:22:11,965 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=609106.6666666666, ans=0.125 2023-09-30 05:22:17,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:22:17,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-30 05:22:17,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:22:20,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-30 05:22:21,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-30 05:22:21,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:22:26,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-30 05:22:29,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-30 05:22:31,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:22:34,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 05:22:36,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-30 05:22:36,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:22:37,778 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:22:40,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:22:42,498 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=19.43 vs. limit=22.5 2023-09-30 05:22:45,371 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-30 05:22:46,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-30 05:22:47,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-30 05:22:47,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:22:48,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:22:50,182 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-30 05:22:53,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:22:53,696 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=609306.6666666666, ans=0.125 2023-09-30 05:22:54,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:22:54,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:22:56,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:22:56,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:23:01,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:23:01,836 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-30 05:23:02,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:23:02,055 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-30 05:23:02,192 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=609306.6666666666, ans=0.1 2023-09-30 05:23:04,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-30 05:23:05,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:23:06,946 INFO [train.py:1039] (2/4) Epoch 18, batch 1100, loss[loss=0.1979, simple_loss=0.2438, pruned_loss=0.07602, over 19237.00 frames. ], tot_loss[loss=0.1805, simple_loss=0.255, pruned_loss=0.05297, over 4701143.05 frames. ], batch size: 388, lr: 5.78e-03, grad_scale: 8.0 2023-09-30 05:23:08,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:23:14,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:23:20,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 05:23:22,011 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.922e+02 2.115e+02 2.641e+02 4.048e+02, threshold=4.230e+02, percent-clipped=0.0 2023-09-30 05:23:22,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:23:22,194 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:23:22,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-30 05:23:23,980 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=609440.0, ans=0.0 2023-09-30 05:23:24,728 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.56 vs. limit=22.5 2023-09-30 05:23:25,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:23:26,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-30 05:23:28,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:23:31,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:23:31,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-30 05:23:33,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 05:23:35,464 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:23:35,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:23:37,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:23:40,680 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-30 05:23:45,973 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:23:47,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-30 05:23:49,247 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-30 05:23:49,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:23:52,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:23:52,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-30 05:23:54,338 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:23:54,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-30 05:23:56,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:23:56,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:23:56,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:23:57,495 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:23:57,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-30 05:24:05,078 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:24:05,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-30 05:24:06,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:24:12,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 05:24:15,642 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-30 05:24:15,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-30 05:24:17,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:24:20,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:24:22,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:24:22,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-30 05:24:23,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:24:23,853 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:24:25,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-30 05:24:25,437 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:24:26,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-30 05:24:27,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:24:27,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:24:28,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-30 05:24:30,512 INFO [train.py:1039] (2/4) Epoch 18, batch 1150, loss[loss=0.1825, simple_loss=0.2685, pruned_loss=0.04824, over 24667.00 frames. ], tot_loss[loss=0.1795, simple_loss=0.2546, pruned_loss=0.05218, over 4715011.96 frames. ], batch size: 68, lr: 5.78e-03, grad_scale: 8.0 2023-09-30 05:24:33,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:24:35,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:24:38,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:24:38,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:24:38,581 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-30 05:24:38,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:24:39,003 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=609706.6666666666, ans=0.125 2023-09-30 05:24:42,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-30 05:24:43,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:24:43,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 05:24:49,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-30 05:24:52,699 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:24:56,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:24:56,155 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:24:56,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-30 05:24:57,707 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-30 05:24:57,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:25:02,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-30 05:25:02,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:25:04,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:25:15,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:25:23,833 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:25:23,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-30 05:25:23,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:25:24,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:25:30,978 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-30 05:25:33,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:25:40,821 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-30 05:25:45,238 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:25:46,815 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:25:46,857 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-30 05:25:48,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:25:49,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:25:53,405 INFO [train.py:1039] (2/4) Epoch 18, batch 1200, loss[loss=0.1583, simple_loss=0.2448, pruned_loss=0.03595, over 24445.00 frames. ], tot_loss[loss=0.1801, simple_loss=0.2551, pruned_loss=0.05257, over 4721507.38 frames. ], batch size: 63, lr: 5.78e-03, grad_scale: 16.0 2023-09-30 05:25:57,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:25:57,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-30 05:25:58,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:25:58,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:26:00,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:26:00,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:26:03,388 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 05:26:06,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:26:06,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:26:08,275 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.471e+02 1.757e+02 1.911e+02 2.190e+02 3.350e+02, threshold=3.822e+02, percent-clipped=0.0 2023-09-30 05:26:10,001 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-30 05:26:11,719 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-30 05:26:18,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:26:19,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:26:22,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:26:24,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:26:24,466 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-30 05:26:26,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:26:28,371 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=610173.3333333334, ans=0.125 2023-09-30 05:26:30,165 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=610173.3333333334, ans=0.125 2023-09-30 05:26:32,278 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=610173.3333333334, ans=0.0 2023-09-30 05:26:34,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-30 05:26:34,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:26:34,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-30 05:26:36,379 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:26:39,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-30 05:26:39,681 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=610173.3333333334, ans=0.125 2023-09-30 05:26:44,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-30 05:26:44,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:26:46,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:26:47,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:26:47,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-30 05:26:48,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:26:49,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:26:51,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:26:51,444 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-30 05:26:51,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 05:26:51,892 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=610240.0, ans=0.04949747468305833 2023-09-30 05:26:52,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-30 05:26:52,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 05:26:54,592 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:26:55,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:27:00,633 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-30 05:27:02,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:27:06,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-30 05:27:11,378 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-30 05:27:14,242 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:27:15,616 INFO [train.py:1039] (2/4) Epoch 18, batch 1250, loss[loss=0.1715, simple_loss=0.2482, pruned_loss=0.04735, over 24474.00 frames. ], tot_loss[loss=0.1808, simple_loss=0.2555, pruned_loss=0.05303, over 4729465.22 frames. ], batch size: 63, lr: 5.77e-03, grad_scale: 16.0 2023-09-30 05:27:17,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-30 05:27:18,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:27:21,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:27:24,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-30 05:27:27,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:27:29,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:27:29,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-30 05:27:31,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:27:34,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 05:27:37,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 05:27:37,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:27:40,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:27:40,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:27:42,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-30 05:27:47,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 05:27:47,638 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-30 05:27:47,646 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:27:49,171 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:27:49,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:27:50,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:27:54,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-30 05:27:57,870 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=610506.6666666666, ans=0.0 2023-09-30 05:27:59,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-30 05:28:00,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:28:02,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:28:03,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-30 05:28:04,186 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=610573.3333333334, ans=0.0 2023-09-30 05:28:05,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:28:05,537 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-30 05:28:07,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:28:07,018 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:28:10,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:28:13,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:28:13,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:28:15,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-30 05:28:15,308 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-30 05:28:15,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-30 05:28:18,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:28:21,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-30 05:28:21,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:28:22,010 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=610640.0, ans=0.0 2023-09-30 05:28:23,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-30 05:28:25,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:28:26,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-30 05:28:26,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-30 05:28:28,693 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:28:28,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-30 05:28:28,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:28:30,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-30 05:28:34,088 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:28:35,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:28:35,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:28:37,501 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=610706.6666666666, ans=0.0 2023-09-30 05:28:38,665 INFO [train.py:1039] (2/4) Epoch 18, batch 1300, loss[loss=0.1871, simple_loss=0.2548, pruned_loss=0.05974, over 23801.00 frames. ], tot_loss[loss=0.1817, simple_loss=0.2567, pruned_loss=0.05337, over 4724339.55 frames. ], batch size: 164, lr: 5.77e-03, grad_scale: 16.0 2023-09-30 05:28:38,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-30 05:28:41,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:28:41,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-30 05:28:42,220 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=610706.6666666666, ans=0.2 2023-09-30 05:28:45,146 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:28:48,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-30 05:28:48,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:28:52,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:28:52,453 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:28:53,671 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.896e+02 2.139e+02 2.447e+02 3.795e+02, threshold=4.278e+02, percent-clipped=0.0 2023-09-30 05:28:53,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-30 05:28:59,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:28:59,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-30 05:29:00,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-30 05:29:03,067 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=610773.3333333334, ans=0.0 2023-09-30 05:29:05,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 05:29:10,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:29:10,482 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:29:12,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:29:13,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:29:15,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 05:29:16,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-30 05:29:17,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-30 05:29:19,814 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=610840.0, ans=0.125 2023-09-30 05:29:24,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:29:24,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 05:29:26,279 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-30 05:29:26,373 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 05:29:27,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:29:30,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:29:31,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-30 05:29:33,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:29:33,827 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-30 05:29:35,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:29:40,661 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:29:40,665 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:29:43,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-30 05:29:45,333 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-30 05:29:45,476 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-30 05:29:47,807 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.79 vs. limit=22.5 2023-09-30 05:29:49,987 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:29:53,035 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-30 05:29:54,605 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:30:01,234 INFO [train.py:1039] (2/4) Epoch 18, batch 1350, loss[loss=0.1495, simple_loss=0.233, pruned_loss=0.033, over 24500.00 frames. ], tot_loss[loss=0.181, simple_loss=0.2556, pruned_loss=0.05318, over 4721613.90 frames. ], batch size: 58, lr: 5.77e-03, grad_scale: 16.0 2023-09-30 05:30:01,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-30 05:30:07,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:30:08,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:30:14,046 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:30:14,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:30:16,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:30:18,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:30:21,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:30:23,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-30 05:30:24,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-30 05:30:24,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:30:27,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-30 05:30:29,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:30:31,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:30:31,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-30 05:30:32,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-30 05:30:34,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-30 05:30:35,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:30:35,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-30 05:30:44,870 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 05:30:49,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:30:59,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:30:59,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:30:59,132 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-30 05:31:02,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:31:03,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-30 05:31:03,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-30 05:31:03,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:31:06,272 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=611306.6666666666, ans=0.125 2023-09-30 05:31:07,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:31:09,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-30 05:31:11,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:31:16,471 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=611306.6666666666, ans=0.0 2023-09-30 05:31:17,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-30 05:31:21,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-30 05:31:24,492 INFO [train.py:1039] (2/4) Epoch 18, batch 1400, loss[loss=0.1667, simple_loss=0.2373, pruned_loss=0.04803, over 23623.00 frames. ], tot_loss[loss=0.1796, simple_loss=0.254, pruned_loss=0.0526, over 4716402.04 frames. ], batch size: 134, lr: 5.77e-03, grad_scale: 16.0 2023-09-30 05:31:26,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-30 05:31:26,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:31:29,575 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:31:31,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:31:35,730 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-30 05:31:37,322 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-30 05:31:38,681 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.427e+02 1.909e+02 2.025e+02 2.334e+02 3.516e+02, threshold=4.051e+02, percent-clipped=0.0 2023-09-30 05:31:48,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:31:51,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:31:51,781 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=611440.0, ans=0.125 2023-09-30 05:31:54,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:31:55,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-30 05:31:55,502 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=611440.0, ans=0.2 2023-09-30 05:31:59,906 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:32:00,043 WARNING [train.py:1197] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 05:32:07,985 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:32:09,472 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:32:12,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-30 05:32:14,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-30 05:32:14,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:32:16,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:32:16,654 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:32:20,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:32:20,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:32:21,618 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:32:21,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-30 05:32:21,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:32:28,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:32:31,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:32:39,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-30 05:32:40,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 05:32:40,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:32:43,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 05:32:44,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:32:45,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:32:47,011 INFO [train.py:1039] (2/4) Epoch 18, batch 1450, loss[loss=0.1671, simple_loss=0.2405, pruned_loss=0.04682, over 19549.00 frames. ], tot_loss[loss=0.1788, simple_loss=0.2532, pruned_loss=0.05224, over 4716882.91 frames. ], batch size: 42, lr: 5.77e-03, grad_scale: 16.0 2023-09-30 05:32:48,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-30 05:32:52,817 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:32:52,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:32:52,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-30 05:32:58,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:32:58,157 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 05:32:59,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:32:59,827 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-30 05:33:01,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 05:33:03,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-30 05:33:04,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:33:05,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:33:05,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-30 05:33:05,569 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.15 vs. limit=15.0 2023-09-30 05:33:06,439 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:33:07,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:33:08,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 05:33:08,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:33:10,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:33:12,379 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:33:14,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:33:18,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:33:18,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:33:21,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:33:21,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:33:24,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:33:24,703 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:33:24,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:33:24,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:33:30,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-30 05:33:31,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:33:37,098 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-30 05:33:38,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:33:40,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-30 05:33:40,176 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:33:41,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-30 05:33:41,996 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=611906.6666666666, ans=0.07 2023-09-30 05:33:46,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:33:47,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-30 05:33:49,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-30 05:33:50,699 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:33:53,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:33:53,883 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:33:55,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-30 05:33:59,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-30 05:33:59,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-30 05:34:01,594 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:34:05,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 05:34:09,828 INFO [train.py:1039] (2/4) Epoch 18, batch 1500, loss[loss=0.1562, simple_loss=0.2439, pruned_loss=0.03419, over 24466.00 frames. ], tot_loss[loss=0.18, simple_loss=0.2549, pruned_loss=0.0526, over 4729521.92 frames. ], batch size: 63, lr: 5.77e-03, grad_scale: 8.0 2023-09-30 05:34:10,357 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=612040.0, ans=0.09899494936611666 2023-09-30 05:34:15,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-30 05:34:15,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-30 05:34:15,383 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:34:15,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:34:16,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:34:18,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:34:18,557 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-30 05:34:20,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:34:21,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-30 05:34:21,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:34:22,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:34:24,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:34:24,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:34:25,933 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.869e+02 2.047e+02 2.380e+02 3.622e+02, threshold=4.094e+02, percent-clipped=0.0 2023-09-30 05:34:30,038 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=612106.6666666666, ans=0.2 2023-09-30 05:34:31,444 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=612106.6666666666, ans=0.2 2023-09-30 05:34:32,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:34:32,680 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-30 05:34:34,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-30 05:34:34,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:34:35,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:34:41,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-30 05:34:46,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-30 05:34:46,200 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:34:47,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-30 05:34:49,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-30 05:34:51,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:34:53,027 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:34:53,048 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:34:54,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-30 05:34:54,454 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:34:55,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:34:55,934 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-30 05:34:56,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:35:03,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:35:03,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-30 05:35:08,847 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:35:11,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 05:35:14,632 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=612306.6666666666, ans=0.025 2023-09-30 05:35:16,384 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-30 05:35:16,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:35:16,464 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-30 05:35:18,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:35:18,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:35:18,824 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.70 vs. limit=15.0 2023-09-30 05:35:19,700 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-30 05:35:21,690 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-30 05:35:24,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-30 05:35:26,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:35:30,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:35:30,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:35:30,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:35:31,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:35:32,357 INFO [train.py:1039] (2/4) Epoch 18, batch 1550, loss[loss=0.1845, simple_loss=0.2603, pruned_loss=0.05441, over 23433.00 frames. ], tot_loss[loss=0.181, simple_loss=0.2559, pruned_loss=0.05303, over 4718825.75 frames. ], batch size: 93, lr: 5.76e-03, grad_scale: 8.0 2023-09-30 05:35:32,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:35:32,666 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-30 05:35:34,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-30 05:35:34,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:35:35,682 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-30 05:35:35,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-30 05:35:38,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:35:40,236 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:35:41,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:35:41,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:35:43,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:35:45,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:35:46,795 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-30 05:35:46,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:35:46,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:35:49,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 05:35:52,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-30 05:35:52,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-30 05:35:52,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:35:53,622 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-30 05:35:53,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-30 05:35:53,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-30 05:35:55,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:35:57,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:36:00,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:36:03,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-30 05:36:04,923 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-30 05:36:06,869 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=612506.6666666666, ans=0.125 2023-09-30 05:36:07,049 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=612506.6666666666, ans=0.1 2023-09-30 05:36:08,356 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=612506.6666666666, ans=0.125 2023-09-30 05:36:12,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:36:15,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:36:15,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:36:16,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:36:16,841 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1.whitening_limit, batch_count=612506.6666666666, ans=10.0 2023-09-30 05:36:17,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-30 05:36:19,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=612573.3333333334, ans=0.07 2023-09-30 05:36:21,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 05:36:24,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:36:27,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:36:28,013 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=612573.3333333334, ans=0.125 2023-09-30 05:36:30,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:36:30,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:36:30,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-30 05:36:30,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:36:33,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:36:33,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:36:33,957 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=612573.3333333334, ans=0.125 2023-09-30 05:36:35,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-30 05:36:35,145 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-30 05:36:36,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:36:38,544 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=612640.0, ans=0.125 2023-09-30 05:36:44,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-30 05:36:46,423 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=612640.0, ans=0.125 2023-09-30 05:36:47,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:36:49,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:36:51,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-30 05:36:52,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:36:54,070 INFO [train.py:1039] (2/4) Epoch 18, batch 1600, loss[loss=0.169, simple_loss=0.2597, pruned_loss=0.03914, over 24651.00 frames. ], tot_loss[loss=0.1818, simple_loss=0.2568, pruned_loss=0.05344, over 4704624.58 frames. ], batch size: 68, lr: 5.76e-03, grad_scale: 16.0 2023-09-30 05:36:54,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:36:56,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 05:36:56,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:36:57,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:37:01,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:37:02,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-30 05:37:03,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-30 05:37:05,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-30 05:37:08,410 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:37:08,689 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=612706.6666666666, ans=0.2 2023-09-30 05:37:09,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-30 05:37:11,284 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.776e+02 1.986e+02 2.218e+02 2.701e+02, threshold=3.972e+02, percent-clipped=0.0 2023-09-30 05:37:11,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:37:13,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:37:16,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:37:20,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-30 05:37:22,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:37:22,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-30 05:37:22,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:37:24,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-30 05:37:29,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-30 05:37:36,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:37:36,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-30 05:37:38,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:37:38,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:37:38,298 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:37:41,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-30 05:37:46,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 05:37:49,099 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:37:50,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:37:50,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:37:52,080 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:37:53,564 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:37:55,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:37:57,315 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 05:38:04,065 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=612973.3333333334, ans=0.125 2023-09-30 05:38:05,625 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer_na.min_abs, batch_count=612973.3333333334, ans=0.02 2023-09-30 05:38:06,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:38:06,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:38:10,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-30 05:38:10,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:38:11,990 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-30 05:38:12,379 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=612973.3333333334, ans=0.125 2023-09-30 05:38:13,757 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=612973.3333333334, ans=0.0 2023-09-30 05:38:14,423 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=612973.3333333334, ans=0.0 2023-09-30 05:38:16,813 INFO [train.py:1039] (2/4) Epoch 18, batch 1650, loss[loss=0.1767, simple_loss=0.2599, pruned_loss=0.04675, over 24478.00 frames. ], tot_loss[loss=0.1819, simple_loss=0.2572, pruned_loss=0.05332, over 4703620.49 frames. ], batch size: 66, lr: 5.76e-03, grad_scale: 8.0 2023-09-30 05:38:17,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:38:19,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:38:21,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:38:21,436 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-30 05:38:21,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-30 05:38:21,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-30 05:38:21,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-30 05:38:26,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:38:26,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:38:27,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:38:27,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-30 05:38:29,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:38:32,331 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-30 05:38:35,823 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:38:35,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:38:35,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:38:35,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:38:36,146 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=613106.6666666666, ans=0.1 2023-09-30 05:38:37,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-30 05:38:37,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-30 05:38:44,782 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 05:38:46,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-30 05:38:54,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-30 05:38:56,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:38:57,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-30 05:39:02,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:39:03,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:39:06,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:39:06,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:39:06,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:39:06,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:39:07,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:39:09,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:39:09,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:39:09,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:39:11,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:39:13,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:39:13,546 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=613240.0, ans=0.0 2023-09-30 05:39:13,602 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=613240.0, ans=0.125 2023-09-30 05:39:16,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:39:18,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-30 05:39:19,502 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.89 vs. limit=22.5 2023-09-30 05:39:19,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:39:20,777 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.64 vs. limit=6.0 2023-09-30 05:39:21,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-30 05:39:21,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-30 05:39:21,559 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-30 05:39:21,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:39:23,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:39:23,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:39:24,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:39:24,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-30 05:39:28,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:39:33,221 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:39:33,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:39:37,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-30 05:39:41,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:39:41,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:39:41,814 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 05:39:42,813 INFO [train.py:1039] (2/4) Epoch 18, batch 1700, loss[loss=0.1743, simple_loss=0.2494, pruned_loss=0.04954, over 21654.00 frames. ], tot_loss[loss=0.1807, simple_loss=0.2562, pruned_loss=0.05259, over 4710320.55 frames. ], batch size: 47, lr: 5.76e-03, grad_scale: 8.0 2023-09-30 05:39:42,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-30 05:39:42,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:39:42,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:39:42,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:39:43,187 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=613373.3333333334, ans=0.025 2023-09-30 05:39:43,578 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.78 vs. limit=6.0 2023-09-30 05:39:44,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:39:46,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:39:46,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-30 05:39:50,979 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 05:39:54,999 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=613373.3333333334, ans=0.2 2023-09-30 05:40:00,296 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.464e+02 1.875e+02 2.086e+02 2.365e+02 3.763e+02, threshold=4.171e+02, percent-clipped=0.0 2023-09-30 05:40:00,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:40:02,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:40:08,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-30 05:40:10,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:40:10,431 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:40:11,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:40:13,650 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=613506.6666666666, ans=0.125 2023-09-30 05:40:14,829 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-30 05:40:15,078 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:40:16,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:40:16,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-30 05:40:17,433 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.42 vs. limit=22.5 2023-09-30 05:40:18,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-30 05:40:19,666 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.91 vs. limit=15.0 2023-09-30 05:40:20,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-30 05:40:20,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-30 05:40:23,878 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:40:25,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-30 05:40:26,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:40:28,613 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=613506.6666666666, ans=0.07 2023-09-30 05:40:34,395 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=613573.3333333334, ans=0.125 2023-09-30 05:40:37,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:40:38,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:40:40,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:40:41,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-30 05:40:41,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-30 05:40:41,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:40:43,411 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:40:43,412 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-30 05:40:44,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:40:44,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:40:44,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:40:44,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:40:46,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:40:46,687 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:40:48,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:40:49,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:40:49,692 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:40:54,603 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:40:56,181 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-30 05:40:58,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:40:58,544 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:41:01,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-30 05:41:05,584 INFO [train.py:1039] (2/4) Epoch 18, batch 1750, loss[loss=0.1504, simple_loss=0.1958, pruned_loss=0.05255, over 19223.00 frames. ], tot_loss[loss=0.1798, simple_loss=0.2539, pruned_loss=0.05282, over 4691381.60 frames. ], batch size: 388, lr: 5.76e-03, grad_scale: 8.0 2023-09-30 05:41:08,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:41:11,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:41:11,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-30 05:41:11,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-30 05:41:13,323 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:41:16,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:41:16,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:41:21,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-30 05:41:23,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:41:23,499 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=613773.3333333334, ans=0.125 2023-09-30 05:41:24,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-30 05:41:24,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:41:26,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:41:28,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 05:41:31,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-30 05:41:32,564 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:41:34,484 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-30 05:41:41,892 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:41:44,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:41:45,053 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:41:49,579 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:41:49,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:41:49,833 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=613840.0, ans=0.0 2023-09-30 05:41:52,714 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:41:54,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:41:56,555 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:41:58,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:41:59,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-30 05:41:59,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:41:59,982 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=613906.6666666666, ans=0.0 2023-09-30 05:42:01,899 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.12 vs. limit=15.0 2023-09-30 05:42:04,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-30 05:42:04,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:42:05,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:42:07,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:42:11,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 05:42:11,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-30 05:42:13,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:42:15,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:42:18,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:42:21,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:42:23,153 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:42:23,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-30 05:42:23,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:42:24,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-30 05:42:24,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:42:24,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-30 05:42:24,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:42:27,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:42:28,511 INFO [train.py:1039] (2/4) Epoch 18, batch 1800, loss[loss=0.1742, simple_loss=0.2421, pruned_loss=0.05311, over 23393.00 frames. ], tot_loss[loss=0.1789, simple_loss=0.2537, pruned_loss=0.05202, over 4705074.56 frames. ], batch size: 285, lr: 5.76e-03, grad_scale: 8.0 2023-09-30 05:42:30,181 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:42:31,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:42:33,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 05:42:34,417 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.40 vs. limit=22.5 2023-09-30 05:42:36,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:42:38,411 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=614040.0, ans=0.125 2023-09-30 05:42:38,635 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.37 vs. limit=22.5 2023-09-30 05:42:39,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 05:42:41,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:42:45,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:42:46,620 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.378e+02 1.840e+02 1.957e+02 2.222e+02 3.420e+02, threshold=3.915e+02, percent-clipped=0.0 2023-09-30 05:42:48,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:42:48,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:42:50,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:42:52,046 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:42:52,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-30 05:42:54,820 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:42:58,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:43:00,003 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=614173.3333333334, ans=0.125 2023-09-30 05:43:01,312 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-30 05:43:03,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-30 05:43:03,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-30 05:43:05,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:43:06,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:43:06,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:43:06,916 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=614173.3333333334, ans=0.125 2023-09-30 05:43:08,102 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:43:12,952 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-30 05:43:14,595 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:43:16,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:43:18,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-30 05:43:19,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-30 05:43:21,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-30 05:43:22,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:43:22,281 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=614240.0, ans=0.04949747468305833 2023-09-30 05:43:24,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 05:43:26,133 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.21 vs. limit=6.0 2023-09-30 05:43:28,948 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=614240.0, ans=0.0 2023-09-30 05:43:30,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-30 05:43:37,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:43:37,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-30 05:43:39,407 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:43:39,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:43:39,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:43:41,618 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-30 05:43:44,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:43:44,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:43:46,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-30 05:43:46,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:43:46,533 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=614306.6666666666, ans=0.0 2023-09-30 05:43:49,301 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:43:49,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:43:49,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:43:50,779 INFO [train.py:1039] (2/4) Epoch 18, batch 1850, loss[loss=0.1775, simple_loss=0.2599, pruned_loss=0.04753, over 24656.00 frames. ], tot_loss[loss=0.1795, simple_loss=0.2546, pruned_loss=0.05218, over 4716899.41 frames. ], batch size: 65, lr: 5.76e-03, grad_scale: 8.0 2023-09-30 05:43:52,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:43:52,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:43:54,608 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:43:55,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:43:58,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:43:58,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:44:00,143 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=614373.3333333334, ans=0.125 2023-09-30 05:44:03,840 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=614373.3333333334, ans=0.1 2023-09-30 05:44:06,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:44:07,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-30 05:44:08,308 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=614440.0, ans=0.035 2023-09-30 05:44:09,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-30 05:44:11,554 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=614440.0, ans=0.1 2023-09-30 05:44:14,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-30 05:44:17,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:44:17,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-30 05:44:17,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 05:44:27,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:44:30,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-30 05:44:33,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:44:33,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:44:38,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-30 05:44:38,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:44:38,324 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=614506.6666666666, ans=0.125 2023-09-30 05:44:39,507 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 05:44:39,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:44:41,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:44:44,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:44:46,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-30 05:44:48,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:44:48,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 05:44:48,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:44:51,081 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:44:52,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:44:55,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-30 05:44:55,729 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:44:58,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:44:58,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:44:58,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-30 05:44:58,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-30 05:45:02,525 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-30 05:45:04,255 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-30 05:45:04,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 05:45:04,488 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:45:06,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:45:06,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:45:07,995 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-30 05:45:08,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 05:45:08,089 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:45:09,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-30 05:45:09,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 05:45:11,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:45:11,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-30 05:45:14,124 INFO [train.py:1039] (2/4) Epoch 18, batch 1900, loss[loss=0.1625, simple_loss=0.24, pruned_loss=0.04254, over 24321.00 frames. ], tot_loss[loss=0.1806, simple_loss=0.2556, pruned_loss=0.05282, over 4706444.09 frames. ], batch size: 61, lr: 5.75e-03, grad_scale: 8.0 2023-09-30 05:45:14,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:45:14,324 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-30 05:45:14,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:45:15,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:45:21,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:45:23,346 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.50 vs. limit=15.0 2023-09-30 05:45:24,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:45:24,224 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-30 05:45:24,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-30 05:45:24,498 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.min_positive, batch_count=614706.6666666666, ans=0.025 2023-09-30 05:45:25,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:45:27,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:45:27,279 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-30 05:45:28,796 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-30 05:45:29,279 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=614773.3333333334, ans=0.0 2023-09-30 05:45:31,883 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.898e+02 2.159e+02 2.520e+02 3.634e+02, threshold=4.318e+02, percent-clipped=0.0 2023-09-30 05:45:33,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-30 05:45:35,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:45:39,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-30 05:45:41,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-30 05:45:47,695 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=614840.0, ans=0.2 2023-09-30 05:45:48,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-30 05:45:52,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-30 05:45:54,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:45:54,144 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-30 05:45:54,151 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-30 05:45:54,222 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-30 05:45:54,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-30 05:45:54,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:45:56,072 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=614840.0, ans=0.125 2023-09-30 05:45:56,164 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=614840.0, ans=0.0 2023-09-30 05:45:57,516 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=614840.0, ans=0.1 2023-09-30 05:45:58,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-30 05:46:01,165 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.00 vs. limit=15.0 2023-09-30 05:46:01,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:46:02,272 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 05:46:07,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:46:07,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-30 05:46:07,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 05:46:07,700 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=614906.6666666666, ans=0.1 2023-09-30 05:46:11,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-30 05:46:11,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:46:19,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:46:19,301 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:46:19,322 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:46:19,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:46:20,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 05:46:22,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-30 05:46:22,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:46:26,199 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:46:26,202 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:46:29,159 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:46:29,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:46:29,242 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:46:30,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:46:33,807 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 05:46:36,704 INFO [train.py:1039] (2/4) Epoch 18, batch 1950, loss[loss=0.1792, simple_loss=0.2495, pruned_loss=0.0545, over 23532.00 frames. ], tot_loss[loss=0.1814, simple_loss=0.2566, pruned_loss=0.05307, over 4713970.91 frames. ], batch size: 120, lr: 5.75e-03, grad_scale: 8.0 2023-09-30 05:46:36,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:46:38,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:46:38,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 05:46:40,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-30 05:46:40,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 05:46:40,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:46:42,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:46:45,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:46:45,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:46:45,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:46:48,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:46:50,712 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 05:46:50,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 05:46:50,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:46:52,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:46:55,599 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=615106.6666666666, ans=0.2 2023-09-30 05:46:56,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:46:58,844 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.77 vs. limit=22.5 2023-09-30 05:47:00,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:47:00,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:47:00,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-30 05:47:00,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-30 05:47:01,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 05:47:01,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:47:03,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:47:08,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:47:09,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:47:13,094 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=615173.3333333334, ans=0.1 2023-09-30 05:47:16,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 05:47:19,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:47:20,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-30 05:47:21,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-30 05:47:22,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:47:26,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:47:28,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:47:29,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:47:36,460 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=615240.0, ans=10.0 2023-09-30 05:47:39,579 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:47:39,699 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:47:41,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:47:44,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:47:46,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:47:47,457 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:47:47,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-30 05:47:47,579 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 05:47:49,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:47:51,271 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-30 05:47:52,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:47:57,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:47:58,933 INFO [train.py:1039] (2/4) Epoch 18, batch 2000, loss[loss=0.1736, simple_loss=0.2554, pruned_loss=0.04584, over 24465.00 frames. ], tot_loss[loss=0.1819, simple_loss=0.257, pruned_loss=0.0534, over 4713066.53 frames. ], batch size: 66, lr: 5.75e-03, grad_scale: 8.0 2023-09-30 05:47:59,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:48:00,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:48:01,425 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=615373.3333333334, ans=0.0 2023-09-30 05:48:02,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:48:04,760 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:48:09,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-30 05:48:09,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-30 05:48:12,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:48:16,356 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-30 05:48:16,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 05:48:17,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:48:19,245 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.532e+02 1.900e+02 2.106e+02 2.415e+02 3.499e+02, threshold=4.211e+02, percent-clipped=0.0 2023-09-30 05:48:20,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:48:22,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-30 05:48:24,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:48:27,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:48:27,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:48:29,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-30 05:48:29,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 05:48:30,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-30 05:48:30,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:48:34,939 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:48:35,321 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=615506.6666666666, ans=0.1 2023-09-30 05:48:36,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-30 05:48:36,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:48:37,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:48:39,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:48:39,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-30 05:48:43,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-30 05:48:43,019 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:48:43,032 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:48:49,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:48:51,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:48:51,231 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:48:51,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:48:54,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:48:54,613 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=615573.3333333334, ans=0.2 2023-09-30 05:48:55,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:48:55,824 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:48:55,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:48:56,004 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:49:00,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:49:00,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-30 05:49:01,252 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.15 vs. limit=22.5 2023-09-30 05:49:07,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 05:49:07,320 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=615640.0, ans=0.1 2023-09-30 05:49:08,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:49:12,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:49:12,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:49:15,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:49:16,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:49:16,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:49:19,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 05:49:19,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:49:20,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:49:22,090 INFO [train.py:1039] (2/4) Epoch 18, batch 2050, loss[loss=0.1976, simple_loss=0.2607, pruned_loss=0.06718, over 23776.00 frames. ], tot_loss[loss=0.1816, simple_loss=0.256, pruned_loss=0.05362, over 4706785.77 frames. ], batch size: 179, lr: 5.75e-03, grad_scale: 8.0 2023-09-30 05:49:22,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:49:25,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:49:27,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:49:31,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:49:35,015 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:49:36,470 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:49:38,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:49:40,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-30 05:49:40,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:49:40,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:49:42,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-30 05:49:45,572 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=615773.3333333334, ans=0.05 2023-09-30 05:49:46,918 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=615773.3333333334, ans=0.125 2023-09-30 05:49:50,178 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=615773.3333333334, ans=0.125 2023-09-30 05:49:51,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:49:52,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:49:53,146 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-30 05:49:55,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:49:57,615 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=615840.0, ans=0.0 2023-09-30 05:49:58,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-30 05:49:58,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:50:03,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:50:04,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:50:05,043 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-30 05:50:06,394 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:50:06,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:50:06,844 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=615840.0, ans=0.0 2023-09-30 05:50:08,115 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:50:09,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 05:50:14,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:50:16,190 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:50:18,407 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-30 05:50:18,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:50:22,118 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=615906.6666666666, ans=0.2 2023-09-30 05:50:23,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:50:28,480 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:50:30,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-30 05:50:32,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=615973.3333333334, ans=0.0 2023-09-30 05:50:35,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:50:35,799 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=615973.3333333334, ans=0.125 2023-09-30 05:50:37,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:50:40,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:50:43,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-30 05:50:44,549 INFO [train.py:1039] (2/4) Epoch 18, batch 2100, loss[loss=0.1783, simple_loss=0.2523, pruned_loss=0.05216, over 23239.00 frames. ], tot_loss[loss=0.1806, simple_loss=0.2551, pruned_loss=0.05302, over 4712621.17 frames. ], batch size: 119, lr: 5.75e-03, grad_scale: 8.0 2023-09-30 05:50:46,769 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-30 05:50:46,770 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:50:48,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:50:48,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 05:50:49,843 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:50:49,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-30 05:50:51,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-30 05:50:53,147 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:50:56,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:50:57,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:50:59,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:50:59,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:51:00,888 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-30 05:51:01,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 05:51:01,121 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-30 05:51:01,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-30 05:51:03,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:51:03,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:51:03,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-30 05:51:04,692 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.913e+02 2.117e+02 2.530e+02 3.526e+02, threshold=4.234e+02, percent-clipped=0.0 2023-09-30 05:51:04,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 05:51:10,118 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-30 05:51:10,119 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 05:51:10,961 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.47 vs. limit=10.0 2023-09-30 05:51:11,784 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 05:51:14,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:51:14,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:51:17,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:51:19,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-30 05:51:21,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:51:21,130 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 05:51:22,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-30 05:51:22,863 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:51:22,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-30 05:51:23,036 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=616173.3333333334, ans=0.125 2023-09-30 05:51:24,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-30 05:51:24,975 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-30 05:51:26,785 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=616173.3333333334, ans=0.5 2023-09-30 05:51:28,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:51:28,199 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=616173.3333333334, ans=0.125 2023-09-30 05:51:29,568 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:51:31,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 05:51:34,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 05:51:35,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:51:39,110 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:51:39,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-30 05:51:39,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:51:39,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:51:39,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:51:41,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-30 05:51:42,856 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-30 05:51:42,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-30 05:51:46,334 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=616240.0, ans=0.0 2023-09-30 05:51:47,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:51:50,656 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:51:50,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-30 05:51:55,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:51:58,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:51:58,365 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=616306.6666666666, ans=0.0 2023-09-30 05:51:58,851 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=15.36 vs. limit=15.0 2023-09-30 05:51:59,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:51:59,480 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:51:59,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-30 05:51:59,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 05:52:01,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:52:02,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-30 05:52:02,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:52:02,709 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:52:03,005 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=616306.6666666666, ans=0.125 2023-09-30 05:52:04,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-30 05:52:05,892 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-30 05:52:05,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:52:07,385 INFO [train.py:1039] (2/4) Epoch 18, batch 2150, loss[loss=0.1736, simple_loss=0.2188, pruned_loss=0.06422, over 19314.00 frames. ], tot_loss[loss=0.1794, simple_loss=0.2534, pruned_loss=0.05275, over 4706667.39 frames. ], batch size: 389, lr: 5.75e-03, grad_scale: 8.0 2023-09-30 05:52:08,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:52:08,999 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:52:09,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:52:09,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:52:16,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 05:52:17,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:52:19,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:52:21,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:52:21,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:52:21,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:52:25,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:52:27,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:52:27,617 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:52:30,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:52:30,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-30 05:52:36,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:52:36,926 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.54 vs. limit=22.5 2023-09-30 05:52:37,618 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:52:39,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:52:39,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:52:39,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:52:39,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-30 05:52:40,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:52:40,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:52:40,779 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:52:42,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-30 05:52:43,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:52:45,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:52:46,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:52:47,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 05:52:47,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:52:50,744 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:52:50,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-30 05:52:53,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:52:53,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-30 05:52:53,962 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:52:57,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:52:58,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:52:58,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:53:00,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 05:53:00,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:53:01,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:53:01,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-30 05:53:05,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-30 05:53:05,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:53:05,872 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-30 05:53:05,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:53:05,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:53:07,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-30 05:53:07,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:53:07,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-30 05:53:07,400 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-30 05:53:07,400 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-30 05:53:07,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-30 05:53:10,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:53:10,586 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:53:10,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:53:12,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:53:13,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 05:53:15,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:53:15,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:53:24,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:53:24,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-30 05:53:27,217 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=616640.0, ans=0.1 2023-09-30 05:53:29,814 INFO [train.py:1039] (2/4) Epoch 18, batch 2200, loss[loss=0.1599, simple_loss=0.2373, pruned_loss=0.0413, over 24609.00 frames. ], tot_loss[loss=0.1801, simple_loss=0.2541, pruned_loss=0.05304, over 4707779.00 frames. ], batch size: 60, lr: 5.74e-03, grad_scale: 8.0 2023-09-30 05:53:29,870 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:53:33,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:53:33,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:53:33,455 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=616706.6666666666, ans=0.125 2023-09-30 05:53:34,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:53:37,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-30 05:53:40,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:53:40,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:53:40,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-30 05:53:45,185 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=616773.3333333334, ans=0.0 2023-09-30 05:53:46,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-30 05:53:47,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 05:53:49,934 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.966e+02 2.292e+02 2.651e+02 4.144e+02, threshold=4.583e+02, percent-clipped=0.0 2023-09-30 05:53:56,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-30 05:53:59,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:54:00,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:54:02,076 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:54:03,833 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:54:05,225 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-30 05:54:09,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-30 05:54:11,096 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:54:11,225 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-30 05:54:15,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-30 05:54:18,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:54:21,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:54:22,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:54:24,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-30 05:54:26,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:54:28,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-30 05:54:30,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:54:30,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-30 05:54:30,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:54:32,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:54:33,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:54:33,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:54:33,552 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:54:35,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-30 05:54:35,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:54:36,767 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 05:54:39,913 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 05:54:40,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:54:40,305 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=616973.3333333334, ans=0.125 2023-09-30 05:54:40,345 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=616973.3333333334, ans=0.0 2023-09-30 05:54:42,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-30 05:54:44,456 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-30 05:54:44,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:54:46,752 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-30 05:54:46,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-30 05:54:47,633 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-30 05:54:50,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:54:50,638 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-30 05:54:52,043 INFO [train.py:1039] (2/4) Epoch 18, batch 2250, loss[loss=0.1829, simple_loss=0.25, pruned_loss=0.05786, over 23724.00 frames. ], tot_loss[loss=0.1807, simple_loss=0.2554, pruned_loss=0.05298, over 4716123.37 frames. ], batch size: 232, lr: 5.74e-03, grad_scale: 8.0 2023-09-30 05:54:52,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:54:53,752 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-30 05:54:55,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:54:57,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:55:01,321 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 05:55:03,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:55:05,705 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=617040.0, ans=0.125 2023-09-30 05:55:06,822 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-30 05:55:09,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:55:10,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:55:11,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:55:14,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-30 05:55:14,735 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:55:14,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:55:17,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-30 05:55:19,339 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:55:19,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:55:20,895 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:55:26,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:55:27,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 05:55:29,146 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-30 05:55:29,499 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=617173.3333333334, ans=0.1 2023-09-30 05:55:30,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-30 05:55:30,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:55:34,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:55:39,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:55:41,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:55:42,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:55:42,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:55:44,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:55:45,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:55:50,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:55:53,464 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-30 05:55:59,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 05:55:59,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-30 05:55:59,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:56:01,723 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.84 vs. limit=15.0 2023-09-30 05:56:04,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 05:56:07,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-30 05:56:07,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-30 05:56:07,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:56:09,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:56:10,946 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=617306.6666666666, ans=0.125 2023-09-30 05:56:11,083 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=617306.6666666666, ans=0.125 2023-09-30 05:56:12,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-30 05:56:14,354 INFO [train.py:1039] (2/4) Epoch 18, batch 2300, loss[loss=0.1911, simple_loss=0.264, pruned_loss=0.05907, over 23679.00 frames. ], tot_loss[loss=0.1806, simple_loss=0.2554, pruned_loss=0.05291, over 4721764.62 frames. ], batch size: 135, lr: 5.74e-03, grad_scale: 8.0 2023-09-30 05:56:14,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:56:14,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:56:20,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:56:20,614 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:56:23,517 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-30 05:56:26,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:56:33,609 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 1.928e+02 2.258e+02 2.853e+02 4.796e+02, threshold=4.517e+02, percent-clipped=2.0 2023-09-30 05:56:33,734 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:56:33,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-30 05:56:33,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:56:33,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:56:33,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-30 05:56:35,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:56:38,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:56:38,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:56:40,870 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 05:56:42,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-30 05:56:47,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:56:52,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:56:53,653 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:56:56,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:57:00,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:57:04,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:57:05,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 05:57:05,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:57:05,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-30 05:57:06,210 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=617573.3333333334, ans=0.125 2023-09-30 05:57:10,488 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 05:57:10,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:57:10,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:57:10,598 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:57:12,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:57:12,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 05:57:12,205 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:57:13,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-30 05:57:13,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:57:13,632 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:57:13,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-30 05:57:17,836 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=617573.3333333334, ans=0.125 2023-09-30 05:57:22,446 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:57:25,776 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=617640.0, ans=0.0 2023-09-30 05:57:27,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:57:30,265 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:57:30,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:57:31,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-30 05:57:35,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 05:57:35,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:57:35,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 05:57:35,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-30 05:57:37,195 INFO [train.py:1039] (2/4) Epoch 18, batch 2350, loss[loss=0.1602, simple_loss=0.24, pruned_loss=0.04023, over 24451.00 frames. ], tot_loss[loss=0.1818, simple_loss=0.2565, pruned_loss=0.05351, over 4712417.15 frames. ], batch size: 63, lr: 5.74e-03, grad_scale: 8.0 2023-09-30 05:57:42,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:57:42,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-30 05:57:47,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-30 05:57:49,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:57:52,808 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:57:52,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:57:52,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:57:54,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:57:56,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-30 05:57:57,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:58:01,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-30 05:58:02,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:58:06,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:58:06,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:58:11,552 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:58:13,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-30 05:58:14,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:58:14,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:58:14,891 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:58:16,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:58:19,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:58:22,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-30 05:58:22,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:58:25,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:58:26,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:58:29,917 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=617906.6666666666, ans=0.125 2023-09-30 05:58:31,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-30 05:58:31,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-30 05:58:34,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-30 05:58:34,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-30 05:58:40,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-30 05:58:43,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-30 05:58:45,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:58:45,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-30 05:58:45,369 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-30 05:58:45,407 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-30 05:58:48,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-30 05:58:50,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:58:50,888 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=617973.3333333334, ans=0.125 2023-09-30 05:58:56,624 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:58:59,658 INFO [train.py:1039] (2/4) Epoch 18, batch 2400, loss[loss=0.1744, simple_loss=0.2474, pruned_loss=0.05066, over 23553.00 frames. ], tot_loss[loss=0.1812, simple_loss=0.2559, pruned_loss=0.05325, over 4721343.67 frames. ], batch size: 134, lr: 5.74e-03, grad_scale: 16.0 2023-09-30 05:58:59,865 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:59:02,024 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:59:03,432 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-30 05:59:04,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-30 05:59:13,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 05:59:13,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:59:14,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-30 05:59:17,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:59:18,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:59:18,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-30 05:59:20,011 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.950e+02 2.202e+02 2.533e+02 3.814e+02, threshold=4.404e+02, percent-clipped=0.0 2023-09-30 05:59:25,363 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:59:26,971 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-30 05:59:31,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-30 05:59:34,863 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-30 05:59:38,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:59:38,665 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=618173.3333333334, ans=0.025 2023-09-30 05:59:42,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:59:46,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:59:46,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-30 05:59:46,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:59:58,589 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:00:01,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:00:03,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:00:06,378 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:00:06,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-30 06:00:06,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:00:06,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:00:06,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:00:06,562 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 06:00:11,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:00:11,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 06:00:12,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-30 06:00:12,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-30 06:00:14,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:00:16,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:00:16,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-30 06:00:16,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-30 06:00:16,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-30 06:00:16,653 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-30 06:00:18,171 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-30 06:00:20,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:00:20,362 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:00:20,627 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=618306.6666666666, ans=0.0 2023-09-30 06:00:21,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:00:23,278 INFO [train.py:1039] (2/4) Epoch 18, batch 2450, loss[loss=0.1847, simple_loss=0.2411, pruned_loss=0.06415, over 23614.00 frames. ], tot_loss[loss=0.1803, simple_loss=0.2546, pruned_loss=0.05297, over 4699926.40 frames. ], batch size: 256, lr: 5.74e-03, grad_scale: 16.0 2023-09-30 06:00:23,333 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-30 06:00:24,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:00:24,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-30 06:00:28,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:00:28,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:00:33,791 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:00:33,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:00:35,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-30 06:00:35,777 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=618373.3333333334, ans=0.125 2023-09-30 06:00:35,778 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=618373.3333333334, ans=0.0 2023-09-30 06:00:38,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:00:38,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:00:43,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:00:43,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:00:43,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:00:44,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-30 06:00:49,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:00:51,603 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 06:00:53,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:00:56,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-30 06:00:56,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 06:00:58,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 06:00:58,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:01:01,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-30 06:01:03,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:01:08,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:01:10,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:01:11,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:01:11,949 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:01:12,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:01:13,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:01:14,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-30 06:01:20,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 06:01:20,163 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:01:20,952 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.36 vs. limit=22.5 2023-09-30 06:01:24,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:01:24,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:01:31,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:01:31,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-30 06:01:31,637 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=618640.0, ans=0.1 2023-09-30 06:01:32,803 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:01:32,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:01:32,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-30 06:01:34,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:01:34,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:01:34,659 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=618640.0, ans=0.2 2023-09-30 06:01:39,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:01:42,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:01:43,435 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:01:46,464 INFO [train.py:1039] (2/4) Epoch 18, batch 2500, loss[loss=0.1927, simple_loss=0.2635, pruned_loss=0.061, over 23422.00 frames. ], tot_loss[loss=0.1798, simple_loss=0.2546, pruned_loss=0.05247, over 4709535.12 frames. ], batch size: 93, lr: 5.74e-03, grad_scale: 16.0 2023-09-30 06:01:46,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-30 06:01:48,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:01:51,397 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=618706.6666666666, ans=0.1 2023-09-30 06:01:54,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:02:03,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:02:04,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:02:04,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:02:04,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-30 06:02:06,275 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.482e+02 1.833e+02 2.012e+02 2.316e+02 3.261e+02, threshold=4.025e+02, percent-clipped=0.0 2023-09-30 06:02:09,697 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=618773.3333333334, ans=0.125 2023-09-30 06:02:09,796 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=618773.3333333334, ans=0.1 2023-09-30 06:02:13,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 06:02:13,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:02:14,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-30 06:02:14,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 06:02:14,893 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-30 06:02:18,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:02:18,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:02:18,759 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=618840.0, ans=0.07 2023-09-30 06:02:19,891 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-30 06:02:19,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:02:20,029 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-30 06:02:20,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:02:24,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:02:26,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:02:27,977 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 06:02:29,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-30 06:02:31,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:02:32,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:02:38,133 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:02:41,234 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:02:45,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:02:49,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-30 06:02:53,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-30 06:02:53,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:02:53,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-30 06:02:56,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:02:56,408 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:02:56,572 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-30 06:02:56,573 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-30 06:02:56,591 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-30 06:03:01,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:03:04,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-30 06:03:04,770 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-30 06:03:04,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:03:04,978 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-30 06:03:09,325 INFO [train.py:1039] (2/4) Epoch 18, batch 2550, loss[loss=0.1748, simple_loss=0.2512, pruned_loss=0.0492, over 23368.00 frames. ], tot_loss[loss=0.1799, simple_loss=0.2551, pruned_loss=0.05233, over 4711916.92 frames. ], batch size: 119, lr: 5.73e-03, grad_scale: 16.0 2023-09-30 06:03:09,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-30 06:03:09,915 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=619040.0, ans=0.125 2023-09-30 06:03:11,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:03:11,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:03:13,421 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:03:16,432 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:03:16,568 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-30 06:03:18,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-30 06:03:21,837 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-30 06:03:23,425 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:03:25,794 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=619106.6666666666, ans=0.125 2023-09-30 06:03:26,727 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:03:28,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:03:28,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 06:03:29,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 06:03:29,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:03:31,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:03:31,962 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.66 vs. limit=6.0 2023-09-30 06:03:33,048 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-30 06:03:34,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-30 06:03:34,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-30 06:03:34,389 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:03:34,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-30 06:03:38,857 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.55 vs. limit=22.5 2023-09-30 06:03:48,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:03:53,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:03:53,655 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:03:53,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:03:55,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 06:04:01,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:04:06,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 06:04:06,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:04:06,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:04:07,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-30 06:04:07,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-30 06:04:12,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:04:12,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:04:17,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:04:17,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-30 06:04:17,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:04:19,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:04:20,521 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-30 06:04:20,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 06:04:22,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:04:31,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:04:32,409 INFO [train.py:1039] (2/4) Epoch 18, batch 2600, loss[loss=0.1707, simple_loss=0.2488, pruned_loss=0.04636, over 24651.00 frames. ], tot_loss[loss=0.1802, simple_loss=0.2556, pruned_loss=0.05243, over 4726303.99 frames. ], batch size: 65, lr: 5.73e-03, grad_scale: 16.0 2023-09-30 06:04:32,646 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:04:35,682 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-30 06:04:35,906 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=619373.3333333334, ans=0.2 2023-09-30 06:04:37,740 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.23 vs. limit=15.0 2023-09-30 06:04:39,292 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-30 06:04:39,328 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:04:40,691 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-30 06:04:40,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-30 06:04:40,859 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-30 06:04:44,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:04:44,780 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-30 06:04:44,987 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=619373.3333333334, ans=0.0 2023-09-30 06:04:46,341 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-30 06:04:47,846 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-30 06:04:48,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=619440.0, ans=0.07 2023-09-30 06:04:49,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:04:51,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-30 06:04:52,580 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 1.862e+02 2.048e+02 2.291e+02 3.453e+02, threshold=4.097e+02, percent-clipped=0.0 2023-09-30 06:04:52,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-30 06:04:54,241 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-30 06:04:54,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-30 06:04:55,943 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-30 06:04:57,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-30 06:05:04,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:05:04,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:05:04,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:05:04,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-30 06:05:06,641 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=619506.6666666666, ans=0.0 2023-09-30 06:05:07,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:05:11,474 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.70 vs. limit=15.0 2023-09-30 06:05:16,015 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-30 06:05:23,310 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.26 vs. limit=22.5 2023-09-30 06:05:24,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:05:24,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:05:25,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-30 06:05:25,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:05:25,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:05:27,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-30 06:05:30,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-30 06:05:30,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:05:33,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:05:38,585 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-30 06:05:38,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:05:38,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:05:38,968 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=619640.0, ans=0.125 2023-09-30 06:05:43,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:05:45,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:05:45,466 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-30 06:05:45,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:05:50,076 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:05:50,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:05:54,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-30 06:05:55,847 INFO [train.py:1039] (2/4) Epoch 18, batch 2650, loss[loss=0.1619, simple_loss=0.2383, pruned_loss=0.04276, over 24644.00 frames. ], tot_loss[loss=0.1801, simple_loss=0.2556, pruned_loss=0.0523, over 4733908.00 frames. ], batch size: 60, lr: 5.73e-03, grad_scale: 16.0 2023-09-30 06:05:56,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:05:57,652 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 06:06:00,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-30 06:06:00,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:06:02,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 06:06:02,459 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-30 06:06:02,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:06:06,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:06:08,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 06:06:08,773 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 06:06:08,795 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=619706.6666666666, ans=0.1 2023-09-30 06:06:10,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:06:12,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:06:13,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-30 06:06:13,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 06:06:13,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:06:18,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-30 06:06:20,319 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-30 06:06:23,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:06:26,000 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.00 vs. limit=15.0 2023-09-30 06:06:26,786 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-30 06:06:28,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:06:28,313 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-30 06:06:29,059 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=10.10 vs. limit=15.0 2023-09-30 06:06:33,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:06:33,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:06:33,612 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:06:34,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:06:35,372 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=619840.0, ans=0.0 2023-09-30 06:06:35,406 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=619840.0, ans=0.125 2023-09-30 06:06:38,515 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=619840.0, ans=0.2 2023-09-30 06:06:39,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-30 06:06:39,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-30 06:06:41,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-30 06:06:41,604 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=619840.0, ans=0.1 2023-09-30 06:06:44,518 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-30 06:06:44,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:06:46,028 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:06:46,097 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-30 06:06:47,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:06:48,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:06:51,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:06:53,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:06:54,672 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:06:54,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-30 06:06:56,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:06:57,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:06:59,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:06:59,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:07:01,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:07:02,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-30 06:07:05,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:07:06,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:07:06,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:07:06,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-30 06:07:10,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:07:11,714 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:07:14,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:07:15,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:07:16,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-30 06:07:16,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:07:17,506 INFO [train.py:1039] (2/4) Epoch 18, batch 2700, loss[loss=0.1764, simple_loss=0.2641, pruned_loss=0.0444, over 24463.00 frames. ], tot_loss[loss=0.1816, simple_loss=0.2567, pruned_loss=0.05323, over 4725626.83 frames. ], batch size: 69, lr: 5.73e-03, grad_scale: 16.0 2023-09-30 06:07:19,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:07:19,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-30 06:07:20,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:07:23,334 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=620040.0, ans=0.125 2023-09-30 06:07:24,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 06:07:26,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:07:26,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:07:28,039 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:07:28,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:07:28,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:07:28,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:07:29,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-30 06:07:29,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-30 06:07:29,863 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:07:32,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-30 06:07:33,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 06:07:35,504 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:07:38,509 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.935e+02 2.170e+02 2.390e+02 3.266e+02, threshold=4.340e+02, percent-clipped=0.0 2023-09-30 06:07:38,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:07:40,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-30 06:07:40,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:07:40,621 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=620106.6666666666, ans=0.125 2023-09-30 06:07:47,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:07:47,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:07:52,356 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:07:53,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:07:53,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:07:53,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:07:55,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:08:01,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:08:01,141 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:08:01,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:08:04,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:08:04,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-30 06:08:10,782 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=620240.0, ans=0.0 2023-09-30 06:08:12,077 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=620240.0, ans=0.035 2023-09-30 06:08:14,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:08:16,420 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:08:18,133 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 06:08:18,136 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:08:21,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:08:22,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:08:22,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:08:25,800 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:08:27,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:08:27,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:08:30,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:08:32,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:08:32,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:08:35,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-30 06:08:36,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:08:39,163 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:08:39,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-30 06:08:41,309 INFO [train.py:1039] (2/4) Epoch 18, batch 2750, loss[loss=0.1777, simple_loss=0.2529, pruned_loss=0.05121, over 24660.00 frames. ], tot_loss[loss=0.1803, simple_loss=0.2556, pruned_loss=0.0525, over 4729461.69 frames. ], batch size: 65, lr: 5.73e-03, grad_scale: 16.0 2023-09-30 06:08:41,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-30 06:08:42,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:08:45,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:08:46,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:08:49,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:08:49,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-30 06:08:49,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:08:51,705 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.14 vs. limit=22.5 2023-09-30 06:08:52,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:08:54,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 06:08:55,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:08:55,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:08:55,693 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-30 06:08:55,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:08:55,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:08:56,285 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.65 vs. limit=15.0 2023-09-30 06:09:01,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-30 06:09:03,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:09:03,494 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:09:05,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:09:07,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-30 06:09:07,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:09:07,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:09:08,584 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.40 vs. limit=8.0 2023-09-30 06:09:09,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:09:10,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:09:15,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 06:09:15,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 06:09:15,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:09:17,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:09:18,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 06:09:25,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:09:28,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 06:09:29,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:09:32,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:09:32,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-30 06:09:32,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:09:34,854 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=620573.3333333334, ans=0.125 2023-09-30 06:09:37,800 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=620573.3333333334, ans=0.09899494936611666 2023-09-30 06:09:39,642 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:09:41,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:09:41,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-30 06:09:46,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:09:47,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-30 06:09:53,116 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-30 06:09:54,825 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:09:56,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-30 06:09:57,785 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.25 vs. limit=15.0 2023-09-30 06:09:58,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:09:58,540 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=620640.0, ans=0.1 2023-09-30 06:09:59,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:09:59,864 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-30 06:09:59,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:10:02,952 INFO [train.py:1039] (2/4) Epoch 18, batch 2800, loss[loss=0.1898, simple_loss=0.253, pruned_loss=0.06333, over 23790.00 frames. ], tot_loss[loss=0.1799, simple_loss=0.2551, pruned_loss=0.0523, over 4729246.46 frames. ], batch size: 179, lr: 5.73e-03, grad_scale: 32.0 2023-09-30 06:10:03,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-30 06:10:03,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:10:03,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:10:04,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-30 06:10:04,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:10:06,142 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:10:06,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:10:07,817 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-30 06:10:07,818 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-30 06:10:11,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:10:13,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 06:10:13,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:10:18,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:10:20,057 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-30 06:10:21,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-30 06:10:23,090 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.604e+02 1.862e+02 2.010e+02 2.282e+02 3.813e+02, threshold=4.021e+02, percent-clipped=0.0 2023-09-30 06:10:23,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-30 06:10:24,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:10:25,603 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:10:25,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:10:30,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:10:30,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:10:30,089 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-30 06:10:31,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:10:40,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:10:41,592 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.26 vs. limit=15.0 2023-09-30 06:10:42,324 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:10:45,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:10:47,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:10:47,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:10:49,611 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.62 vs. limit=15.0 2023-09-30 06:10:49,697 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.84 vs. limit=15.0 2023-09-30 06:10:54,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:10:54,926 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-30 06:10:55,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:10:55,820 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.88 vs. limit=15.0 2023-09-30 06:10:56,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:10:56,485 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:11:01,067 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:11:01,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:11:03,938 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.03 vs. limit=22.5 2023-09-30 06:11:04,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:11:06,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:11:06,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:11:06,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 06:11:07,888 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 06:11:09,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 06:11:10,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:11:10,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-30 06:11:10,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:11:12,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:11:12,512 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:11:15,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-30 06:11:15,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:11:15,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:11:17,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:11:19,247 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.16 vs. limit=15.0 2023-09-30 06:11:20,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-30 06:11:25,718 INFO [train.py:1039] (2/4) Epoch 18, batch 2850, loss[loss=0.1639, simple_loss=0.2409, pruned_loss=0.04345, over 24613.00 frames. ], tot_loss[loss=0.1791, simple_loss=0.2538, pruned_loss=0.05219, over 4717965.91 frames. ], batch size: 60, lr: 5.72e-03, grad_scale: 32.0 2023-09-30 06:11:27,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:11:27,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 06:11:28,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:11:30,493 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:11:34,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:11:34,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:11:36,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:11:39,309 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:11:40,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:11:42,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:11:42,419 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-30 06:11:42,734 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=621106.6666666666, ans=10.0 2023-09-30 06:11:48,557 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-30 06:11:48,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:11:50,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-30 06:11:50,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:11:55,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-30 06:11:55,461 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-30 06:11:56,956 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:12:03,924 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=621173.3333333334, ans=0.0 2023-09-30 06:12:11,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:12:12,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:12:12,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:12:14,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 06:12:14,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 06:12:14,435 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:12:17,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:12:17,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-30 06:12:19,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-30 06:12:19,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:12:19,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:12:19,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:12:22,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:12:22,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:12:23,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:12:25,405 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:12:28,307 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:12:28,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:12:28,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:12:31,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:12:34,676 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.79 vs. limit=22.5 2023-09-30 06:12:36,075 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=621306.6666666666, ans=0.125 2023-09-30 06:12:37,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:12:40,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-30 06:12:40,715 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-30 06:12:42,934 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 06:12:43,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:12:43,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-30 06:12:44,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:12:44,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:12:46,065 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:12:46,117 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:12:46,118 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-30 06:12:47,635 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-30 06:12:47,641 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:12:47,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:12:49,187 INFO [train.py:1039] (2/4) Epoch 18, batch 2900, loss[loss=0.1719, simple_loss=0.2497, pruned_loss=0.04708, over 24334.00 frames. ], tot_loss[loss=0.179, simple_loss=0.2531, pruned_loss=0.0525, over 4701235.03 frames. ], batch size: 61, lr: 5.72e-03, grad_scale: 32.0 2023-09-30 06:12:52,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-30 06:12:53,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:12:53,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:12:55,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-30 06:12:57,485 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=621373.3333333334, ans=0.125 2023-09-30 06:12:58,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:12:58,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-30 06:13:00,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-30 06:13:01,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:13:01,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:13:03,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:13:05,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:13:08,431 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.787e+02 2.109e+02 2.458e+02 3.664e+02, threshold=4.218e+02, percent-clipped=0.0 2023-09-30 06:13:10,526 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:13:10,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:13:13,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-30 06:13:13,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-30 06:13:14,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-30 06:13:16,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:13:17,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-30 06:13:19,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-30 06:13:22,226 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:13:22,230 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-30 06:13:22,257 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:13:25,230 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:13:25,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-30 06:13:28,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:13:28,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:13:30,454 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=621506.6666666666, ans=0.1 2023-09-30 06:13:33,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:13:36,346 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:13:37,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-30 06:13:37,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-30 06:13:37,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:13:43,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:13:48,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-30 06:13:50,408 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:13:54,416 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:14:02,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:14:02,102 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:14:03,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-30 06:14:08,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:14:08,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-30 06:14:08,105 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:14:08,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-30 06:14:10,964 INFO [train.py:1039] (2/4) Epoch 18, batch 2950, loss[loss=0.1904, simple_loss=0.2521, pruned_loss=0.06435, over 23558.00 frames. ], tot_loss[loss=0.1793, simple_loss=0.2541, pruned_loss=0.05226, over 4720826.93 frames. ], batch size: 256, lr: 5.72e-03, grad_scale: 32.0 2023-09-30 06:14:14,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:14:16,283 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-30 06:14:18,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:14:20,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:14:20,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:14:22,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:14:22,486 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=621706.6666666666, ans=0.2 2023-09-30 06:14:23,531 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-30 06:14:25,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-30 06:14:25,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:14:25,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:14:32,023 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:14:33,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:14:36,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:14:38,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:14:41,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:14:41,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:14:44,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:14:44,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:14:44,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:14:46,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-30 06:14:53,284 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-30 06:14:53,315 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-30 06:14:54,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 06:14:56,269 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-30 06:14:58,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-30 06:14:58,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:14:58,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:14:58,515 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-30 06:14:58,522 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-30 06:15:03,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-30 06:15:03,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:15:03,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:15:06,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:15:08,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:15:08,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:15:08,564 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-30 06:15:09,956 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:15:10,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-30 06:15:10,289 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=621906.6666666666, ans=0.0 2023-09-30 06:15:14,797 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:15:16,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:15:16,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-30 06:15:16,376 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:15:18,320 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=621973.3333333334, ans=0.0 2023-09-30 06:15:19,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-30 06:15:22,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:15:25,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:15:25,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 06:15:28,168 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:15:28,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 06:15:29,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:15:31,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:15:31,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-30 06:15:31,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:15:33,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:15:34,752 INFO [train.py:1039] (2/4) Epoch 18, batch 3000, loss[loss=0.1917, simple_loss=0.2626, pruned_loss=0.06043, over 23725.00 frames. ], tot_loss[loss=0.1789, simple_loss=0.2544, pruned_loss=0.05171, over 4726690.44 frames. ], batch size: 232, lr: 5.72e-03, grad_scale: 32.0 2023-09-30 06:15:34,752 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-30 06:15:49,338 INFO [train.py:1071] (2/4) Epoch 18, validation: loss=0.3403, simple_loss=0.2856, pruned_loss=0.1975, over 1125622.00 frames. 2023-09-30 06:15:49,339 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21076MB 2023-09-30 06:15:49,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:15:51,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:15:51,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-30 06:15:52,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:15:56,264 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:15:56,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-30 06:16:01,465 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-30 06:16:01,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-30 06:16:03,151 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:16:03,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:16:03,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-30 06:16:04,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:16:05,405 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.50 vs. limit=12.0 2023-09-30 06:16:09,646 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.852e+02 2.145e+02 2.482e+02 3.954e+02, threshold=4.290e+02, percent-clipped=0.0 2023-09-30 06:16:11,599 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=622106.6666666666, ans=0.125 2023-09-30 06:16:11,604 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=622106.6666666666, ans=0.125 2023-09-30 06:16:12,804 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 06:16:18,149 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=622106.6666666666, ans=0.125 2023-09-30 06:16:22,633 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:16:25,899 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=622173.3333333334, ans=0.125 2023-09-30 06:16:28,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-30 06:16:31,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:16:34,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:16:36,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:16:36,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:16:36,967 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.27 vs. limit=22.5 2023-09-30 06:16:37,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:16:37,745 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-30 06:16:37,999 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-30 06:16:41,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:16:41,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 06:16:44,591 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 06:16:44,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:16:44,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:16:44,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:16:49,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 06:16:49,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:16:49,351 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:16:51,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:16:53,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-30 06:16:54,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:16:54,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:16:56,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:16:59,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:16:59,810 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.46 vs. limit=15.0 2023-09-30 06:17:00,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:17:02,828 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-30 06:17:02,885 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-30 06:17:02,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:17:02,989 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-30 06:17:04,414 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:17:07,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-30 06:17:10,945 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:17:12,317 INFO [train.py:1039] (2/4) Epoch 18, batch 3050, loss[loss=0.1885, simple_loss=0.2754, pruned_loss=0.05083, over 24290.00 frames. ], tot_loss[loss=0.18, simple_loss=0.2556, pruned_loss=0.05215, over 4725745.40 frames. ], batch size: 74, lr: 5.72e-03, grad_scale: 32.0 2023-09-30 06:17:12,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 06:17:12,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-30 06:17:12,687 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=622373.3333333334, ans=0.0 2023-09-30 06:17:13,959 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-30 06:17:13,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 06:17:15,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:17:15,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:17:15,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-30 06:17:16,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:17:16,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:17:20,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-30 06:17:22,256 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:17:25,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:17:25,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 06:17:30,346 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:17:32,150 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=622440.0, ans=0.05 2023-09-30 06:17:32,190 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=622440.0, ans=0.125 2023-09-30 06:17:33,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-30 06:17:40,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-30 06:17:40,137 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-30 06:17:41,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:17:43,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:17:44,291 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=622506.6666666666, ans=0.0 2023-09-30 06:17:47,041 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:17:48,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:17:48,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:17:53,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:17:53,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:17:53,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:17:54,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:17:54,509 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:17:56,017 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:17:58,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:18:02,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:18:02,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-30 06:18:02,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:18:03,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:18:05,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:18:05,310 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:18:05,402 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:18:06,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:18:09,168 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.64 vs. limit=10.0 2023-09-30 06:18:10,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:18:10,675 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:18:17,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:18:17,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:18:17,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:18:22,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:18:22,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 06:18:22,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:18:23,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-30 06:18:23,843 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=622640.0, ans=0.125 2023-09-30 06:18:25,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:18:25,623 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 06:18:26,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:18:27,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-30 06:18:29,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:18:34,511 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:18:34,714 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=622706.6666666666, ans=0.1 2023-09-30 06:18:35,978 INFO [train.py:1039] (2/4) Epoch 18, batch 3100, loss[loss=0.1872, simple_loss=0.2753, pruned_loss=0.04954, over 24421.00 frames. ], tot_loss[loss=0.1803, simple_loss=0.2558, pruned_loss=0.05238, over 4712734.93 frames. ], batch size: 69, lr: 5.72e-03, grad_scale: 16.0 2023-09-30 06:18:37,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:18:39,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 06:18:40,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-30 06:18:45,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-30 06:18:47,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-30 06:18:49,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:18:52,567 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:18:52,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:18:52,897 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=622773.3333333334, ans=0.0 2023-09-30 06:18:54,644 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.55 vs. limit=15.0 2023-09-30 06:18:55,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-30 06:18:56,073 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=622773.3333333334, ans=0.125 2023-09-30 06:18:57,055 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 1.806e+02 2.048e+02 2.293e+02 3.321e+02, threshold=4.096e+02, percent-clipped=0.0 2023-09-30 06:18:58,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:18:59,023 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=622773.3333333334, ans=0.1 2023-09-30 06:19:04,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-30 06:19:08,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 06:19:08,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:19:08,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:19:09,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:19:09,128 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=622840.0, ans=0.125 2023-09-30 06:19:10,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-30 06:19:13,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:19:13,478 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-30 06:19:13,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:19:13,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:19:17,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-30 06:19:18,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:19:22,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-30 06:19:23,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-30 06:19:24,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-30 06:19:25,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:19:25,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:19:28,728 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:19:28,745 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:19:28,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:19:30,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-30 06:19:30,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:19:33,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:19:33,407 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:19:33,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:19:33,420 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 06:19:39,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:19:40,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-30 06:19:43,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:19:43,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-30 06:19:45,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:19:45,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:19:45,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-30 06:19:57,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-30 06:19:58,738 INFO [train.py:1039] (2/4) Epoch 18, batch 3150, loss[loss=0.1777, simple_loss=0.2537, pruned_loss=0.05081, over 23406.00 frames. ], tot_loss[loss=0.1791, simple_loss=0.2538, pruned_loss=0.05224, over 4715294.04 frames. ], batch size: 93, lr: 5.72e-03, grad_scale: 16.0 2023-09-30 06:20:00,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:20:00,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:20:03,514 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:20:03,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:20:05,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-30 06:20:05,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:20:05,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-30 06:20:06,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-30 06:20:07,437 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.61 vs. limit=12.0 2023-09-30 06:20:08,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:20:09,877 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-30 06:20:10,214 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=623040.0, ans=0.0 2023-09-30 06:20:14,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-30 06:20:14,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:20:16,413 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-30 06:20:16,597 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=623106.6666666666, ans=0.2 2023-09-30 06:20:18,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-30 06:20:18,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-30 06:20:20,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-30 06:20:20,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-30 06:20:20,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:20:20,075 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:20:21,638 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:20:21,955 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=623106.6666666666, ans=0.1 2023-09-30 06:20:23,190 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-30 06:20:27,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:20:27,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:20:27,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:20:28,324 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.08 vs. limit=10.0 2023-09-30 06:20:28,449 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.57 vs. limit=15.0 2023-09-30 06:20:30,601 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-30 06:20:33,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-30 06:20:33,999 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:20:36,966 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-30 06:20:38,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:20:38,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-30 06:20:41,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-30 06:20:43,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:20:43,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 06:20:43,200 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 06:20:44,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:20:44,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 06:20:44,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-30 06:20:46,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-30 06:20:47,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-30 06:20:49,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 06:20:49,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:20:52,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:20:52,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:20:52,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-30 06:20:54,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:20:56,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-30 06:20:56,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:20:58,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-30 06:20:58,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-30 06:21:01,872 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:21:01,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:21:04,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-30 06:21:04,260 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 06:21:05,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:21:08,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:21:10,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:21:10,369 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:21:15,439 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 06:21:16,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:21:18,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-30 06:21:21,423 INFO [train.py:1039] (2/4) Epoch 18, batch 3200, loss[loss=0.1895, simple_loss=0.2571, pruned_loss=0.06099, over 23822.00 frames. ], tot_loss[loss=0.1788, simple_loss=0.2536, pruned_loss=0.052, over 4719733.39 frames. ], batch size: 164, lr: 5.71e-03, grad_scale: 32.0 2023-09-30 06:21:23,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:21:23,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-30 06:21:28,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:21:30,444 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:21:30,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-30 06:21:34,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:21:36,229 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.12 vs. limit=22.5 2023-09-30 06:21:39,212 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-30 06:21:42,389 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:21:43,652 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.425e+02 1.833e+02 1.996e+02 2.319e+02 3.127e+02, threshold=3.992e+02, percent-clipped=0.0 2023-09-30 06:21:50,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:21:59,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-30 06:22:01,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:22:04,201 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.55 vs. limit=22.5 2023-09-30 06:22:04,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-30 06:22:04,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 06:22:08,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:22:08,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:22:10,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:22:15,487 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-30 06:22:17,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-30 06:22:17,436 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=623573.3333333334, ans=0.125 2023-09-30 06:22:20,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-30 06:22:21,842 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-30 06:22:24,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:22:32,777 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:22:32,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 06:22:32,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:22:32,937 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-30 06:22:32,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 06:22:37,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:22:39,008 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-30 06:22:39,218 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=623640.0, ans=0.2 2023-09-30 06:22:41,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-30 06:22:41,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-30 06:22:43,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-30 06:22:44,458 INFO [train.py:1039] (2/4) Epoch 18, batch 3250, loss[loss=0.1768, simple_loss=0.2544, pruned_loss=0.04962, over 24645.00 frames. ], tot_loss[loss=0.1788, simple_loss=0.2535, pruned_loss=0.05203, over 4714214.95 frames. ], batch size: 65, lr: 5.71e-03, grad_scale: 32.0 2023-09-30 06:22:44,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:22:48,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-30 06:22:48,468 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-30 06:22:48,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:22:48,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:22:50,015 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-30 06:22:53,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:22:56,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:23:05,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:23:05,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-30 06:23:07,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:23:07,320 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:23:07,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:23:08,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:23:08,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 06:23:13,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:23:13,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-30 06:23:13,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:23:13,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:23:13,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:23:13,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:23:17,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:23:19,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:23:21,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:23:21,224 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:23:22,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:23:24,253 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:23:24,280 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:23:28,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-30 06:23:30,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:23:30,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:23:32,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:23:32,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-30 06:23:35,935 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=623906.6666666666, ans=0.0 2023-09-30 06:23:37,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:23:44,058 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=623906.6666666666, ans=0.0 2023-09-30 06:23:46,626 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:23:46,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:23:46,666 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-30 06:23:46,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:23:46,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 06:23:46,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:23:50,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-30 06:23:51,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-30 06:23:51,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:23:54,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:23:55,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:23:56,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-30 06:23:57,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:24:00,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:24:00,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:24:02,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-30 06:24:03,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:24:04,243 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=623973.3333333334, ans=0.2 2023-09-30 06:24:04,487 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.08 vs. limit=15.0 2023-09-30 06:24:06,739 INFO [train.py:1039] (2/4) Epoch 18, batch 3300, loss[loss=0.2468, simple_loss=0.3014, pruned_loss=0.09608, over 19686.00 frames. ], tot_loss[loss=0.1802, simple_loss=0.2549, pruned_loss=0.05274, over 4708477.92 frames. ], batch size: 388, lr: 5.71e-03, grad_scale: 16.0 2023-09-30 06:24:06,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:24:06,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-30 06:24:09,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:24:09,972 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-30 06:24:11,556 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=624040.0, ans=0.125 2023-09-30 06:24:12,825 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-30 06:24:12,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-30 06:24:12,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:24:18,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:24:19,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:24:19,829 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:24:22,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 06:24:22,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 06:24:26,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:24:26,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:24:30,040 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.896e+02 2.095e+02 2.468e+02 4.456e+02, threshold=4.189e+02, percent-clipped=2.0 2023-09-30 06:24:33,701 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-30 06:24:33,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:24:33,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:24:35,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:24:36,791 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-30 06:24:36,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:24:36,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 06:24:37,245 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=624106.6666666666, ans=0.125 2023-09-30 06:24:38,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 06:24:38,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:24:38,533 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-30 06:24:43,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:24:43,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-30 06:24:44,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:24:44,825 WARNING [train.py:1197] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-30 06:24:46,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-30 06:24:46,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:24:47,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:24:50,020 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-30 06:24:51,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-30 06:24:52,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:24:54,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-30 06:24:56,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:24:59,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-30 06:24:59,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:25:03,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:25:03,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:25:03,493 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:25:03,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-30 06:25:07,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:25:07,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:25:07,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:25:08,817 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-30 06:25:08,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-30 06:25:12,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-30 06:25:12,183 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:25:12,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:25:15,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:25:15,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:25:16,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 06:25:16,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:25:16,779 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-30 06:25:18,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:25:19,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 06:25:21,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-30 06:25:22,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:25:22,165 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:25:23,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:25:25,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:25:26,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:25:28,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:25:28,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:25:30,142 INFO [train.py:1039] (2/4) Epoch 18, batch 3350, loss[loss=0.2569, simple_loss=0.3077, pruned_loss=0.103, over 18939.00 frames. ], tot_loss[loss=0.182, simple_loss=0.2563, pruned_loss=0.05384, over 4706208.11 frames. ], batch size: 388, lr: 5.71e-03, grad_scale: 16.0 2023-09-30 06:25:33,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:25:35,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:25:35,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:25:39,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:25:40,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:25:44,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:25:44,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:25:45,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-30 06:25:47,255 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-30 06:25:47,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:25:51,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-30 06:25:51,850 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-30 06:25:53,282 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 06:25:53,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:25:53,544 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=624440.0, ans=0.125 2023-09-30 06:25:54,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:25:56,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-30 06:25:56,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:25:56,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:25:57,935 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.34 vs. limit=15.0 2023-09-30 06:25:59,922 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:26:01,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:26:02,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:26:02,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:26:05,283 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.43 vs. limit=6.0 2023-09-30 06:26:06,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:26:09,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:26:09,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:26:13,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:26:15,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:26:16,598 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.74 vs. limit=6.0 2023-09-30 06:26:17,390 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:26:18,719 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:26:20,907 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.28 vs. limit=22.5 2023-09-30 06:26:21,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:26:23,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-30 06:26:24,633 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 06:26:24,686 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-30 06:26:24,735 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:26:25,081 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=624573.3333333334, ans=0.125 2023-09-30 06:26:27,640 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-30 06:26:27,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:26:29,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:26:37,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:26:39,226 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-30 06:26:39,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 06:26:40,847 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:26:42,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:26:46,323 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=624640.0, ans=0.125 2023-09-30 06:26:47,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:26:51,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-30 06:26:51,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 06:26:51,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:26:53,299 INFO [train.py:1039] (2/4) Epoch 18, batch 3400, loss[loss=0.174, simple_loss=0.2366, pruned_loss=0.05573, over 23571.00 frames. ], tot_loss[loss=0.1822, simple_loss=0.2567, pruned_loss=0.05389, over 4719624.71 frames. ], batch size: 256, lr: 5.71e-03, grad_scale: 16.0 2023-09-30 06:26:53,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:26:53,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-30 06:26:54,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:26:54,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-30 06:26:56,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:26:56,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:26:56,624 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-30 06:26:58,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:26:58,127 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-30 06:27:02,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-30 06:27:02,809 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-30 06:27:02,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:27:07,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:27:07,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 06:27:09,495 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:27:09,679 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=624773.3333333334, ans=0.125 2023-09-30 06:27:09,705 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=624773.3333333334, ans=0.125 2023-09-30 06:27:10,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-30 06:27:16,033 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.548e+02 1.809e+02 2.054e+02 2.312e+02 3.383e+02, threshold=4.108e+02, percent-clipped=0.0 2023-09-30 06:27:16,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:27:16,444 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=624773.3333333334, ans=0.1 2023-09-30 06:27:17,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-30 06:27:22,838 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:27:24,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:27:24,485 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:27:27,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-30 06:27:35,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:27:38,472 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 06:27:39,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-30 06:27:45,279 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:27:47,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:27:48,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-30 06:27:48,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:27:50,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:27:50,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:27:50,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:27:53,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:27:58,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:28:00,052 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:28:04,908 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:28:06,985 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-30 06:28:13,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 06:28:14,625 INFO [train.py:1039] (2/4) Epoch 18, batch 3450, loss[loss=0.1552, simple_loss=0.2145, pruned_loss=0.04794, over 22591.00 frames. ], tot_loss[loss=0.1827, simple_loss=0.2569, pruned_loss=0.05419, over 4712198.31 frames. ], batch size: 322, lr: 5.71e-03, grad_scale: 16.0 2023-09-30 06:28:17,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-30 06:28:21,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-30 06:28:21,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:28:23,278 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:28:23,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-30 06:28:23,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:28:27,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:28:30,573 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=625106.6666666666, ans=0.125 2023-09-30 06:28:32,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:28:32,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:28:34,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:28:34,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:28:37,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:28:42,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-30 06:28:47,545 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=625173.3333333334, ans=0.2 2023-09-30 06:28:48,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-30 06:28:48,727 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 06:28:48,805 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:28:52,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:28:57,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-30 06:28:59,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 06:29:04,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:29:05,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:29:07,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-30 06:29:08,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:29:11,302 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=13.38 vs. limit=15.0 2023-09-30 06:29:12,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-30 06:29:12,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:29:12,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:29:16,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:29:18,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-30 06:29:23,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:29:28,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:29:28,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:29:33,566 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:29:35,436 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=625306.6666666666, ans=0.07 2023-09-30 06:29:37,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:29:37,268 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:29:38,606 INFO [train.py:1039] (2/4) Epoch 18, batch 3500, loss[loss=0.1804, simple_loss=0.2714, pruned_loss=0.04467, over 24678.00 frames. ], tot_loss[loss=0.1816, simple_loss=0.2564, pruned_loss=0.05339, over 4732655.41 frames. ], batch size: 73, lr: 5.70e-03, grad_scale: 16.0 2023-09-30 06:29:38,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:29:40,633 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:29:45,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:29:47,554 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:29:48,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-30 06:29:50,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 06:29:53,568 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-30 06:29:54,394 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.02 vs. limit=10.0 2023-09-30 06:29:55,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:29:55,251 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-30 06:30:01,130 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.444e+02 1.872e+02 2.058e+02 2.368e+02 3.255e+02, threshold=4.116e+02, percent-clipped=0.0 2023-09-30 06:30:01,378 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:30:01,529 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:30:03,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:30:03,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:30:03,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-30 06:30:03,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:30:05,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:30:05,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-30 06:30:09,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:30:10,586 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-30 06:30:12,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:30:14,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:30:16,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-30 06:30:16,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:30:17,828 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:30:18,032 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=625506.6666666666, ans=0.07 2023-09-30 06:30:20,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:30:22,864 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:30:24,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:30:24,458 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:30:25,920 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-30 06:30:27,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-30 06:30:27,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-30 06:30:29,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:30:30,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:30:30,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:30:30,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 06:30:31,055 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=625573.3333333334, ans=0.125 2023-09-30 06:30:33,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 06:30:34,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:30:41,908 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:30:42,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-30 06:30:42,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-30 06:30:42,066 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:30:46,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:30:46,589 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:30:46,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:30:49,911 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-30 06:30:50,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:30:53,521 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:30:53,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-30 06:30:55,292 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=625640.0, ans=0.125 2023-09-30 06:30:56,472 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-30 06:30:59,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:31:00,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:31:00,989 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:31:01,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:31:02,362 INFO [train.py:1039] (2/4) Epoch 18, batch 3550, loss[loss=0.1817, simple_loss=0.2713, pruned_loss=0.04604, over 24429.00 frames. ], tot_loss[loss=0.1799, simple_loss=0.2549, pruned_loss=0.05247, over 4726905.71 frames. ], batch size: 69, lr: 5.70e-03, grad_scale: 16.0 2023-09-30 06:31:04,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:31:05,054 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.22 vs. limit=22.5 2023-09-30 06:31:12,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:31:16,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 06:31:16,633 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=625706.6666666666, ans=0.0 2023-09-30 06:31:16,672 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=625706.6666666666, ans=0.125 2023-09-30 06:31:17,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:31:19,535 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:31:20,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:31:22,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:31:22,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 06:31:27,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:31:27,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:31:27,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:31:29,065 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-30 06:31:29,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 06:31:36,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:31:36,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:31:36,893 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=625840.0, ans=0.015 2023-09-30 06:31:38,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:31:38,218 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:31:38,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:31:38,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-30 06:31:38,369 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:31:40,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:31:41,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 06:31:48,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:31:50,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:31:50,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:31:51,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-30 06:31:54,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:31:55,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-30 06:31:57,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:31:58,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:31:58,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:31:59,265 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=625906.6666666666, ans=0.125 2023-09-30 06:32:02,607 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-30 06:32:04,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:32:04,621 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=625906.6666666666, ans=0.2 2023-09-30 06:32:07,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:32:09,067 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-30 06:32:09,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:32:15,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:32:16,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-30 06:32:17,020 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=625973.3333333334, ans=0.0 2023-09-30 06:32:19,122 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=625973.3333333334, ans=0.125 2023-09-30 06:32:23,586 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-30 06:32:23,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:32:24,963 INFO [train.py:1039] (2/4) Epoch 18, batch 3600, loss[loss=0.1734, simple_loss=0.2638, pruned_loss=0.04151, over 24651.00 frames. ], tot_loss[loss=0.1791, simple_loss=0.2543, pruned_loss=0.0519, over 4729074.94 frames. ], batch size: 73, lr: 5.70e-03, grad_scale: 32.0 2023-09-30 06:32:25,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:32:27,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:32:28,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:32:30,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:32:35,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:32:35,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:32:37,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:32:37,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:32:38,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:32:38,731 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-30 06:32:43,501 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 06:32:43,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:32:46,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:32:47,988 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.865e+02 1.967e+02 2.260e+02 3.686e+02, threshold=3.933e+02, percent-clipped=0.0 2023-09-30 06:32:49,792 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:32:51,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:32:52,740 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:32:52,785 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-30 06:32:52,999 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=626106.6666666666, ans=0.125 2023-09-30 06:32:54,963 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:32:56,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:32:58,138 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:32:59,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:33:01,360 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:33:01,934 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.21 vs. limit=15.0 2023-09-30 06:33:03,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:33:03,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-30 06:33:11,899 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=626173.3333333334, ans=0.125 2023-09-30 06:33:13,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:33:14,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 06:33:16,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-30 06:33:21,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:33:25,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:33:27,363 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:33:31,288 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=626306.6666666666, ans=0.125 2023-09-30 06:33:34,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:33:34,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:33:34,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-30 06:33:36,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-30 06:33:38,202 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-30 06:33:41,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:33:41,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:33:42,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-30 06:33:44,160 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:33:44,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:33:44,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:33:45,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-30 06:33:47,875 INFO [train.py:1039] (2/4) Epoch 18, batch 3650, loss[loss=0.1736, simple_loss=0.2535, pruned_loss=0.0468, over 24490.00 frames. ], tot_loss[loss=0.1793, simple_loss=0.2548, pruned_loss=0.05194, over 4732205.79 frames. ], batch size: 63, lr: 5.70e-03, grad_scale: 16.0 2023-09-30 06:33:47,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-30 06:33:49,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:33:51,323 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-30 06:33:53,320 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=626373.3333333334, ans=0.1 2023-09-30 06:33:55,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-30 06:33:57,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:33:57,944 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=626373.3333333334, ans=0.125 2023-09-30 06:34:00,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-30 06:34:02,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-30 06:34:07,988 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:34:07,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-30 06:34:08,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 06:34:13,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-30 06:34:13,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:34:14,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-30 06:34:14,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-30 06:34:14,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:34:14,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-30 06:34:15,013 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=626440.0, ans=0.1 2023-09-30 06:34:16,384 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=626440.0, ans=0.09899494936611666 2023-09-30 06:34:17,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 06:34:19,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:34:19,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:34:20,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:34:24,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-30 06:34:24,423 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-30 06:34:25,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:34:28,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-30 06:34:28,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:34:28,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:34:29,121 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer_na.min_abs, batch_count=626506.6666666666, ans=0.02 2023-09-30 06:34:36,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:34:38,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:34:38,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:34:40,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:34:40,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:34:41,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:34:45,483 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:34:45,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:34:45,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:34:45,988 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=626573.3333333334, ans=0.2 2023-09-30 06:34:49,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 06:34:50,717 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:34:50,827 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:34:52,436 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=626640.0, ans=0.125 2023-09-30 06:34:58,826 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-30 06:35:03,581 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:35:03,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:35:03,750 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-30 06:35:05,186 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:35:05,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:35:06,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:35:07,486 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=44.29 vs. limit=15.0 2023-09-30 06:35:08,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-30 06:35:08,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:35:09,637 INFO [train.py:1039] (2/4) Epoch 18, batch 3700, loss[loss=0.2014, simple_loss=0.2604, pruned_loss=0.07115, over 23779.00 frames. ], tot_loss[loss=0.1811, simple_loss=0.2562, pruned_loss=0.05298, over 4733392.01 frames. ], batch size: 179, lr: 5.70e-03, grad_scale: 16.0 2023-09-30 06:35:11,850 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 06:35:14,849 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:35:14,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:35:17,462 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=626706.6666666666, ans=0.125 2023-09-30 06:35:18,553 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:35:18,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-30 06:35:18,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:35:20,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 06:35:20,150 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 06:35:20,783 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.56 vs. limit=15.0 2023-09-30 06:35:24,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 06:35:28,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:35:28,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:35:29,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:35:29,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:35:31,841 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 06:35:34,543 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 1.984e+02 2.156e+02 2.490e+02 5.109e+02, threshold=4.311e+02, percent-clipped=1.0 2023-09-30 06:35:34,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:35:34,883 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-30 06:35:39,724 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=626773.3333333334, ans=0.125 2023-09-30 06:35:42,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:35:42,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 06:35:45,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 06:35:45,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-30 06:35:45,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:35:49,805 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.43 vs. limit=15.0 2023-09-30 06:35:50,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:35:52,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-30 06:35:54,172 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:35:54,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:35:56,192 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=626840.0, ans=0.125 2023-09-30 06:35:57,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:35:57,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 06:35:59,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 06:36:04,153 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:36:06,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-30 06:36:06,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:36:06,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-30 06:36:08,173 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=626906.6666666666, ans=0.0 2023-09-30 06:36:10,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:36:12,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:36:14,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:36:15,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-30 06:36:17,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:36:17,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-30 06:36:18,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:36:18,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:36:22,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:36:23,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-30 06:36:25,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-30 06:36:25,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:36:25,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:36:27,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:36:28,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:36:30,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:36:30,769 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=627040.0, ans=0.125 2023-09-30 06:36:31,896 INFO [train.py:1039] (2/4) Epoch 18, batch 3750, loss[loss=0.1794, simple_loss=0.2634, pruned_loss=0.04777, over 23871.00 frames. ], tot_loss[loss=0.1816, simple_loss=0.2566, pruned_loss=0.05326, over 4728722.63 frames. ], batch size: 86, lr: 5.70e-03, grad_scale: 16.0 2023-09-30 06:36:32,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:36:34,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:36:34,548 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=627040.0, ans=0.0 2023-09-30 06:36:35,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-30 06:36:37,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 06:36:39,184 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=627040.0, ans=0.125 2023-09-30 06:36:39,230 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=627040.0, ans=0.125 2023-09-30 06:36:40,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-30 06:36:41,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-30 06:36:42,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:36:43,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:36:45,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:36:46,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:36:50,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:36:51,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:36:53,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:36:57,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:37:00,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:37:00,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-30 06:37:01,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:37:03,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:37:04,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:37:08,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-30 06:37:11,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-30 06:37:11,989 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=627173.3333333334, ans=0.125 2023-09-30 06:37:13,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:37:14,572 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.08 vs. limit=15.0 2023-09-30 06:37:15,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:37:15,930 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.82 vs. limit=10.0 2023-09-30 06:37:16,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:37:21,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:37:24,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-30 06:37:28,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-30 06:37:31,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:37:34,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:37:34,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:37:39,305 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 06:37:44,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 06:37:45,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-30 06:37:48,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 06:37:49,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:37:51,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-30 06:37:54,291 INFO [train.py:1039] (2/4) Epoch 18, batch 3800, loss[loss=0.1783, simple_loss=0.2432, pruned_loss=0.0567, over 23727.00 frames. ], tot_loss[loss=0.1811, simple_loss=0.2563, pruned_loss=0.05295, over 4733690.73 frames. ], batch size: 212, lr: 5.70e-03, grad_scale: 16.0 2023-09-30 06:37:59,268 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:38:01,933 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.24 vs. limit=10.0 2023-09-30 06:38:03,312 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=627373.3333333334, ans=0.125 2023-09-30 06:38:04,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:38:06,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 06:38:06,415 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-30 06:38:07,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:38:10,843 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:38:10,974 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-30 06:38:13,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 06:38:13,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:38:14,030 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:38:15,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:38:15,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:38:17,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:38:17,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-30 06:38:19,112 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.428e+02 1.827e+02 2.039e+02 2.367e+02 3.749e+02, threshold=4.078e+02, percent-clipped=0.0 2023-09-30 06:38:20,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-30 06:38:22,985 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:38:25,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:38:27,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:38:29,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 06:38:29,585 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.53 vs. limit=15.0 2023-09-30 06:38:30,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-30 06:38:30,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:38:32,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:38:33,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:38:40,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 06:38:40,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-30 06:38:42,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:38:48,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:38:55,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:38:57,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-30 06:39:00,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-30 06:39:00,783 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=627640.0, ans=0.0 2023-09-30 06:39:01,986 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:39:03,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:39:05,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:39:05,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-30 06:39:08,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-30 06:39:08,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-30 06:39:08,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:39:11,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:39:16,696 INFO [train.py:1039] (2/4) Epoch 18, batch 3850, loss[loss=0.1859, simple_loss=0.2659, pruned_loss=0.05293, over 24450.00 frames. ], tot_loss[loss=0.1805, simple_loss=0.2552, pruned_loss=0.05285, over 4722398.13 frames. ], batch size: 69, lr: 5.69e-03, grad_scale: 16.0 2023-09-30 06:39:18,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:39:18,372 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:39:21,969 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=627706.6666666666, ans=0.2 2023-09-30 06:39:23,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:39:24,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-30 06:39:24,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 06:39:26,792 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:39:29,885 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 06:39:32,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:39:32,957 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=627773.3333333334, ans=0.2 2023-09-30 06:39:35,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-30 06:39:37,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-30 06:39:37,878 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.63 vs. limit=15.0 2023-09-30 06:39:42,554 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:39:45,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:39:49,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:39:49,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:39:52,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:39:52,303 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:39:53,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:39:53,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:39:53,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:39:55,877 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=627840.0, ans=0.125 2023-09-30 06:39:56,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:39:57,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:39:58,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:39:58,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-30 06:39:58,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-30 06:40:00,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:40:00,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:40:05,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:40:05,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:40:05,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-30 06:40:08,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-30 06:40:10,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:40:11,830 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-30 06:40:14,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-30 06:40:19,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:40:21,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:40:24,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:40:25,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-30 06:40:28,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-30 06:40:30,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:40:31,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:40:34,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 06:40:34,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:40:35,072 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:40:37,156 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:40:37,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:40:37,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-30 06:40:37,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:40:38,625 INFO [train.py:1039] (2/4) Epoch 18, batch 3900, loss[loss=0.176, simple_loss=0.2608, pruned_loss=0.04567, over 24563.00 frames. ], tot_loss[loss=0.1803, simple_loss=0.2552, pruned_loss=0.05265, over 4735092.23 frames. ], batch size: 71, lr: 5.69e-03, grad_scale: 16.0 2023-09-30 06:40:38,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-30 06:40:38,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:40:38,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:40:40,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:40:42,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:40:43,587 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:40:43,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:40:43,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:40:45,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:40:45,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-30 06:40:45,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:40:48,949 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:40:50,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 06:40:50,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:40:52,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:40:53,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 06:40:53,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:40:55,697 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-30 06:40:57,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-30 06:40:57,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:40:58,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-30 06:41:00,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:41:00,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-30 06:41:02,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-30 06:41:03,821 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 1.888e+02 2.025e+02 2.251e+02 3.863e+02, threshold=4.050e+02, percent-clipped=0.0 2023-09-30 06:41:08,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:41:08,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:41:08,716 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 06:41:10,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:41:16,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:41:18,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:41:21,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:41:21,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:41:23,320 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:41:28,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:41:28,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:41:37,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 06:41:39,187 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:41:39,499 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=628240.0, ans=0.125 2023-09-30 06:41:46,203 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=628306.6666666666, ans=0.125 2023-09-30 06:41:49,691 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:41:51,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:41:51,423 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-30 06:41:51,673 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=628306.6666666666, ans=0.125 2023-09-30 06:41:52,789 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-30 06:41:52,809 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:41:54,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-30 06:41:56,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:41:57,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-30 06:42:00,967 INFO [train.py:1039] (2/4) Epoch 18, batch 3950, loss[loss=0.1827, simple_loss=0.2503, pruned_loss=0.05755, over 23617.00 frames. ], tot_loss[loss=0.1797, simple_loss=0.2547, pruned_loss=0.05234, over 4740456.28 frames. ], batch size: 256, lr: 5.69e-03, grad_scale: 16.0 2023-09-30 06:42:04,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:42:06,278 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-30 06:42:06,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:42:09,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:42:09,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:42:13,534 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=628373.3333333334, ans=0.1 2023-09-30 06:42:16,479 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-30 06:42:17,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 06:42:18,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-30 06:42:19,466 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-30 06:42:19,503 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:42:23,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:42:23,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-30 06:42:23,336 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:42:25,196 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=628440.0, ans=0.0 2023-09-30 06:42:26,399 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-30 06:42:28,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:42:28,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 06:42:28,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 06:42:28,399 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=628440.0, ans=0.125 2023-09-30 06:42:30,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 06:42:30,501 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=628440.0, ans=0.95 2023-09-30 06:42:31,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:42:35,147 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=628506.6666666666, ans=0.125 2023-09-30 06:42:44,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:42:44,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:42:49,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-30 06:42:54,507 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-30 06:42:54,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-30 06:42:55,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:42:56,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:43:04,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:43:05,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-30 06:43:05,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:43:06,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:43:06,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-30 06:43:11,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:43:12,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:43:15,061 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=628640.0, ans=0.0 2023-09-30 06:43:17,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-30 06:43:24,663 INFO [train.py:1039] (2/4) Epoch 18, batch 4000, loss[loss=0.1637, simple_loss=0.2434, pruned_loss=0.04202, over 18678.00 frames. ], tot_loss[loss=0.18, simple_loss=0.2551, pruned_loss=0.05244, over 4737063.44 frames. ], batch size: 40, lr: 5.69e-03, grad_scale: 32.0 2023-09-30 06:43:27,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:43:37,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:43:42,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:43:43,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:43:44,021 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:43:44,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-30 06:43:44,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:43:45,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-30 06:43:45,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:43:45,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-30 06:43:46,597 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.23 vs. limit=15.0 2023-09-30 06:43:47,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:43:48,590 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.924e+02 2.193e+02 2.601e+02 4.615e+02, threshold=4.387e+02, percent-clipped=1.0 2023-09-30 06:43:51,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:43:52,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:43:52,267 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:43:52,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:43:52,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-30 06:43:54,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:43:56,120 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-30 06:43:57,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:43:57,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:43:59,417 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-30 06:43:59,627 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 06:44:00,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 06:44:00,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:44:12,042 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-30 06:44:12,118 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:44:12,459 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=628906.6666666666, ans=0.04949747468305833 2023-09-30 06:44:14,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:44:15,904 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-30 06:44:16,077 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:44:16,673 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.72 vs. limit=12.0 2023-09-30 06:44:18,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-30 06:44:18,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:44:18,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:44:20,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:44:21,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:44:21,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-30 06:44:21,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:44:23,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-30 06:44:25,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:44:26,622 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-30 06:44:33,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 06:44:36,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 06:44:37,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:44:37,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:44:39,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:44:39,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:44:43,142 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:44:45,897 INFO [train.py:1039] (2/4) Epoch 18, batch 4050, loss[loss=0.1602, simple_loss=0.2347, pruned_loss=0.04282, over 24622.00 frames. ], tot_loss[loss=0.1801, simple_loss=0.2556, pruned_loss=0.05234, over 4752475.46 frames. ], batch size: 60, lr: 5.69e-03, grad_scale: 32.0 2023-09-30 06:44:48,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-30 06:44:48,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-30 06:44:51,054 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 06:44:51,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:44:52,675 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:44:54,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:44:55,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:45:00,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:45:02,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:45:03,760 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 06:45:07,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:45:07,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:45:11,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:45:13,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:45:16,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 06:45:19,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-30 06:45:19,431 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-30 06:45:22,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:45:31,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-30 06:45:31,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:45:34,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:45:38,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:45:38,109 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:45:38,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:45:41,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:45:44,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-30 06:45:44,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 06:45:46,361 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:45:47,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-30 06:45:48,064 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=629240.0, ans=0.0 2023-09-30 06:45:48,097 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=629240.0, ans=0.125 2023-09-30 06:45:52,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:46:01,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-30 06:46:01,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:46:01,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 06:46:04,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-30 06:46:04,854 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-30 06:46:04,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:46:07,716 INFO [train.py:1039] (2/4) Epoch 18, batch 4100, loss[loss=0.2549, simple_loss=0.3089, pruned_loss=0.1005, over 19242.00 frames. ], tot_loss[loss=0.1822, simple_loss=0.2571, pruned_loss=0.05365, over 4735499.77 frames. ], batch size: 388, lr: 5.69e-03, grad_scale: 32.0 2023-09-30 06:46:07,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:46:11,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:46:11,284 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:46:11,496 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=629373.3333333334, ans=0.125 2023-09-30 06:46:11,906 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.68 vs. limit=6.0 2023-09-30 06:46:13,437 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.52 vs. limit=15.0 2023-09-30 06:46:18,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-30 06:46:19,597 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-30 06:46:21,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-30 06:46:22,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-30 06:46:22,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:46:24,187 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:46:24,240 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:46:24,263 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:46:24,375 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-30 06:46:27,461 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:46:27,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:46:27,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:46:29,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:46:29,770 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=629440.0, ans=0.125 2023-09-30 06:46:33,111 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.931e+02 2.115e+02 2.277e+02 3.051e+02, threshold=4.229e+02, percent-clipped=0.0 2023-09-30 06:46:34,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:46:36,241 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:46:36,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:46:36,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-30 06:46:37,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:46:37,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:46:37,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:46:39,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:46:40,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-30 06:46:44,514 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:46:47,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-30 06:46:48,855 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:46:52,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:46:52,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-30 06:46:52,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:46:54,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:46:54,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:46:55,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-30 06:46:57,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-30 06:46:58,911 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:47:00,466 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-30 06:47:00,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:47:00,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:47:05,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:47:09,704 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:47:12,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:47:14,347 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:47:16,951 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=629640.0, ans=0.125 2023-09-30 06:47:23,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:47:23,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:47:26,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:47:29,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:47:30,956 INFO [train.py:1039] (2/4) Epoch 18, batch 4150, loss[loss=0.1949, simple_loss=0.2608, pruned_loss=0.06446, over 23799.00 frames. ], tot_loss[loss=0.1818, simple_loss=0.2566, pruned_loss=0.05348, over 4740940.81 frames. ], batch size: 164, lr: 5.69e-03, grad_scale: 32.0 2023-09-30 06:47:32,605 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:47:34,135 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 06:47:35,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:47:35,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:47:38,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-30 06:47:38,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:47:38,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-30 06:47:40,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-30 06:47:40,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-30 06:47:42,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:47:47,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:47:47,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:47:52,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:47:54,058 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:47:56,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-30 06:47:57,799 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=629773.3333333334, ans=0.0 2023-09-30 06:47:59,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 06:47:59,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:48:00,542 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-30 06:48:02,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:48:05,640 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:48:05,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-30 06:48:09,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-30 06:48:09,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:48:11,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-30 06:48:11,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:48:11,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:48:13,241 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=629840.0, ans=0.125 2023-09-30 06:48:15,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:48:17,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:48:20,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-30 06:48:23,959 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:48:24,139 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=629906.6666666666, ans=0.125 2023-09-30 06:48:25,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:48:26,925 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-30 06:48:29,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:48:30,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-30 06:48:33,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:48:33,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:48:35,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:48:36,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-30 06:48:36,687 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:48:36,690 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-30 06:48:38,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 06:48:41,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-30 06:48:41,304 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:48:41,319 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 06:48:41,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 06:48:42,948 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-30 06:48:43,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:48:43,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 06:48:44,522 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:48:46,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:48:46,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-30 06:48:46,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:48:53,164 INFO [train.py:1039] (2/4) Epoch 18, batch 4200, loss[loss=0.1798, simple_loss=0.2592, pruned_loss=0.05022, over 24509.00 frames. ], tot_loss[loss=0.181, simple_loss=0.2556, pruned_loss=0.05322, over 4739695.39 frames. ], batch size: 63, lr: 5.68e-03, grad_scale: 32.0 2023-09-30 06:48:53,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:48:56,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-30 06:48:56,536 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 06:49:00,230 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:49:02,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 06:49:03,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:49:03,756 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:49:05,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-30 06:49:07,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-30 06:49:07,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:49:07,478 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=630040.0, ans=0.125 2023-09-30 06:49:08,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:49:11,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:49:13,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-30 06:49:15,372 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.62 vs. limit=12.0 2023-09-30 06:49:16,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:49:17,440 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 1.914e+02 2.122e+02 2.477e+02 4.078e+02, threshold=4.245e+02, percent-clipped=0.0 2023-09-30 06:49:17,595 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:49:17,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-30 06:49:17,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:49:17,965 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=630106.6666666666, ans=0.0 2023-09-30 06:49:19,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:49:19,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:49:19,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 06:49:21,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 06:49:25,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-30 06:49:26,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:49:29,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-30 06:49:31,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:49:33,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:49:33,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:49:34,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=630173.3333333334, ans=0.125 2023-09-30 06:49:35,705 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=630173.3333333334, ans=0.0 2023-09-30 06:49:37,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:49:37,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-30 06:49:37,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:49:38,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:49:38,839 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-30 06:49:44,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-30 06:49:46,070 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:49:52,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:49:55,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-30 06:49:58,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:50:04,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 06:50:04,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:50:06,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-30 06:50:11,108 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:50:13,020 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=630306.6666666666, ans=0.0 2023-09-30 06:50:14,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:50:15,763 INFO [train.py:1039] (2/4) Epoch 18, batch 4250, loss[loss=0.17, simple_loss=0.2515, pruned_loss=0.04419, over 24634.00 frames. ], tot_loss[loss=0.18, simple_loss=0.2548, pruned_loss=0.05258, over 4730918.05 frames. ], batch size: 65, lr: 5.68e-03, grad_scale: 32.0 2023-09-30 06:50:15,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:50:18,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:50:20,678 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=630373.3333333334, ans=0.125 2023-09-30 06:50:25,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-30 06:50:25,119 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-30 06:50:26,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:50:28,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:50:32,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:50:36,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:50:37,526 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:50:39,174 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:50:39,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:50:40,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:50:42,283 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:50:42,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:50:44,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:50:45,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:50:47,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-30 06:50:50,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-30 06:50:51,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:50:53,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:50:53,111 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:50:53,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-30 06:50:53,266 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:50:54,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:50:58,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-30 06:50:59,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:51:03,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:51:04,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:51:07,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-30 06:51:07,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 06:51:08,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-30 06:51:10,281 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:51:12,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-30 06:51:15,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:51:15,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:51:18,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-30 06:51:20,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 06:51:21,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-30 06:51:25,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:51:27,030 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.75 vs. limit=15.0 2023-09-30 06:51:29,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:51:30,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:51:31,221 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=630640.0, ans=0.0 2023-09-30 06:51:32,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:51:32,556 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:51:33,551 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.16 vs. limit=15.0 2023-09-30 06:51:34,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:51:36,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:51:36,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-30 06:51:37,586 INFO [train.py:1039] (2/4) Epoch 18, batch 4300, loss[loss=0.1869, simple_loss=0.2716, pruned_loss=0.05112, over 24406.00 frames. ], tot_loss[loss=0.179, simple_loss=0.2533, pruned_loss=0.0524, over 4707875.26 frames. ], batch size: 77, lr: 5.68e-03, grad_scale: 16.0 2023-09-30 06:51:37,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:51:39,626 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=630706.6666666666, ans=0.125 2023-09-30 06:51:42,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:51:42,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:51:47,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:51:56,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:51:56,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-30 06:51:56,356 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:51:59,248 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-30 06:51:59,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:51:59,298 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-30 06:52:02,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 06:52:03,679 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 1.954e+02 2.321e+02 2.799e+02 4.498e+02, threshold=4.642e+02, percent-clipped=1.0 2023-09-30 06:52:05,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 06:52:07,602 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-30 06:52:07,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:52:09,114 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-30 06:52:10,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 06:52:12,262 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:52:13,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:52:13,988 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:52:16,172 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:52:17,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:52:19,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:52:19,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-30 06:52:21,854 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-30 06:52:23,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:52:26,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:52:26,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 06:52:26,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:52:28,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:52:28,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-30 06:52:28,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-30 06:52:29,525 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-30 06:52:29,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:52:29,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-30 06:52:29,839 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=630906.6666666666, ans=0.0 2023-09-30 06:52:31,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-30 06:52:34,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:52:35,912 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-30 06:52:38,108 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:52:40,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:52:40,982 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:52:44,014 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-30 06:52:44,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 06:52:44,138 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:52:45,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:52:45,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:52:45,690 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:52:47,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:52:51,172 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=630973.3333333334, ans=0.1 2023-09-30 06:52:52,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:52:52,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:52:53,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:52:57,581 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=630973.3333333334, ans=0.125 2023-09-30 06:53:00,677 INFO [train.py:1039] (2/4) Epoch 18, batch 4350, loss[loss=0.1861, simple_loss=0.2637, pruned_loss=0.0542, over 23324.00 frames. ], tot_loss[loss=0.1799, simple_loss=0.2546, pruned_loss=0.05257, over 4714294.16 frames. ], batch size: 93, lr: 5.68e-03, grad_scale: 16.0 2023-09-30 06:53:00,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-30 06:53:01,107 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=631040.0, ans=0.125 2023-09-30 06:53:02,290 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-30 06:53:05,604 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:53:10,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:53:10,216 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=631040.0, ans=0.125 2023-09-30 06:53:10,338 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=631040.0, ans=0.0 2023-09-30 06:53:12,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:53:12,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:53:16,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 06:53:20,112 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:53:21,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=631106.6666666666, ans=0.1 2023-09-30 06:53:23,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:53:23,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:53:26,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-30 06:53:28,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:53:29,000 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.06 vs. limit=15.0 2023-09-30 06:53:29,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-30 06:53:32,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=631173.3333333334, ans=0.0 2023-09-30 06:53:36,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-30 06:53:37,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:53:39,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:53:39,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=631173.3333333334, ans=0.125 2023-09-30 06:53:44,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:53:47,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-30 06:53:49,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:53:50,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 06:53:56,825 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-30 06:53:58,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:53:58,391 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:53:58,727 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=631240.0, ans=0.1 2023-09-30 06:53:59,905 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-30 06:54:01,847 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-30 06:54:01,858 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:54:03,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:54:04,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:54:04,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:54:06,362 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:54:08,429 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:54:10,606 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-30 06:54:10,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:54:10,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:54:12,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:54:12,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-30 06:54:13,639 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-30 06:54:13,645 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-30 06:54:13,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-30 06:54:18,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:54:18,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 06:54:18,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:54:19,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:54:21,161 INFO [train.py:1039] (2/4) Epoch 18, batch 4400, loss[loss=0.1533, simple_loss=0.226, pruned_loss=0.04026, over 24348.00 frames. ], tot_loss[loss=0.1798, simple_loss=0.2548, pruned_loss=0.05246, over 4724803.68 frames. ], batch size: 56, lr: 5.68e-03, grad_scale: 32.0 2023-09-30 06:54:21,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-30 06:54:22,828 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-30 06:54:22,839 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:54:26,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:54:26,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:54:29,495 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:54:32,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-30 06:54:32,550 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-30 06:54:32,627 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-30 06:54:33,980 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-30 06:54:34,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 06:54:34,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:54:37,651 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-30 06:54:39,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:54:40,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:54:40,869 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-30 06:54:46,330 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:54:46,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-30 06:54:47,799 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-30 06:54:49,101 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.515e+02 1.990e+02 2.241e+02 2.697e+02 4.171e+02, threshold=4.482e+02, percent-clipped=0.0 2023-09-30 06:54:50,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-30 06:54:50,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-30 06:54:52,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-30 06:54:52,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:54:52,550 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:54:53,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:54:55,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:54:57,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-30 06:54:57,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-30 06:54:58,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:55:00,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:55:00,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:55:02,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:55:03,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:55:03,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-30 06:55:05,276 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-30 06:55:07,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:55:07,132 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=631506.6666666666, ans=0.125 2023-09-30 06:55:09,406 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.26 vs. limit=22.5 2023-09-30 06:55:15,028 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:55:16,680 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-30 06:55:19,056 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:55:20,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:55:21,712 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.39 vs. limit=15.0 2023-09-30 06:55:24,698 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=631573.3333333334, ans=0.125 2023-09-30 06:55:25,816 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 06:55:25,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-30 06:55:25,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:55:27,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-30 06:55:27,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:55:28,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-30 06:55:33,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-30 06:55:36,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-30 06:55:38,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-30 06:55:38,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:55:38,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-30 06:55:40,001 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:55:41,749 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:55:43,107 INFO [train.py:1039] (2/4) Epoch 18, batch 4450, loss[loss=0.1644, simple_loss=0.2443, pruned_loss=0.04228, over 24464.00 frames. ], tot_loss[loss=0.1803, simple_loss=0.2553, pruned_loss=0.05268, over 4729988.42 frames. ], batch size: 66, lr: 5.68e-03, grad_scale: 16.0 2023-09-30 06:55:43,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-30 06:55:48,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:55:50,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:55:50,829 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:55:51,075 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=631706.6666666666, ans=0.0 2023-09-30 06:55:56,854 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:55:56,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:55:59,222 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=631773.3333333334, ans=0.125 2023-09-30 06:56:00,943 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.84 vs. limit=15.0 2023-09-30 06:56:01,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:56:02,560 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.06 vs. limit=12.0 2023-09-30 06:56:03,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:56:05,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:56:05,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:56:07,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-30 06:56:07,168 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:56:07,435 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=631773.3333333334, ans=0.2 2023-09-30 06:56:08,656 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:56:10,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:56:10,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-30 06:56:13,135 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 06:56:17,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:56:19,170 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:56:21,196 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:56:21,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:56:24,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:56:27,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 06:56:29,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-30 06:56:31,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-30 06:56:31,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:56:34,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:56:34,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-30 06:56:39,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-30 06:56:42,639 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:56:44,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-30 06:56:44,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:56:44,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:56:44,156 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:56:44,167 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:56:47,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:56:49,257 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten.whitening_limit, batch_count=631973.3333333334, ans=15.0 2023-09-30 06:56:50,214 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-30 06:56:51,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-30 06:56:53,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 06:56:57,032 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:56:57,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:57:00,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:57:00,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 06:57:02,462 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=631973.3333333334, ans=0.1 2023-09-30 06:57:03,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-30 06:57:04,350 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.38 vs. limit=10.0 2023-09-30 06:57:05,029 INFO [train.py:1039] (2/4) Epoch 18, batch 4500, loss[loss=0.1719, simple_loss=0.2586, pruned_loss=0.04261, over 24658.00 frames. ], tot_loss[loss=0.1802, simple_loss=0.2552, pruned_loss=0.05255, over 4723651.08 frames. ], batch size: 73, lr: 5.67e-03, grad_scale: 16.0 2023-09-30 06:57:05,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-30 06:57:08,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:57:14,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:57:15,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-30 06:57:15,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-30 06:57:17,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:57:20,476 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:57:22,055 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:57:22,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:57:23,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:57:24,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:57:25,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:57:33,296 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.852e+02 2.175e+02 2.491e+02 3.622e+02, threshold=4.350e+02, percent-clipped=0.0 2023-09-30 06:57:38,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:57:38,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:57:41,078 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=632173.3333333334, ans=0.0 2023-09-30 06:57:42,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:57:42,467 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:57:44,119 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 06:57:47,529 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=632173.3333333334, ans=0.125 2023-09-30 06:57:51,007 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 06:57:55,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:57:58,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 06:58:01,872 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:58:01,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-30 06:58:02,044 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:58:02,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:58:03,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:58:05,727 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:58:08,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:58:08,746 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-30 06:58:08,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 06:58:08,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:58:15,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:58:15,984 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:58:17,756 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:58:21,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-30 06:58:21,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:58:24,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-30 06:58:25,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-30 06:58:25,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-30 06:58:28,736 INFO [train.py:1039] (2/4) Epoch 18, batch 4550, loss[loss=0.1765, simple_loss=0.264, pruned_loss=0.04452, over 24651.00 frames. ], tot_loss[loss=0.1799, simple_loss=0.2551, pruned_loss=0.05235, over 4718844.63 frames. ], batch size: 73, lr: 5.67e-03, grad_scale: 16.0 2023-09-30 06:58:28,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-30 06:58:29,200 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=632373.3333333334, ans=0.125 2023-09-30 06:58:32,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-30 06:58:32,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:58:33,277 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.63 vs. limit=15.0 2023-09-30 06:58:36,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:58:36,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:58:40,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:58:45,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:58:47,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:58:47,428 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 06:58:48,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 06:58:50,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:58:50,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:58:52,432 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:58:54,373 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:58:56,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:58:57,699 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-30 06:58:59,263 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-30 06:58:59,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:59:00,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-30 06:59:03,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-30 06:59:04,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:59:06,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-30 06:59:07,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 06:59:07,987 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=632506.6666666666, ans=0.0 2023-09-30 06:59:12,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:59:12,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:59:12,297 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:59:15,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-30 06:59:18,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:59:21,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:59:21,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:59:23,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 06:59:25,764 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=632573.3333333334, ans=0.0 2023-09-30 06:59:26,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-30 06:59:28,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-30 06:59:28,461 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:59:28,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-30 06:59:33,058 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-30 06:59:33,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 06:59:33,283 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:59:34,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:59:34,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:59:36,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:59:37,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 06:59:38,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-30 06:59:39,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:59:39,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 06:59:41,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-30 06:59:41,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:59:41,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-30 06:59:43,613 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=632640.0, ans=0.0 2023-09-30 06:59:46,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:59:46,221 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:59:48,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:59:48,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:59:50,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-30 06:59:50,213 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:59:51,589 INFO [train.py:1039] (2/4) Epoch 18, batch 4600, loss[loss=0.1961, simple_loss=0.2656, pruned_loss=0.06336, over 23831.00 frames. ], tot_loss[loss=0.1791, simple_loss=0.2535, pruned_loss=0.05233, over 4696695.68 frames. ], batch size: 179, lr: 5.67e-03, grad_scale: 16.0 2023-09-30 06:59:53,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-30 06:59:54,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:59:55,642 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.61 vs. limit=15.0 2023-09-30 06:59:56,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:00:01,632 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:00:01,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:00:01,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:00:03,822 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-30 07:00:05,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-30 07:00:08,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:00:10,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:00:11,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:00:18,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-30 07:00:18,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:00:20,158 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.823e+02 2.071e+02 2.370e+02 3.584e+02, threshold=4.141e+02, percent-clipped=0.0 2023-09-30 07:00:21,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:00:26,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:00:26,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:00:31,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-30 07:00:31,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 07:00:33,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:00:40,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:00:40,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:00:42,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:00:46,777 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-30 07:00:48,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-30 07:00:53,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:00:54,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:00:56,757 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=632973.3333333334, ans=0.0 2023-09-30 07:00:58,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:00:58,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 07:00:59,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:00:59,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-30 07:00:59,672 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:00:59,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:01:01,991 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:01:02,103 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:01:02,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:01:03,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-30 07:01:03,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-30 07:01:03,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-30 07:01:03,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:01:03,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:01:05,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:01:05,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:01:05,943 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=632973.3333333334, ans=0.1 2023-09-30 07:01:15,437 INFO [train.py:1039] (2/4) Epoch 18, batch 4650, loss[loss=0.1907, simple_loss=0.2591, pruned_loss=0.06117, over 23785.00 frames. ], tot_loss[loss=0.1792, simple_loss=0.2536, pruned_loss=0.05237, over 4697539.57 frames. ], batch size: 164, lr: 5.67e-03, grad_scale: 16.0 2023-09-30 07:01:17,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:01:20,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:01:20,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:01:20,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:01:20,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:01:20,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:01:22,069 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:01:25,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-30 07:01:30,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:01:32,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-30 07:01:33,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:01:35,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-30 07:01:35,475 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:01:35,562 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-30 07:01:37,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-30 07:01:37,045 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:01:37,128 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 07:01:37,439 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=633106.6666666666, ans=0.1 2023-09-30 07:01:40,963 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 07:01:42,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:01:42,536 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-30 07:01:45,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:01:47,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-30 07:01:50,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:01:50,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:01:52,152 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-30 07:01:52,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:01:55,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:01:59,260 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:02:01,015 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=633173.3333333334, ans=0.04949747468305833 2023-09-30 07:02:03,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:02:06,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:02:06,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:02:06,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 07:02:08,034 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=633240.0, ans=0.0 2023-09-30 07:02:09,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-30 07:02:09,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-30 07:02:11,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 07:02:11,501 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-30 07:02:11,696 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=633240.0, ans=0.2 2023-09-30 07:02:14,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:02:14,629 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=633240.0, ans=0.125 2023-09-30 07:02:21,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:02:21,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:02:21,167 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-30 07:02:21,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:02:21,402 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=633306.6666666666, ans=0.125 2023-09-30 07:02:22,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:02:22,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 07:02:24,170 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:02:25,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:02:25,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:02:26,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:02:30,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:02:30,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:02:30,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 07:02:32,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-30 07:02:32,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-30 07:02:34,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-30 07:02:34,348 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=633306.6666666666, ans=0.0 2023-09-30 07:02:39,273 INFO [train.py:1039] (2/4) Epoch 18, batch 4700, loss[loss=0.1773, simple_loss=0.2416, pruned_loss=0.0565, over 23799.00 frames. ], tot_loss[loss=0.18, simple_loss=0.2542, pruned_loss=0.05292, over 4695397.04 frames. ], batch size: 164, lr: 5.67e-03, grad_scale: 16.0 2023-09-30 07:02:43,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:02:45,447 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:02:47,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:02:49,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:02:49,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 07:02:54,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-30 07:02:54,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-30 07:02:58,738 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:02:58,892 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:03:00,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:03:05,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:03:06,751 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.494e+02 1.845e+02 2.033e+02 2.292e+02 3.478e+02, threshold=4.067e+02, percent-clipped=0.0 2023-09-30 07:03:11,810 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=633506.6666666666, ans=0.0 2023-09-30 07:03:13,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:03:15,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 07:03:18,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:03:18,978 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.22 vs. limit=22.5 2023-09-30 07:03:24,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-30 07:03:24,557 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:03:24,819 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=633506.6666666666, ans=0.2 2023-09-30 07:03:26,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:03:26,871 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=633573.3333333334, ans=0.0 2023-09-30 07:03:31,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-30 07:03:33,447 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:03:39,472 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:03:39,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-30 07:03:41,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:03:41,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:03:43,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:03:44,893 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:03:44,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-30 07:03:45,222 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=633640.0, ans=0.0 2023-09-30 07:03:46,445 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-30 07:03:48,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:03:48,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:03:48,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:03:48,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-30 07:03:50,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:03:51,921 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_positive, batch_count=633640.0, ans=0.05 2023-09-30 07:03:54,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-30 07:03:57,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:03:59,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:04:01,482 INFO [train.py:1039] (2/4) Epoch 18, batch 4750, loss[loss=0.1831, simple_loss=0.2714, pruned_loss=0.04739, over 24443.00 frames. ], tot_loss[loss=0.1809, simple_loss=0.2553, pruned_loss=0.05325, over 4704898.39 frames. ], batch size: 69, lr: 5.67e-03, grad_scale: 16.0 2023-09-30 07:04:03,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:04:03,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:04:05,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-30 07:04:05,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:04:09,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-30 07:04:10,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:04:11,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:04:12,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:04:13,282 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=633706.6666666666, ans=0.0 2023-09-30 07:04:17,126 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.22 vs. limit=10.0 2023-09-30 07:04:20,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-30 07:04:24,630 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:04:27,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-30 07:04:27,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:04:30,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:04:30,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:04:30,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:04:33,040 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-30 07:04:33,045 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-30 07:04:34,837 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=633840.0, ans=0.2 2023-09-30 07:04:39,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-30 07:04:41,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:04:44,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:04:46,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 07:04:46,067 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-30 07:04:46,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:04:46,437 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=633840.0, ans=0.0 2023-09-30 07:04:49,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:04:51,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:04:54,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-30 07:04:54,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-30 07:04:54,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:04:55,866 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:04:55,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:04:58,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 07:04:58,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-30 07:04:59,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-30 07:05:04,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:05:09,359 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:05:09,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-30 07:05:10,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:05:12,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:05:14,067 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:05:14,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:05:16,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:05:20,737 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:05:20,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-30 07:05:22,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-30 07:05:22,586 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=634040.0, ans=0.125 2023-09-30 07:05:23,718 INFO [train.py:1039] (2/4) Epoch 18, batch 4800, loss[loss=0.1639, simple_loss=0.2416, pruned_loss=0.04307, over 24331.00 frames. ], tot_loss[loss=0.1812, simple_loss=0.2562, pruned_loss=0.05314, over 4718788.24 frames. ], batch size: 61, lr: 5.67e-03, grad_scale: 32.0 2023-09-30 07:05:23,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-30 07:05:25,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-30 07:05:25,550 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:05:27,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-30 07:05:33,735 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:05:33,824 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:05:37,905 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:05:40,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:05:40,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:05:42,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-30 07:05:42,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:05:42,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:05:44,806 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=634106.6666666666, ans=0.09899494936611666 2023-09-30 07:05:46,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-30 07:05:50,544 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:05:50,885 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=634106.6666666666, ans=0.125 2023-09-30 07:05:51,855 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.888e+02 2.165e+02 2.522e+02 3.456e+02, threshold=4.330e+02, percent-clipped=0.0 2023-09-30 07:05:54,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:05:54,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:05:55,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:05:55,830 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 07:05:55,851 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:05:55,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:05:57,659 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=634173.3333333334, ans=0.0 2023-09-30 07:05:58,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:06:01,308 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.04 vs. limit=15.0 2023-09-30 07:06:02,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:06:03,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:06:03,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-30 07:06:05,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 07:06:07,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:06:10,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-30 07:06:10,020 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-30 07:06:12,231 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:06:12,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:06:12,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:06:12,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:06:12,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-30 07:06:15,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:06:15,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:06:20,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:06:22,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:06:24,134 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:06:29,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-30 07:06:29,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:06:29,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:06:30,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 07:06:30,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:06:34,022 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=634306.6666666666, ans=0.04949747468305833 2023-09-30 07:06:35,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:06:36,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:06:36,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:06:36,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:06:38,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 07:06:38,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 07:06:43,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:06:43,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:06:43,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:06:46,850 INFO [train.py:1039] (2/4) Epoch 18, batch 4850, loss[loss=0.1894, simple_loss=0.2774, pruned_loss=0.05071, over 24666.00 frames. ], tot_loss[loss=0.1817, simple_loss=0.2567, pruned_loss=0.05332, over 4727710.04 frames. ], batch size: 73, lr: 5.66e-03, grad_scale: 16.0 2023-09-30 07:06:47,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-30 07:06:48,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-30 07:06:48,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:06:48,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:06:50,134 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:06:50,135 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:06:51,959 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:07:01,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-30 07:07:01,776 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:07:04,330 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=634440.0, ans=0.0 2023-09-30 07:07:07,066 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:07:07,947 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.42 vs. limit=15.0 2023-09-30 07:07:08,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 07:07:08,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:07:11,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:07:13,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:07:16,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:07:16,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-30 07:07:18,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:07:21,936 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-30 07:07:21,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 07:07:23,494 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 07:07:23,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-30 07:07:25,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:07:25,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:07:25,324 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=634506.6666666666, ans=0.1 2023-09-30 07:07:30,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:07:30,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-30 07:07:31,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-30 07:07:33,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 07:07:40,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:07:40,623 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=634573.3333333334, ans=0.125 2023-09-30 07:07:41,667 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-30 07:07:41,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:07:43,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 07:07:43,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-30 07:07:44,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-30 07:07:44,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:07:47,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-30 07:07:47,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:07:50,531 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:07:51,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-30 07:07:55,378 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=634640.0, ans=0.125 2023-09-30 07:08:01,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:08:08,288 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:08:08,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:08:09,738 INFO [train.py:1039] (2/4) Epoch 18, batch 4900, loss[loss=0.1634, simple_loss=0.2527, pruned_loss=0.03701, over 24461.00 frames. ], tot_loss[loss=0.1804, simple_loss=0.2553, pruned_loss=0.05274, over 4711099.17 frames. ], batch size: 69, lr: 5.66e-03, grad_scale: 16.0 2023-09-30 07:08:13,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-30 07:08:13,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:08:18,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:08:19,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:08:21,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:08:23,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-30 07:08:23,779 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=634706.6666666666, ans=0.1 2023-09-30 07:08:28,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-30 07:08:34,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-30 07:08:34,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-30 07:08:35,882 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-30 07:08:35,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:08:35,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:08:35,990 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:08:36,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:08:37,612 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-30 07:08:39,892 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.486e+02 1.805e+02 1.986e+02 2.156e+02 3.448e+02, threshold=3.971e+02, percent-clipped=0.0 2023-09-30 07:08:40,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-30 07:08:41,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 07:08:43,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:08:43,287 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=634840.0, ans=0.125 2023-09-30 07:08:44,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-30 07:08:47,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:08:49,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:08:51,301 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:08:51,316 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-30 07:08:51,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=634840.0, ans=0.0 2023-09-30 07:08:52,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:08:55,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:08:55,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-30 07:08:55,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-30 07:08:58,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-30 07:09:00,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-30 07:09:01,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:09:01,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:09:03,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:09:03,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 07:09:03,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:09:04,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-30 07:09:07,777 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:09:09,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-30 07:09:10,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:09:14,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-30 07:09:16,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:09:17,621 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-30 07:09:17,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-30 07:09:24,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:09:25,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 07:09:27,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-30 07:09:27,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 07:09:27,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 07:09:29,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:09:32,646 INFO [train.py:1039] (2/4) Epoch 18, batch 4950, loss[loss=0.1837, simple_loss=0.2667, pruned_loss=0.05034, over 24461.00 frames. ], tot_loss[loss=0.1802, simple_loss=0.2551, pruned_loss=0.05263, over 4718780.05 frames. ], batch size: 63, lr: 5.66e-03, grad_scale: 16.0 2023-09-30 07:09:33,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:09:33,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:09:33,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:09:33,424 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-30 07:09:33,706 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=635040.0, ans=0.125 2023-09-30 07:09:34,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 07:09:35,205 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=635040.0, ans=0.2 2023-09-30 07:09:37,945 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:09:37,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 07:09:39,735 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=635040.0, ans=0.0 2023-09-30 07:09:41,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-30 07:09:41,211 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-30 07:09:41,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-30 07:09:42,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-30 07:09:42,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:09:42,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:09:44,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:09:44,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:09:48,012 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:09:48,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:09:49,698 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:09:51,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:09:52,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:09:52,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:09:54,905 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=635106.6666666666, ans=0.125 2023-09-30 07:09:56,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 07:09:58,484 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=635106.6666666666, ans=0.1 2023-09-30 07:10:00,687 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=635106.6666666666, ans=0.125 2023-09-30 07:10:03,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:10:03,630 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=635106.6666666666, ans=0.125 2023-09-30 07:10:05,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 07:10:07,122 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:10:08,602 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:10:10,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:10:11,890 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-30 07:10:13,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-30 07:10:14,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:10:17,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:10:17,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-30 07:10:19,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:10:19,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:10:20,997 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-30 07:10:21,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:10:22,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-30 07:10:25,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:10:26,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:10:28,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:10:28,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-30 07:10:28,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:10:29,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 07:10:35,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:10:36,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:10:36,608 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:10:38,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:10:38,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 07:10:38,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:10:40,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:10:41,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:10:41,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:10:43,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-30 07:10:46,947 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:10:51,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-30 07:10:51,711 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-30 07:10:54,012 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=635306.6666666666, ans=0.125 2023-09-30 07:10:56,580 INFO [train.py:1039] (2/4) Epoch 18, batch 5000, loss[loss=0.1794, simple_loss=0.2508, pruned_loss=0.05407, over 23476.00 frames. ], tot_loss[loss=0.18, simple_loss=0.2545, pruned_loss=0.05273, over 4708081.83 frames. ], batch size: 134, lr: 5.66e-03, grad_scale: 16.0 2023-09-30 07:10:58,420 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:10:58,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:11:01,286 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-30 07:11:01,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-30 07:11:02,342 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.46 vs. limit=15.0 2023-09-30 07:11:03,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:11:05,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-30 07:11:07,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:11:07,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 07:11:08,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-30 07:11:08,703 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:11:08,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=635373.3333333334, ans=0.95 2023-09-30 07:11:10,154 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:11:11,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-30 07:11:11,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:11:11,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:11:13,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-30 07:11:15,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-30 07:11:16,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:11:16,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-30 07:11:16,817 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 07:11:18,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:11:18,278 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 07:11:18,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-30 07:11:18,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-30 07:11:19,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-30 07:11:21,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:11:22,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:11:22,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-30 07:11:22,912 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:11:23,295 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=635440.0, ans=0.05 2023-09-30 07:11:24,518 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:11:26,589 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.857e+02 2.111e+02 2.507e+02 3.855e+02, threshold=4.222e+02, percent-clipped=0.0 2023-09-30 07:11:26,738 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:11:28,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-30 07:11:28,588 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=635506.6666666666, ans=0.125 2023-09-30 07:11:28,625 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 07:11:29,870 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-30 07:11:31,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:11:32,230 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.78 vs. limit=12.0 2023-09-30 07:11:32,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:11:35,990 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-30 07:11:39,727 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:11:40,115 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 07:11:41,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:11:41,797 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:11:43,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-30 07:11:44,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:11:44,983 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:11:45,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:11:47,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-30 07:11:47,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:11:47,381 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=635573.3333333334, ans=0.125 2023-09-30 07:11:50,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:11:51,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:11:56,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-30 07:12:01,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:12:12,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:12:13,000 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:12:15,121 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:12:15,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:12:15,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 07:12:15,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:12:16,769 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:12:20,240 INFO [train.py:1039] (2/4) Epoch 18, batch 5050, loss[loss=0.1654, simple_loss=0.2383, pruned_loss=0.04623, over 23611.00 frames. ], tot_loss[loss=0.1804, simple_loss=0.2551, pruned_loss=0.05283, over 4712383.53 frames. ], batch size: 149, lr: 5.66e-03, grad_scale: 16.0 2023-09-30 07:12:20,778 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=635706.6666666666, ans=0.2 2023-09-30 07:12:21,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:12:23,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-30 07:12:24,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:12:26,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:12:27,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-30 07:12:28,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-30 07:12:29,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:12:29,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:12:32,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 07:12:35,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 07:12:35,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-30 07:12:35,993 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=635773.3333333334, ans=0.125 2023-09-30 07:12:39,794 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.61 vs. limit=15.0 2023-09-30 07:12:45,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-30 07:12:45,607 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-30 07:12:47,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:12:47,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-30 07:12:49,004 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 07:12:49,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:12:49,313 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=635773.3333333334, ans=0.125 2023-09-30 07:12:50,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:12:52,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:12:52,078 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-30 07:12:52,231 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-30 07:12:54,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:12:58,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:12:59,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:12:59,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-30 07:13:01,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:13:04,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-30 07:13:04,525 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=635840.0, ans=0.125 2023-09-30 07:13:05,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:13:05,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:13:07,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:13:08,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:13:12,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:13:13,991 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=635906.6666666666, ans=0.0 2023-09-30 07:13:15,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:13:16,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:13:16,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:13:16,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:13:16,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-30 07:13:17,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:13:18,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 07:13:23,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:13:23,179 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-30 07:13:23,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-30 07:13:26,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:13:28,137 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:13:28,186 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-30 07:13:30,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:13:30,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-30 07:13:30,624 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:13:35,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:13:35,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:13:35,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-30 07:13:35,941 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=635973.3333333334, ans=0.125 2023-09-30 07:13:37,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-30 07:13:40,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:13:40,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:13:40,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:13:41,725 INFO [train.py:1039] (2/4) Epoch 18, batch 5100, loss[loss=0.2026, simple_loss=0.2651, pruned_loss=0.07, over 23425.00 frames. ], tot_loss[loss=0.1802, simple_loss=0.2552, pruned_loss=0.05258, over 4725950.20 frames. ], batch size: 285, lr: 5.66e-03, grad_scale: 16.0 2023-09-30 07:13:43,427 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-30 07:13:44,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:13:50,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-30 07:13:50,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-30 07:13:51,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:13:53,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:13:54,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:13:56,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-30 07:13:56,264 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-30 07:14:01,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:14:01,354 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:14:06,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:14:08,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-30 07:14:10,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:14:10,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:14:10,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-30 07:14:11,816 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.519e+02 1.826e+02 2.022e+02 2.261e+02 3.082e+02, threshold=4.044e+02, percent-clipped=0.0 2023-09-30 07:14:14,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:14:16,252 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:14:16,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-30 07:14:17,842 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-30 07:14:19,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:14:19,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-30 07:14:19,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-30 07:14:21,021 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=636173.3333333334, ans=0.125 2023-09-30 07:14:21,168 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=636173.3333333334, ans=0.1 2023-09-30 07:14:24,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:14:32,478 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:14:36,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-30 07:14:36,130 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-30 07:14:36,145 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-30 07:14:37,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-30 07:14:37,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:14:39,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-30 07:14:44,657 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-30 07:14:47,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 07:14:49,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-30 07:14:52,211 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-30 07:14:53,784 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-30 07:14:53,833 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-30 07:15:00,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:15:00,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:15:00,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:15:02,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:15:02,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 07:15:02,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:15:03,674 INFO [train.py:1039] (2/4) Epoch 18, batch 5150, loss[loss=0.175, simple_loss=0.2438, pruned_loss=0.05312, over 23802.00 frames. ], tot_loss[loss=0.1809, simple_loss=0.2558, pruned_loss=0.05302, over 4724880.53 frames. ], batch size: 195, lr: 5.66e-03, grad_scale: 8.0 2023-09-30 07:15:03,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-30 07:15:03,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-30 07:15:05,354 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-30 07:15:05,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:15:06,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-30 07:15:10,143 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:15:10,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 07:15:12,521 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:15:14,135 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:15:17,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 07:15:17,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-30 07:15:21,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:15:21,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:15:22,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-30 07:15:22,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:15:22,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:15:22,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:15:22,801 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 07:15:24,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-30 07:15:25,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 07:15:27,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 07:15:30,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 07:15:32,662 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-30 07:15:34,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:15:39,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-30 07:15:40,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-30 07:15:47,126 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=5.30 vs. limit=12.0 2023-09-30 07:15:47,821 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:15:51,001 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=636506.6666666666, ans=0.1 2023-09-30 07:15:54,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:15:56,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:15:59,380 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=636573.3333333334, ans=0.2 2023-09-30 07:16:00,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:16:00,718 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:16:03,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-30 07:16:06,871 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:16:08,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:16:08,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 07:16:08,697 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=636640.0, ans=0.125 2023-09-30 07:16:12,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:16:13,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:16:15,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-30 07:16:17,120 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=636640.0, ans=0.1 2023-09-30 07:16:18,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:16:21,154 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 07:16:21,306 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=636640.0, ans=0.1 2023-09-30 07:16:23,993 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:16:24,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:16:24,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-30 07:16:24,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-30 07:16:24,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:16:24,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:16:27,166 INFO [train.py:1039] (2/4) Epoch 18, batch 5200, loss[loss=0.2304, simple_loss=0.2932, pruned_loss=0.08382, over 19509.00 frames. ], tot_loss[loss=0.182, simple_loss=0.2567, pruned_loss=0.05367, over 4697943.98 frames. ], batch size: 388, lr: 5.65e-03, grad_scale: 16.0 2023-09-30 07:16:28,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:16:30,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:16:35,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:16:39,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-30 07:16:41,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:16:42,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:16:44,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:16:46,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:16:46,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:16:48,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-30 07:16:51,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 07:16:52,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:16:55,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-30 07:16:57,195 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=636773.3333333334, ans=0.125 2023-09-30 07:16:58,251 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.839e+02 2.029e+02 2.263e+02 2.861e+02, threshold=4.059e+02, percent-clipped=0.0 2023-09-30 07:16:58,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-30 07:16:58,752 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=636840.0, ans=0.09899494936611666 2023-09-30 07:16:59,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-30 07:17:00,014 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-30 07:17:01,448 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-30 07:17:03,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-30 07:17:03,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:17:03,183 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-30 07:17:03,193 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:17:04,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:17:06,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:17:06,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-30 07:17:07,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:17:09,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:17:14,410 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-30 07:17:14,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-30 07:17:14,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-30 07:17:14,979 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=636906.6666666666, ans=0.0 2023-09-30 07:17:19,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-30 07:17:19,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 07:17:25,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:17:27,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:17:27,464 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=636906.6666666666, ans=0.0 2023-09-30 07:17:28,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-30 07:17:28,673 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:17:30,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-30 07:17:30,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:17:30,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 07:17:34,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:17:36,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:17:39,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:17:39,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:17:39,622 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:17:44,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:17:46,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-30 07:17:47,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:17:47,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:17:49,441 INFO [train.py:1039] (2/4) Epoch 18, batch 5250, loss[loss=0.1883, simple_loss=0.2705, pruned_loss=0.053, over 24385.00 frames. ], tot_loss[loss=0.1813, simple_loss=0.2558, pruned_loss=0.05339, over 4710537.00 frames. ], batch size: 77, lr: 5.65e-03, grad_scale: 16.0 2023-09-30 07:17:49,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:17:49,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-30 07:17:49,853 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=637040.0, ans=0.0 2023-09-30 07:17:51,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:17:54,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:17:56,701 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=637040.0, ans=0.1 2023-09-30 07:17:57,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:17:59,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:18:01,779 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:18:02,167 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=637040.0, ans=0.125 2023-09-30 07:18:06,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:18:08,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 07:18:09,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:18:11,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 07:18:13,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-30 07:18:13,083 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:18:14,561 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:18:14,767 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=637106.6666666666, ans=0.125 2023-09-30 07:18:39,351 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.48 vs. limit=12.0 2023-09-30 07:18:53,741 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=637306.6666666666, ans=0.1 2023-09-30 07:19:04,838 INFO [train.py:1039] (2/4) Epoch 18, batch 5300, loss[loss=0.1839, simple_loss=0.2453, pruned_loss=0.0613, over 23735.00 frames. ], tot_loss[loss=0.1802, simple_loss=0.2545, pruned_loss=0.05294, over 4710006.71 frames. ], batch size: 212, lr: 5.65e-03, grad_scale: 16.0 2023-09-30 07:19:13,400 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=637373.3333333334, ans=0.1 2023-09-30 07:19:20,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:19:20,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-30 07:19:20,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-30 07:19:20,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:19:21,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:19:21,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:19:21,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:19:21,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:19:21,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:19:21,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:19:21,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-30 07:19:22,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:19:22,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-30 07:19:22,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-30 07:19:22,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-30 07:19:22,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-30 07:19:22,826 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-30 07:19:22,954 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-30 07:19:23,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:19:24,015 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:19:24,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:19:24,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:19:24,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:19:24,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:19:24,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:19:24,969 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:19:25,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:19:25,153 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:19:25,161 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:19:25,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:19:25,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:19:26,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-30 07:19:26,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:19:26,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:19:26,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-30 07:19:26,765 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-30 07:19:26,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:19:27,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:19:27,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-30 07:19:27,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-30 07:19:27,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-30 07:19:28,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:19:28,656 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:19:28,805 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-30 07:19:28,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-30 07:19:28,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-30 07:19:29,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:19:29,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-30 07:19:29,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-30 07:19:29,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-30 07:19:29,630 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-30 07:19:38,623 INFO [train.py:1039] (2/4) Epoch 19, batch 0, loss[loss=0.1668, simple_loss=0.2508, pruned_loss=0.04142, over 24644.00 frames. ], tot_loss[loss=0.1668, simple_loss=0.2508, pruned_loss=0.04142, over 24644.00 frames. ], batch size: 68, lr: 5.50e-03, grad_scale: 32.0 2023-09-30 07:19:38,623 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-30 07:19:52,801 INFO [train.py:1071] (2/4) Epoch 19, validation: loss=0.3241, simple_loss=0.2677, pruned_loss=0.1902, over 1125622.00 frames. 2023-09-30 07:19:52,801 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21076MB 2023-09-30 07:19:55,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-30 07:19:55,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:19:58,932 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 07:20:01,906 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.506e+02 1.881e+02 2.156e+02 2.381e+02 5.566e+02, threshold=4.312e+02, percent-clipped=3.0 2023-09-30 07:20:06,352 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:20:06,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:20:06,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:20:07,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-30 07:20:09,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-30 07:20:12,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:20:12,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:20:17,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:20:19,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:20:19,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:20:19,168 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:20:20,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-30 07:20:22,247 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:20:31,127 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 07:20:31,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:20:34,899 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-30 07:20:37,437 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.73 vs. limit=22.5 2023-09-30 07:20:39,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:20:39,431 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:20:41,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:20:45,712 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:20:46,390 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.31 vs. limit=15.0 2023-09-30 07:20:47,572 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=637660.0, ans=0.1 2023-09-30 07:20:48,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:20:52,916 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=637660.0, ans=0.05 2023-09-30 07:20:55,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-30 07:20:59,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-30 07:20:59,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:20:59,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:21:01,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:21:02,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:21:04,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-30 07:21:05,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:21:07,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:21:12,365 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:21:13,743 INFO [train.py:1039] (2/4) Epoch 19, batch 50, loss[loss=0.178, simple_loss=0.2479, pruned_loss=0.054, over 23919.00 frames. ], tot_loss[loss=0.1812, simple_loss=0.2574, pruned_loss=0.05244, over 1072257.09 frames. ], batch size: 195, lr: 5.50e-03, grad_scale: 16.0 2023-09-30 07:21:15,539 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-30 07:21:17,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:21:17,450 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.45 vs. limit=22.5 2023-09-30 07:21:21,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:21:23,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:21:23,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-30 07:21:23,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 07:21:24,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:21:26,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:21:28,411 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:21:30,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:21:35,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-30 07:21:35,211 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:21:38,621 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=637860.0, ans=0.2 2023-09-30 07:21:42,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-30 07:21:45,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-30 07:21:47,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-30 07:21:48,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:21:48,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:21:48,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:21:50,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:21:51,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-30 07:21:51,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 07:21:51,818 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:22:01,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:22:01,837 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:22:01,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 07:22:03,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-30 07:22:04,990 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:22:06,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 07:22:06,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-30 07:22:08,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:22:10,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-30 07:22:16,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:22:16,770 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:22:16,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:22:19,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:22:19,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-30 07:22:22,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-30 07:22:22,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-30 07:22:23,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:22:25,040 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-30 07:22:26,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:22:28,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:22:29,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-30 07:22:29,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-30 07:22:30,915 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-30 07:22:32,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:22:33,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:22:34,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-30 07:22:34,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-30 07:22:36,020 INFO [train.py:1039] (2/4) Epoch 19, batch 100, loss[loss=0.1812, simple_loss=0.2626, pruned_loss=0.04988, over 24347.00 frames. ], tot_loss[loss=0.1816, simple_loss=0.2569, pruned_loss=0.05314, over 1882884.19 frames. ], batch size: 77, lr: 5.49e-03, grad_scale: 16.0 2023-09-30 07:22:36,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:22:37,579 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:22:39,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-30 07:22:39,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:22:42,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:22:45,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:22:47,047 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.406e+02 1.850e+02 1.971e+02 2.245e+02 4.662e+02, threshold=3.942e+02, percent-clipped=2.0 2023-09-30 07:22:50,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:22:50,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-30 07:22:50,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:22:55,601 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=638193.3333333334, ans=0.125 2023-09-30 07:22:56,867 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:22:56,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:22:58,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:22:58,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:22:58,917 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:22:59,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-30 07:23:02,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-30 07:23:02,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:23:02,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:23:02,153 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:23:06,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-30 07:23:06,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:23:08,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:23:09,203 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.83 vs. limit=22.5 2023-09-30 07:23:09,793 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-30 07:23:11,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 07:23:12,139 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=638260.0, ans=0.125 2023-09-30 07:23:15,053 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-30 07:23:15,080 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-30 07:23:17,437 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:23:17,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 07:23:21,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-30 07:23:23,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:23:26,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:23:30,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:23:31,715 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-30 07:23:33,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-30 07:23:34,275 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=638326.6666666666, ans=0.125 2023-09-30 07:23:35,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:23:37,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:23:40,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:23:42,246 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=638393.3333333334, ans=0.1 2023-09-30 07:23:43,994 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:23:47,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:23:47,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:23:50,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:23:51,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:23:53,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:23:53,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:23:55,161 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:23:55,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-30 07:23:55,299 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-30 07:23:55,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:23:56,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 07:23:56,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:23:56,852 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:23:58,210 INFO [train.py:1039] (2/4) Epoch 19, batch 150, loss[loss=0.1856, simple_loss=0.2545, pruned_loss=0.05833, over 23602.00 frames. ], tot_loss[loss=0.1798, simple_loss=0.2554, pruned_loss=0.05207, over 2527826.53 frames. ], batch size: 256, lr: 5.49e-03, grad_scale: 16.0 2023-09-30 07:23:58,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 07:23:58,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 07:23:58,388 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-30 07:23:58,398 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:23:59,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:24:01,806 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:24:01,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:24:03,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:24:06,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:24:11,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:24:11,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:24:12,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:24:14,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:24:15,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:24:17,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:24:17,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:24:22,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-30 07:24:22,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-30 07:24:22,764 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-30 07:24:25,771 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:24:25,779 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 07:24:27,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:24:29,408 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:24:29,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:24:29,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:24:29,580 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:24:32,350 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-30 07:24:33,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:24:35,874 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=638593.3333333334, ans=0.0 2023-09-30 07:24:38,336 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.11 vs. limit=15.0 2023-09-30 07:24:40,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:24:45,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:24:45,908 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-30 07:24:47,661 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=638660.0, ans=0.1 2023-09-30 07:24:50,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:24:50,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:24:50,906 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:24:52,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 07:24:54,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:24:54,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:24:56,380 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer_ff2.min_abs, batch_count=638660.0, ans=0.1 2023-09-30 07:24:57,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:24:57,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-30 07:24:59,729 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=638660.0, ans=0.125 2023-09-30 07:25:01,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:25:03,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:25:03,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:25:03,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:25:06,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:25:09,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 07:25:11,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:25:13,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 07:25:13,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=638726.6666666666, ans=0.0 2023-09-30 07:25:13,341 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=638726.6666666666, ans=0.0 2023-09-30 07:25:14,471 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:25:16,112 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-30 07:25:16,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-30 07:25:16,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:25:16,208 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-30 07:25:18,099 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=638726.6666666666, ans=0.0 2023-09-30 07:25:19,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:25:21,397 INFO [train.py:1039] (2/4) Epoch 19, batch 200, loss[loss=0.1487, simple_loss=0.2259, pruned_loss=0.03575, over 20843.00 frames. ], tot_loss[loss=0.1817, simple_loss=0.257, pruned_loss=0.05318, over 3018485.88 frames. ], batch size: 45, lr: 5.49e-03, grad_scale: 16.0 2023-09-30 07:25:24,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:25:24,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:25:25,509 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.22 vs. limit=15.0 2023-09-30 07:25:28,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-30 07:25:28,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:25:28,505 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=638793.3333333334, ans=0.125 2023-09-30 07:25:29,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:25:31,594 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=638793.3333333334, ans=0.0 2023-09-30 07:25:32,721 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.428e+02 1.866e+02 2.060e+02 2.341e+02 3.608e+02, threshold=4.119e+02, percent-clipped=0.0 2023-09-30 07:25:32,988 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-30 07:25:34,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-30 07:25:36,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:25:36,992 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 07:25:38,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:25:40,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:25:40,030 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:25:40,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:25:58,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:26:00,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:26:00,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:26:00,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:26:02,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 07:26:02,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 07:26:03,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:26:03,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 07:26:05,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:26:05,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:26:07,065 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=638926.6666666666, ans=0.0 2023-09-30 07:26:08,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-30 07:26:08,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 07:26:08,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:26:13,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:26:15,084 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=638993.3333333334, ans=0.025 2023-09-30 07:26:18,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:26:21,499 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.19 vs. limit=10.0 2023-09-30 07:26:22,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=638993.3333333334, ans=0.07 2023-09-30 07:26:25,362 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:26:26,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:26:28,666 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=639060.0, ans=0.07 2023-09-30 07:26:33,585 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:26:35,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-30 07:26:36,620 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:26:36,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:26:36,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:26:39,491 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 07:26:41,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-30 07:26:42,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:26:42,588 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-30 07:26:43,974 INFO [train.py:1039] (2/4) Epoch 19, batch 250, loss[loss=0.1602, simple_loss=0.2419, pruned_loss=0.03928, over 24481.00 frames. ], tot_loss[loss=0.1805, simple_loss=0.2562, pruned_loss=0.05237, over 3402095.85 frames. ], batch size: 66, lr: 5.49e-03, grad_scale: 16.0 2023-09-30 07:26:45,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:26:45,882 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=639126.6666666666, ans=0.0 2023-09-30 07:26:47,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:26:50,665 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:26:50,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:26:54,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:26:54,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:26:56,065 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=639126.6666666666, ans=0.0 2023-09-30 07:26:57,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:27:00,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:27:01,100 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=639193.3333333334, ans=0.125 2023-09-30 07:27:12,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:27:13,894 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:27:15,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:27:18,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-30 07:27:20,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-30 07:27:21,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:27:22,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:27:23,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 07:27:25,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:27:27,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:27:30,102 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:27:33,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-30 07:27:33,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:27:35,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:27:35,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-30 07:27:35,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:27:36,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 07:27:37,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 07:27:37,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:27:40,040 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:27:41,605 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:27:43,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:27:46,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:27:46,997 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=639326.6666666666, ans=0.1 2023-09-30 07:27:51,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:27:52,852 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=639393.3333333334, ans=0.2 2023-09-30 07:27:54,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:27:58,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:27:58,445 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=639393.3333333334, ans=0.125 2023-09-30 07:27:59,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:27:59,988 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=639393.3333333334, ans=0.0 2023-09-30 07:28:03,593 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-30 07:28:05,143 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:28:05,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 07:28:05,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-30 07:28:07,423 INFO [train.py:1039] (2/4) Epoch 19, batch 300, loss[loss=0.1993, simple_loss=0.2748, pruned_loss=0.06187, over 23713.00 frames. ], tot_loss[loss=0.1793, simple_loss=0.2546, pruned_loss=0.05201, over 3684814.64 frames. ], batch size: 85, lr: 5.49e-03, grad_scale: 16.0 2023-09-30 07:28:07,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-30 07:28:09,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:28:09,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-30 07:28:09,405 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=639460.0, ans=0.125 2023-09-30 07:28:12,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:28:13,668 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:28:16,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:28:18,700 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.819e+02 2.024e+02 2.204e+02 2.893e+02, threshold=4.048e+02, percent-clipped=0.0 2023-09-30 07:28:18,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-30 07:28:20,381 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:28:20,535 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=639460.0, ans=0.2 2023-09-30 07:28:21,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 07:28:21,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-30 07:28:21,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:28:23,819 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_positive, batch_count=639526.6666666666, ans=0.05 2023-09-30 07:28:26,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-30 07:28:32,365 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 07:28:32,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-30 07:28:37,589 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-30 07:28:37,665 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:28:41,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:28:42,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:28:42,912 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-30 07:28:42,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 07:28:44,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:28:46,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:28:47,723 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:28:52,401 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-30 07:28:52,408 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-30 07:28:52,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:28:56,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:28:56,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-30 07:28:57,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:29:02,411 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:29:05,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=639660.0, ans=0.1 2023-09-30 07:29:06,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:29:06,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-30 07:29:11,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:29:11,065 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 07:29:14,057 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:29:17,506 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:29:17,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-30 07:29:17,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 07:29:18,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:29:19,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-30 07:29:22,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:29:22,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:29:22,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=639726.6666666666, ans=0.0 2023-09-30 07:29:23,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:29:23,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:29:24,040 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=639726.6666666666, ans=0.125 2023-09-30 07:29:25,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:29:28,613 INFO [train.py:1039] (2/4) Epoch 19, batch 350, loss[loss=0.1809, simple_loss=0.2688, pruned_loss=0.04651, over 24438.00 frames. ], tot_loss[loss=0.1786, simple_loss=0.2539, pruned_loss=0.05164, over 3910609.35 frames. ], batch size: 69, lr: 5.49e-03, grad_scale: 16.0 2023-09-30 07:29:30,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:29:30,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 07:29:32,144 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=639793.3333333334, ans=0.125 2023-09-30 07:29:33,362 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:29:34,341 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.71 vs. limit=15.0 2023-09-30 07:29:40,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:29:42,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:29:43,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:29:46,904 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-30 07:29:47,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:29:48,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-30 07:29:50,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:29:52,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-30 07:29:53,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:29:55,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-30 07:29:58,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:29:59,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:30:00,356 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=13.50 vs. limit=15.0 2023-09-30 07:30:01,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:30:01,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:30:01,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:30:03,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:30:03,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:30:03,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:30:06,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:30:06,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:30:14,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:30:14,922 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-30 07:30:15,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:30:17,036 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:30:24,222 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer_ff3.min_abs, batch_count=639993.3333333334, ans=0.2 2023-09-30 07:30:26,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-30 07:30:26,706 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:30:30,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:30:30,578 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:30:30,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:30:33,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-30 07:30:35,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:30:36,566 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-30 07:30:36,729 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-30 07:30:36,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:30:37,113 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 07:30:40,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:30:40,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-30 07:30:40,625 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=5.09 vs. limit=15.0 2023-09-30 07:30:42,471 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.16 vs. limit=22.5 2023-09-30 07:30:43,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:30:48,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:30:48,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:30:50,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:30:50,626 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:30:51,017 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=640060.0, ans=0.0 2023-09-30 07:30:51,066 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=640060.0, ans=0.0 2023-09-30 07:30:52,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:30:55,829 INFO [train.py:1039] (2/4) Epoch 19, batch 400, loss[loss=0.1768, simple_loss=0.2521, pruned_loss=0.05075, over 23471.00 frames. ], tot_loss[loss=0.1775, simple_loss=0.2528, pruned_loss=0.05111, over 4094039.67 frames. ], batch size: 93, lr: 5.49e-03, grad_scale: 32.0 2023-09-30 07:30:56,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:30:59,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:30:59,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-30 07:30:59,336 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:31:00,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:31:03,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:31:03,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:31:06,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:31:06,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:31:07,646 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.862e+02 2.041e+02 2.218e+02 3.370e+02, threshold=4.083e+02, percent-clipped=0.0 2023-09-30 07:31:07,929 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-30 07:31:10,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-30 07:31:10,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:31:11,195 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=640193.3333333334, ans=0.125 2023-09-30 07:31:13,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-30 07:31:13,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:31:17,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:31:17,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:31:17,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-30 07:31:17,999 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=640193.3333333334, ans=0.125 2023-09-30 07:31:19,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:31:19,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:31:19,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:31:20,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:31:21,044 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=640193.3333333334, ans=0.0 2023-09-30 07:31:22,291 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-30 07:31:25,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-30 07:31:30,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:31:31,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:31:32,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-30 07:31:32,809 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-30 07:31:35,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:31:37,615 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:31:46,111 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-30 07:31:47,764 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-30 07:31:49,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-30 07:31:51,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:31:53,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:31:54,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-30 07:32:00,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:32:00,636 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=640326.6666666666, ans=0.1 2023-09-30 07:32:02,089 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=640393.3333333334, ans=0.0 2023-09-30 07:32:03,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 07:32:05,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:32:06,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:32:08,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-30 07:32:11,127 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-30 07:32:11,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-30 07:32:12,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 07:32:12,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:32:15,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-30 07:32:19,082 INFO [train.py:1039] (2/4) Epoch 19, batch 450, loss[loss=0.1827, simple_loss=0.2694, pruned_loss=0.04798, over 24359.00 frames. ], tot_loss[loss=0.1783, simple_loss=0.2536, pruned_loss=0.05148, over 4232555.28 frames. ], batch size: 77, lr: 5.48e-03, grad_scale: 32.0 2023-09-30 07:32:19,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 07:32:19,575 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=640460.0, ans=0.0 2023-09-30 07:32:20,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:32:20,682 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-30 07:32:22,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-30 07:32:22,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:32:23,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:32:25,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:32:25,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-30 07:32:25,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:32:26,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 07:32:27,172 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=640460.0, ans=0.125 2023-09-30 07:32:30,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:32:39,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:32:39,220 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:32:40,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-30 07:32:42,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-30 07:32:46,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-30 07:32:49,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:32:51,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:32:55,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:32:55,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:32:55,897 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.78 vs. limit=6.0 2023-09-30 07:32:58,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-30 07:32:58,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-30 07:33:01,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-30 07:33:03,431 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:33:03,630 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=640593.3333333334, ans=0.125 2023-09-30 07:33:04,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:33:06,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:33:06,557 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-30 07:33:06,573 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-30 07:33:08,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:33:10,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:33:11,736 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-30 07:33:13,628 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=640660.0, ans=0.0 2023-09-30 07:33:14,722 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-30 07:33:14,764 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:33:14,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-30 07:33:15,211 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=640660.0, ans=0.0 2023-09-30 07:33:16,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-30 07:33:17,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:33:20,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-30 07:33:21,018 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 07:33:22,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-30 07:33:25,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:33:25,915 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=640726.6666666666, ans=0.1 2023-09-30 07:33:27,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-30 07:33:29,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-30 07:33:29,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:33:34,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:33:36,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:33:37,808 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:33:39,193 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-30 07:33:40,633 INFO [train.py:1039] (2/4) Epoch 19, batch 500, loss[loss=0.1865, simple_loss=0.2573, pruned_loss=0.05781, over 23508.00 frames. ], tot_loss[loss=0.1799, simple_loss=0.2551, pruned_loss=0.05231, over 4339816.06 frames. ], batch size: 285, lr: 5.48e-03, grad_scale: 32.0 2023-09-30 07:33:44,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:33:44,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:33:46,693 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:33:46,708 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-30 07:33:48,318 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=640793.3333333334, ans=0.05 2023-09-30 07:33:49,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-30 07:33:49,530 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:33:52,501 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.803e+02 2.032e+02 2.368e+02 3.527e+02, threshold=4.065e+02, percent-clipped=0.0 2023-09-30 07:33:52,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 07:33:57,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 07:33:58,817 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-30 07:33:59,202 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=640860.0, ans=0.0 2023-09-30 07:34:00,407 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:34:00,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:34:00,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:34:07,369 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=640860.0, ans=0.125 2023-09-30 07:34:12,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:34:12,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-30 07:34:14,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-30 07:34:14,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:34:14,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-30 07:34:15,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 07:34:17,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:34:19,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:34:20,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:34:20,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:34:22,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-30 07:34:25,900 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-30 07:34:30,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:34:30,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:34:31,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:34:31,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:34:32,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-30 07:34:35,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-30 07:34:38,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:34:39,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:34:44,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:34:46,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:34:53,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:34:53,819 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=641060.0, ans=0.0 2023-09-30 07:34:55,888 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.40 vs. limit=15.0 2023-09-30 07:34:57,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-30 07:34:57,233 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:34:57,252 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:35:00,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-30 07:35:00,449 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-30 07:35:03,222 INFO [train.py:1039] (2/4) Epoch 19, batch 550, loss[loss=0.1625, simple_loss=0.2505, pruned_loss=0.03729, over 24506.00 frames. ], tot_loss[loss=0.1807, simple_loss=0.2559, pruned_loss=0.05278, over 4431321.15 frames. ], batch size: 66, lr: 5.48e-03, grad_scale: 32.0 2023-09-30 07:35:03,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:35:05,182 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=641126.6666666666, ans=0.125 2023-09-30 07:35:07,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-30 07:35:08,206 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 07:35:09,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-30 07:35:09,561 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:35:09,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-30 07:35:11,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:35:11,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:35:11,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:35:11,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=641126.6666666666, ans=0.2 2023-09-30 07:35:12,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:35:12,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:35:14,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:35:17,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:35:18,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-30 07:35:18,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:35:18,424 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=641193.3333333334, ans=0.125 2023-09-30 07:35:24,321 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:35:24,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:35:26,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:35:26,931 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=641193.3333333334, ans=0.125 2023-09-30 07:35:28,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:35:29,426 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.52 vs. limit=15.0 2023-09-30 07:35:33,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-30 07:35:33,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-30 07:35:36,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:35:36,570 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=641260.0, ans=0.125 2023-09-30 07:35:38,470 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.19 vs. limit=12.0 2023-09-30 07:35:42,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:35:42,473 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:35:43,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-30 07:35:47,131 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:35:47,140 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-30 07:35:48,625 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:35:50,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 07:35:52,310 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:35:53,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 07:35:53,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-30 07:35:55,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:35:56,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-30 07:35:57,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-30 07:35:59,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:35:59,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:35:59,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:35:59,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:36:05,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:36:06,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:36:08,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:36:09,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:36:09,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 07:36:12,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 07:36:12,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:36:14,320 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:36:14,413 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:36:15,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-30 07:36:15,882 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-30 07:36:19,464 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=641393.3333333334, ans=0.07 2023-09-30 07:36:19,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=641393.3333333334, ans=0.0 2023-09-30 07:36:22,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-30 07:36:24,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-30 07:36:25,846 INFO [train.py:1039] (2/4) Epoch 19, batch 600, loss[loss=0.2454, simple_loss=0.3092, pruned_loss=0.09082, over 19705.00 frames. ], tot_loss[loss=0.1819, simple_loss=0.2569, pruned_loss=0.05345, over 4475173.72 frames. ], batch size: 388, lr: 5.48e-03, grad_scale: 16.0 2023-09-30 07:36:26,079 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:36:27,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 07:36:27,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:36:36,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:36:37,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 07:36:39,267 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.814e+02 2.073e+02 2.344e+02 3.797e+02, threshold=4.146e+02, percent-clipped=0.0 2023-09-30 07:36:39,404 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-30 07:36:41,024 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-30 07:36:42,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:36:45,859 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:36:47,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-30 07:36:47,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:36:52,308 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=641526.6666666666, ans=0.125 2023-09-30 07:36:55,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-30 07:36:58,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:36:58,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:36:58,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:36:59,200 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=641593.3333333334, ans=0.05 2023-09-30 07:37:04,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:37:04,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:37:05,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:37:08,347 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 07:37:11,814 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:37:17,836 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:37:17,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:37:17,856 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:37:18,936 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.24 vs. limit=15.0 2023-09-30 07:37:21,938 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.12 vs. limit=10.0 2023-09-30 07:37:25,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-30 07:37:30,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-30 07:37:31,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:37:33,007 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=641726.6666666666, ans=0.0 2023-09-30 07:37:35,131 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=641726.6666666666, ans=0.1 2023-09-30 07:37:36,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-30 07:37:36,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:37:40,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-30 07:37:40,051 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:37:40,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:37:46,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 07:37:48,243 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-30 07:37:49,748 INFO [train.py:1039] (2/4) Epoch 19, batch 650, loss[loss=0.1821, simple_loss=0.2451, pruned_loss=0.05956, over 23831.00 frames. ], tot_loss[loss=0.1809, simple_loss=0.2557, pruned_loss=0.05302, over 4524617.36 frames. ], batch size: 195, lr: 5.48e-03, grad_scale: 16.0 2023-09-30 07:37:49,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:37:51,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:37:53,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:37:56,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-30 07:37:57,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:37:57,858 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=641793.3333333334, ans=0.125 2023-09-30 07:38:01,673 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.21 vs. limit=22.5 2023-09-30 07:38:03,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:38:03,873 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:38:08,917 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:38:09,903 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.91 vs. limit=15.0 2023-09-30 07:38:14,745 WARNING [train.py:1197] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-30 07:38:16,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:38:16,358 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:38:20,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:38:20,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 07:38:23,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:38:23,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:38:24,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 07:38:24,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:38:26,371 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:38:27,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:38:27,981 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-30 07:38:27,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:38:29,494 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:38:31,292 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=641926.6666666666, ans=0.125 2023-09-30 07:38:32,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:38:32,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:38:34,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:38:34,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-30 07:38:35,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-30 07:38:37,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:38:37,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:38:37,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-30 07:38:37,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:38:39,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 07:38:41,750 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-30 07:38:43,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-30 07:38:43,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:38:43,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:38:44,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:38:44,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:38:46,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:38:48,804 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=641993.3333333334, ans=0.0 2023-09-30 07:38:51,791 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:38:53,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:38:54,683 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:38:59,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:38:59,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 07:39:00,613 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:39:07,214 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=642060.0, ans=0.0 2023-09-30 07:39:08,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 07:39:08,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:39:08,459 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:39:08,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:39:11,448 INFO [train.py:1039] (2/4) Epoch 19, batch 700, loss[loss=0.1778, simple_loss=0.2456, pruned_loss=0.05497, over 23699.00 frames. ], tot_loss[loss=0.1791, simple_loss=0.2535, pruned_loss=0.05232, over 4564289.13 frames. ], batch size: 232, lr: 5.48e-03, grad_scale: 16.0 2023-09-30 07:39:14,608 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-30 07:39:16,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-30 07:39:19,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-30 07:39:19,680 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.17 vs. limit=22.5 2023-09-30 07:39:20,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:39:22,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:39:22,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-30 07:39:25,588 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.802e+02 1.961e+02 2.175e+02 2.904e+02, threshold=3.922e+02, percent-clipped=0.0 2023-09-30 07:39:27,556 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:39:29,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:39:30,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:39:31,721 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.48 vs. limit=15.0 2023-09-30 07:39:32,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-30 07:39:33,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:39:36,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:39:38,519 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=642193.3333333334, ans=0.0 2023-09-30 07:39:39,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 07:39:39,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:39:41,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-30 07:39:44,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-30 07:39:47,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-30 07:39:48,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:39:50,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-30 07:39:50,400 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=642260.0, ans=0.125 2023-09-30 07:39:55,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:39:57,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-30 07:40:03,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:40:03,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:40:04,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-30 07:40:09,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:40:10,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:40:14,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:40:16,146 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=642393.3333333334, ans=0.04949747468305833 2023-09-30 07:40:17,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:40:18,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-30 07:40:21,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-30 07:40:23,318 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-30 07:40:23,581 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=642393.3333333334, ans=0.1 2023-09-30 07:40:24,178 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.69 vs. limit=6.0 2023-09-30 07:40:27,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:40:29,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:40:29,793 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:40:33,606 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:40:33,615 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-30 07:40:34,988 INFO [train.py:1039] (2/4) Epoch 19, batch 750, loss[loss=0.1631, simple_loss=0.2468, pruned_loss=0.03966, over 24537.00 frames. ], tot_loss[loss=0.1782, simple_loss=0.2533, pruned_loss=0.0515, over 4604346.21 frames. ], batch size: 63, lr: 5.48e-03, grad_scale: 16.0 2023-09-30 07:40:38,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-30 07:40:38,196 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-30 07:40:39,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-30 07:40:41,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-30 07:40:41,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-30 07:40:41,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:40:41,397 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=642460.0, ans=0.125 2023-09-30 07:40:42,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-30 07:40:44,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:40:44,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-30 07:40:45,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:40:47,510 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:40:48,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-30 07:40:48,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:40:50,586 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:40:50,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 07:40:53,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:40:55,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:40:55,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:40:56,264 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.76 vs. limit=15.0 2023-09-30 07:40:56,781 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-30 07:40:58,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:40:58,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:41:00,424 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:41:03,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-30 07:41:04,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-30 07:41:04,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:41:07,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-30 07:41:07,767 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-30 07:41:09,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-30 07:41:09,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:41:09,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 07:41:11,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:41:19,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-30 07:41:19,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:41:19,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 07:41:20,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:41:23,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:41:23,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-30 07:41:23,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:41:25,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-30 07:41:27,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:41:30,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:41:30,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-30 07:41:30,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:41:34,595 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=642660.0, ans=0.125 2023-09-30 07:41:37,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:41:39,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 07:41:41,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:41:43,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:41:46,378 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=642726.6666666666, ans=0.0 2023-09-30 07:41:47,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-30 07:41:47,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:41:49,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:41:53,652 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:41:53,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:41:55,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:41:56,754 INFO [train.py:1039] (2/4) Epoch 19, batch 800, loss[loss=0.1834, simple_loss=0.2619, pruned_loss=0.05247, over 24670.00 frames. ], tot_loss[loss=0.1782, simple_loss=0.2536, pruned_loss=0.05142, over 4642002.85 frames. ], batch size: 65, lr: 5.47e-03, grad_scale: 32.0 2023-09-30 07:41:56,854 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-30 07:42:03,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:42:03,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:42:04,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:42:04,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:42:06,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:42:07,010 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:42:09,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:42:10,442 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.608e+02 1.847e+02 2.108e+02 2.482e+02 4.355e+02, threshold=4.217e+02, percent-clipped=1.0 2023-09-30 07:42:14,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:42:14,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 07:42:19,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-30 07:42:19,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:42:20,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:42:20,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:42:22,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:42:22,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-30 07:42:22,627 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:42:24,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-30 07:42:27,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:42:30,192 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:42:33,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:42:33,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:42:34,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:42:34,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:42:40,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:42:40,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:42:42,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-30 07:42:42,948 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-30 07:42:42,982 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-30 07:42:44,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 07:42:44,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:42:46,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:42:46,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:42:52,279 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-30 07:42:52,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-30 07:42:55,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:42:56,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 07:42:57,489 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=9.76 vs. limit=22.5 2023-09-30 07:43:01,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:43:04,428 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:43:04,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-30 07:43:05,991 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:43:07,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-30 07:43:14,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 07:43:14,837 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.59 vs. limit=15.0 2023-09-30 07:43:17,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:43:19,040 INFO [train.py:1039] (2/4) Epoch 19, batch 850, loss[loss=0.2035, simple_loss=0.2689, pruned_loss=0.0691, over 23622.00 frames. ], tot_loss[loss=0.1786, simple_loss=0.2538, pruned_loss=0.0517, over 4667153.84 frames. ], batch size: 285, lr: 5.47e-03, grad_scale: 16.0 2023-09-30 07:43:19,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-30 07:43:19,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:43:19,304 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:43:21,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-30 07:43:21,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:43:24,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:43:26,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:43:26,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 07:43:28,150 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:43:29,721 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-30 07:43:29,797 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-30 07:43:29,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-30 07:43:32,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 07:43:32,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:43:34,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:43:34,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:43:35,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:43:37,742 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=643193.3333333334, ans=0.0 2023-09-30 07:43:39,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:43:40,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:43:40,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-30 07:43:43,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-30 07:43:44,488 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.56 vs. limit=10.0 2023-09-30 07:43:47,405 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:43:48,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-30 07:43:53,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-30 07:43:55,478 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-30 07:43:57,798 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-30 07:43:57,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:43:57,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:43:59,278 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 07:44:01,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:44:02,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:44:02,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-30 07:44:05,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:44:05,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:44:07,350 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:44:08,784 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-30 07:44:10,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:44:11,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-30 07:44:13,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-30 07:44:16,635 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=643326.6666666666, ans=0.2 2023-09-30 07:44:17,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:44:17,814 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:44:19,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:44:19,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:44:19,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:44:23,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:44:24,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:44:26,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-30 07:44:27,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:44:28,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-30 07:44:32,849 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=643393.3333333334, ans=0.0 2023-09-30 07:44:36,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-30 07:44:38,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:44:38,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-30 07:44:38,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:44:39,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:44:41,346 INFO [train.py:1039] (2/4) Epoch 19, batch 900, loss[loss=0.1494, simple_loss=0.2286, pruned_loss=0.03511, over 24296.00 frames. ], tot_loss[loss=0.1791, simple_loss=0.2544, pruned_loss=0.05193, over 4686032.26 frames. ], batch size: 61, lr: 5.47e-03, grad_scale: 16.0 2023-09-30 07:44:42,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-30 07:44:49,010 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:44:50,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:44:52,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-30 07:44:53,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:44:55,223 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.525e+02 1.916e+02 2.182e+02 2.478e+02 5.058e+02, threshold=4.365e+02, percent-clipped=1.0 2023-09-30 07:44:55,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-30 07:44:55,484 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-30 07:44:57,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:44:57,015 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:44:59,016 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 07:44:59,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:45:01,681 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=8.97 vs. limit=22.5 2023-09-30 07:45:03,070 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=643526.6666666666, ans=0.125 2023-09-30 07:45:03,200 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=643526.6666666666, ans=0.1 2023-09-30 07:45:08,291 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=643526.6666666666, ans=0.125 2023-09-30 07:45:11,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:45:11,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:45:11,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 07:45:11,398 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=643526.6666666666, ans=0.0 2023-09-30 07:45:14,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:45:18,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-30 07:45:20,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:45:24,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-30 07:45:26,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-30 07:45:26,593 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-30 07:45:26,706 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-30 07:45:26,962 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=643593.3333333334, ans=0.2 2023-09-30 07:45:26,968 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=643593.3333333334, ans=0.125 2023-09-30 07:45:28,598 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=643660.0, ans=0.2 2023-09-30 07:45:28,685 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=643660.0, ans=0.0 2023-09-30 07:45:31,678 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 07:45:33,503 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-30 07:45:33,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:45:33,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 07:45:41,299 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:45:41,315 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:45:44,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-30 07:45:44,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:45:48,823 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-30 07:45:50,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:45:50,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:45:52,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:45:52,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:45:56,672 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-30 07:45:56,756 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-30 07:45:58,383 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-30 07:45:59,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-30 07:46:01,336 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:46:02,754 INFO [train.py:1039] (2/4) Epoch 19, batch 950, loss[loss=0.1734, simple_loss=0.2501, pruned_loss=0.04829, over 16731.00 frames. ], tot_loss[loss=0.1796, simple_loss=0.2548, pruned_loss=0.05225, over 4678024.65 frames. ], batch size: 36, lr: 5.47e-03, grad_scale: 8.0 2023-09-30 07:46:04,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-30 07:46:10,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:46:13,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:46:13,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:46:15,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 07:46:17,574 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-30 07:46:22,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:46:23,635 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:46:23,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:46:25,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:46:25,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-30 07:46:26,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-30 07:46:28,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:46:28,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-30 07:46:30,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:46:33,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:46:33,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:46:34,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:46:34,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-30 07:46:35,144 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=643926.6666666666, ans=0.125 2023-09-30 07:46:37,826 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 07:46:39,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:46:43,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:46:50,217 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:46:50,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:46:53,362 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-30 07:46:57,321 WARNING [train.py:1197] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 07:46:57,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 07:46:58,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:46:58,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:46:58,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:47:02,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-30 07:47:03,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:47:03,921 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=643993.3333333334, ans=0.125 2023-09-30 07:47:05,155 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:47:06,603 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:47:06,643 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-30 07:47:06,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:47:06,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 07:47:08,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-30 07:47:12,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:47:15,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:47:18,146 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=644060.0, ans=0.0 2023-09-30 07:47:21,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:47:23,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-30 07:47:23,739 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-30 07:47:26,778 INFO [train.py:1039] (2/4) Epoch 19, batch 1000, loss[loss=0.1875, simple_loss=0.2753, pruned_loss=0.04984, over 24385.00 frames. ], tot_loss[loss=0.1795, simple_loss=0.2543, pruned_loss=0.05233, over 4687508.46 frames. ], batch size: 77, lr: 5.47e-03, grad_scale: 8.0 2023-09-30 07:47:26,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:47:29,022 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=12.81 vs. limit=15.0 2023-09-30 07:47:30,069 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-30 07:47:32,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:47:36,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:47:37,088 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=644126.6666666666, ans=0.125 2023-09-30 07:47:38,364 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-30 07:47:38,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-30 07:47:42,705 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.497e+02 2.138e+02 2.514e+02 3.202e+02 5.752e+02, threshold=5.028e+02, percent-clipped=6.0 2023-09-30 07:47:43,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:47:44,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:47:45,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:47:47,771 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-30 07:47:48,200 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=644193.3333333334, ans=0.125 2023-09-30 07:47:51,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-30 07:47:53,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-30 07:47:53,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:47:56,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-30 07:47:56,917 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-30 07:47:56,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-30 07:47:57,205 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=644193.3333333334, ans=0.1 2023-09-30 07:47:58,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:48:00,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:48:09,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:48:09,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:48:11,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:48:11,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:48:11,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-30 07:48:11,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:48:13,016 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:48:13,368 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=644260.0, ans=0.125 2023-09-30 07:48:14,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:48:14,625 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-30 07:48:18,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-30 07:48:19,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-30 07:48:20,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-30 07:48:22,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:48:29,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:48:29,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:48:29,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:48:32,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:48:34,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-30 07:48:36,202 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:48:36,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-30 07:48:37,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-30 07:48:40,604 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:48:40,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:48:43,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:48:45,425 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=644393.3333333334, ans=0.125 2023-09-30 07:48:46,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 07:48:47,090 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=644460.0, ans=0.125 2023-09-30 07:48:48,068 INFO [train.py:1039] (2/4) Epoch 19, batch 1050, loss[loss=0.1797, simple_loss=0.2654, pruned_loss=0.04702, over 24538.00 frames. ], tot_loss[loss=0.1775, simple_loss=0.2523, pruned_loss=0.0513, over 4688655.42 frames. ], batch size: 71, lr: 5.47e-03, grad_scale: 8.0 2023-09-30 07:48:48,266 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:48:50,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:48:51,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:48:53,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 07:48:54,861 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:48:56,552 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=644460.0, ans=0.125 2023-09-30 07:48:58,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:49:00,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 07:49:01,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-30 07:49:05,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:49:06,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:49:06,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-30 07:49:08,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:49:08,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-30 07:49:09,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:49:09,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-30 07:49:15,025 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:49:15,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-30 07:49:15,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-30 07:49:21,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:49:21,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-30 07:49:21,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:49:24,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-30 07:49:24,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-30 07:49:26,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:49:27,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-30 07:49:32,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-30 07:49:33,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:49:34,249 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=644593.3333333334, ans=0.2 2023-09-30 07:49:36,129 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.82 vs. limit=12.0 2023-09-30 07:49:38,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 07:49:40,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-30 07:49:40,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:49:41,891 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:49:43,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=644660.0, ans=0.0 2023-09-30 07:49:45,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:49:48,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-30 07:49:50,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-30 07:49:50,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-30 07:49:51,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:49:51,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 07:49:53,338 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-30 07:49:56,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:49:58,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:49:58,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:49:58,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:49:59,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:50:05,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:50:06,884 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-30 07:50:07,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:50:07,113 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-30 07:50:08,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-30 07:50:08,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:50:08,857 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=644726.6666666666, ans=10.0 2023-09-30 07:50:11,436 INFO [train.py:1039] (2/4) Epoch 19, batch 1100, loss[loss=0.1815, simple_loss=0.2634, pruned_loss=0.04982, over 24565.00 frames. ], tot_loss[loss=0.1771, simple_loss=0.2518, pruned_loss=0.05124, over 4691157.54 frames. ], batch size: 71, lr: 5.47e-03, grad_scale: 8.0 2023-09-30 07:50:11,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:50:11,844 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=644793.3333333334, ans=0.125 2023-09-30 07:50:17,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:50:23,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:50:25,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:50:25,087 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:50:26,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-30 07:50:26,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:50:27,062 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=644860.0, ans=0.125 2023-09-30 07:50:28,192 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.585e+02 1.773e+02 2.054e+02 2.605e+02 4.840e+02, threshold=4.108e+02, percent-clipped=0.0 2023-09-30 07:50:29,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:50:31,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:50:34,052 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.97 vs. limit=15.0 2023-09-30 07:50:34,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 07:50:34,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-30 07:50:36,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 07:50:39,597 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:50:39,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:50:41,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:50:44,704 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:50:48,253 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=644926.6666666666, ans=0.125 2023-09-30 07:50:49,920 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:50:51,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-30 07:50:53,154 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-30 07:50:53,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:50:56,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:50:58,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-30 07:50:59,897 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:51:00,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-30 07:51:01,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:51:01,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:51:01,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:51:01,666 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:51:01,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-30 07:51:01,864 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=644993.3333333334, ans=0.0 2023-09-30 07:51:08,169 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:51:08,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-30 07:51:11,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:51:18,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 07:51:21,416 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-30 07:51:21,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-30 07:51:22,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:51:26,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:51:26,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:51:28,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-30 07:51:28,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:51:28,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:51:30,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-30 07:51:30,134 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:51:30,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-30 07:51:31,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:51:33,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:51:34,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-30 07:51:35,356 INFO [train.py:1039] (2/4) Epoch 19, batch 1150, loss[loss=0.2086, simple_loss=0.2745, pruned_loss=0.07133, over 19360.00 frames. ], tot_loss[loss=0.1777, simple_loss=0.2527, pruned_loss=0.05134, over 4696743.77 frames. ], batch size: 389, lr: 5.47e-03, grad_scale: 8.0 2023-09-30 07:51:38,840 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 07:51:40,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:51:43,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:51:44,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:51:44,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:51:46,767 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-30 07:51:46,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:51:49,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-30 07:51:52,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:51:52,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 07:51:55,895 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.88 vs. limit=15.0 2023-09-30 07:51:56,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-30 07:51:58,567 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:52:03,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:52:05,228 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:52:06,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-30 07:52:06,732 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-30 07:52:06,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:52:11,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-30 07:52:13,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:52:13,741 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=645260.0, ans=0.1 2023-09-30 07:52:14,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:52:24,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:52:33,393 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:52:33,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-30 07:52:34,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:52:34,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:52:41,968 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-30 07:52:43,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:52:50,484 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-30 07:52:55,077 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:52:58,258 INFO [train.py:1039] (2/4) Epoch 19, batch 1200, loss[loss=0.165, simple_loss=0.2443, pruned_loss=0.04282, over 24305.00 frames. ], tot_loss[loss=0.1783, simple_loss=0.2532, pruned_loss=0.05166, over 4695178.21 frames. ], batch size: 61, lr: 5.46e-03, grad_scale: 16.0 2023-09-30 07:52:58,324 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-30 07:52:58,371 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-30 07:52:58,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:52:58,746 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=645460.0, ans=0.125 2023-09-30 07:53:01,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:53:07,132 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=645460.0, ans=0.1 2023-09-30 07:53:08,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:53:08,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-30 07:53:09,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:53:09,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:53:09,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:53:11,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:53:13,131 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 07:53:14,393 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 1.937e+02 2.117e+02 2.458e+02 3.944e+02, threshold=4.235e+02, percent-clipped=0.0 2023-09-30 07:53:14,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:53:14,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:53:18,263 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-30 07:53:21,678 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-30 07:53:23,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:53:26,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:53:29,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:53:29,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:53:30,820 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-30 07:53:30,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:53:39,139 INFO [scaling.py:1022] (2/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.92 vs. limit=8.0 2023-09-30 07:53:41,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-30 07:53:41,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:53:41,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-30 07:53:41,254 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:53:41,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=645593.3333333334, ans=0.2 2023-09-30 07:53:44,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-30 07:53:51,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-30 07:53:51,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:53:52,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:53:54,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:53:56,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-30 07:53:57,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:53:57,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:53:59,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:53:59,509 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-30 07:54:01,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 07:54:01,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:54:01,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 07:54:01,336 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=645660.0, ans=0.125 2023-09-30 07:54:02,720 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:54:02,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:54:06,267 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=645726.6666666666, ans=0.2 2023-09-30 07:54:07,421 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-30 07:54:09,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:54:09,734 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=645726.6666666666, ans=0.0 2023-09-30 07:54:13,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-30 07:54:15,751 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.22 vs. limit=12.0 2023-09-30 07:54:17,617 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-30 07:54:19,264 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:54:20,539 INFO [train.py:1039] (2/4) Epoch 19, batch 1250, loss[loss=0.2468, simple_loss=0.3028, pruned_loss=0.09538, over 19270.00 frames. ], tot_loss[loss=0.179, simple_loss=0.2542, pruned_loss=0.05189, over 4694843.36 frames. ], batch size: 388, lr: 5.46e-03, grad_scale: 16.0 2023-09-30 07:54:22,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:54:24,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:54:24,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:54:27,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-30 07:54:31,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:54:32,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:54:32,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-30 07:54:35,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:54:37,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 07:54:39,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 07:54:40,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:54:42,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 07:54:43,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:54:45,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-30 07:54:49,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 07:54:49,827 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-30 07:54:49,835 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:54:51,375 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:54:52,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:54:56,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:54:56,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-30 07:55:03,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-30 07:55:03,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-30 07:55:05,239 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=645926.6666666666, ans=0.125 2023-09-30 07:55:06,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:55:06,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-30 07:55:08,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:55:08,287 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-30 07:55:08,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:55:08,340 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:55:13,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:55:17,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:55:18,296 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=645993.3333333334, ans=0.125 2023-09-30 07:55:19,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:55:19,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-30 07:55:19,578 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-30 07:55:21,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-30 07:55:24,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:55:26,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-30 07:55:26,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:55:31,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-30 07:55:31,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:55:32,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-30 07:55:32,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-30 07:55:32,902 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:55:32,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-30 07:55:34,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:55:35,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-30 07:55:39,739 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:55:39,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:55:40,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:55:44,137 INFO [train.py:1039] (2/4) Epoch 19, batch 1300, loss[loss=0.1988, simple_loss=0.2668, pruned_loss=0.06538, over 23311.00 frames. ], tot_loss[loss=0.1799, simple_loss=0.2552, pruned_loss=0.05232, over 4704479.50 frames. ], batch size: 119, lr: 5.46e-03, grad_scale: 16.0 2023-09-30 07:55:44,315 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-30 07:55:47,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:55:48,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-30 07:55:51,213 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:55:54,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-30 07:55:54,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:55:56,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:55:59,128 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:56:00,444 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.480e+02 1.929e+02 2.084e+02 2.401e+02 3.525e+02, threshold=4.167e+02, percent-clipped=0.0 2023-09-30 07:56:00,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-30 07:56:05,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:56:07,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-30 07:56:08,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-30 07:56:12,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 07:56:15,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:56:15,997 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:56:17,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:56:19,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:56:21,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 07:56:21,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-30 07:56:22,061 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.72 vs. limit=15.0 2023-09-30 07:56:22,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-30 07:56:28,994 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=646260.0, ans=0.1 2023-09-30 07:56:30,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:56:30,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 07:56:32,391 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-30 07:56:32,489 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 07:56:34,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:56:34,453 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=646326.6666666666, ans=0.125 2023-09-30 07:56:35,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:56:35,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-30 07:56:37,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:56:37,295 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-30 07:56:38,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:56:44,141 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:56:44,145 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:56:44,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=646326.6666666666, ans=0.2 2023-09-30 07:56:49,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-30 07:56:49,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-30 07:56:51,082 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-30 07:56:55,655 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:56:57,922 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-30 07:56:59,564 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:57:06,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-30 07:57:07,848 INFO [train.py:1039] (2/4) Epoch 19, batch 1350, loss[loss=0.1745, simple_loss=0.2195, pruned_loss=0.06475, over 19439.00 frames. ], tot_loss[loss=0.1792, simple_loss=0.2542, pruned_loss=0.05207, over 4702750.87 frames. ], batch size: 388, lr: 5.46e-03, grad_scale: 16.0 2023-09-30 07:57:10,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:57:13,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:57:14,967 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:57:16,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:57:17,947 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=646460.0, ans=0.1 2023-09-30 07:57:19,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:57:19,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:57:25,317 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.36 vs. limit=22.5 2023-09-30 07:57:26,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:57:27,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-30 07:57:27,960 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=646526.6666666666, ans=0.2 2023-09-30 07:57:29,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-30 07:57:29,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:57:31,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-30 07:57:32,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:57:33,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:57:33,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-30 07:57:34,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-30 07:57:36,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-30 07:57:38,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:57:38,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-30 07:57:42,024 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=646593.3333333334, ans=0.0 2023-09-30 07:57:42,030 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=646593.3333333334, ans=0.125 2023-09-30 07:57:54,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:58:05,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:58:05,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:58:05,137 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-30 07:58:09,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:58:09,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-30 07:58:09,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-30 07:58:11,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:58:14,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:58:16,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-30 07:58:17,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:58:24,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-30 07:58:26,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-30 07:58:30,256 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=646793.3333333334, ans=0.1 2023-09-30 07:58:31,427 INFO [train.py:1039] (2/4) Epoch 19, batch 1400, loss[loss=0.1843, simple_loss=0.2583, pruned_loss=0.0551, over 23642.00 frames. ], tot_loss[loss=0.1783, simple_loss=0.2535, pruned_loss=0.05156, over 4709979.02 frames. ], batch size: 149, lr: 5.46e-03, grad_scale: 8.0 2023-09-30 07:58:33,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-30 07:58:34,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:58:36,316 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:58:38,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:58:44,461 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-30 07:58:46,630 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-30 07:58:49,379 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.898e+02 2.160e+02 2.671e+02 3.929e+02, threshold=4.321e+02, percent-clipped=0.0 2023-09-30 07:58:55,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:58:57,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:59:00,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:59:00,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-30 07:59:01,262 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=646860.0, ans=0.0 2023-09-30 07:59:04,685 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:59:06,219 WARNING [train.py:1197] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 07:59:16,335 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:59:16,430 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:59:23,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-30 07:59:24,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:59:24,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:59:26,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:59:26,253 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:59:27,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:59:27,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:59:29,291 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:59:30,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-30 07:59:32,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:59:35,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:59:39,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-30 07:59:47,996 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-30 07:59:49,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 07:59:49,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:59:50,246 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.96 vs. limit=15.0 2023-09-30 07:59:51,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 07:59:52,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:59:52,934 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:59:54,268 INFO [train.py:1039] (2/4) Epoch 19, batch 1450, loss[loss=0.1539, simple_loss=0.2352, pruned_loss=0.0363, over 24435.00 frames. ], tot_loss[loss=0.1777, simple_loss=0.2532, pruned_loss=0.05111, over 4720070.41 frames. ], batch size: 63, lr: 5.46e-03, grad_scale: 8.0 2023-09-30 07:59:56,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-30 07:59:59,427 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:59:59,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:59:59,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-30 08:00:04,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:00:04,935 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 08:00:06,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:00:07,840 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-30 08:00:07,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 08:00:09,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-30 08:00:11,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:00:11,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:00:11,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-30 08:00:14,155 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:00:14,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-30 08:00:14,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 08:00:14,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:00:15,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:00:18,053 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:00:21,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:00:24,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:00:24,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:00:25,927 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=647260.0, ans=0.025 2023-09-30 08:00:27,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:00:27,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:00:30,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:00:30,742 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:00:30,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:00:32,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:00:35,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-30 08:00:35,758 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=647260.0, ans=0.125 2023-09-30 08:00:37,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:00:39,267 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=647260.0, ans=0.015 2023-09-30 08:00:40,639 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-30 08:00:42,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:00:42,476 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=647326.6666666666, ans=0.0 2023-09-30 08:00:43,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:00:44,062 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=647326.6666666666, ans=0.0 2023-09-30 08:00:45,365 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:00:45,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-30 08:00:48,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:00:49,064 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:00:50,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-30 08:00:52,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-30 08:00:55,183 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:00:58,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:00:58,380 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:01:00,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-30 08:01:03,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-30 08:01:03,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-30 08:01:05,101 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:01:06,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 08:01:09,148 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=647393.3333333334, ans=0.0 2023-09-30 08:01:13,946 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=647393.3333333334, ans=0.125 2023-09-30 08:01:16,673 INFO [train.py:1039] (2/4) Epoch 19, batch 1500, loss[loss=0.1535, simple_loss=0.2341, pruned_loss=0.03643, over 24369.00 frames. ], tot_loss[loss=0.1779, simple_loss=0.2532, pruned_loss=0.05128, over 4719218.42 frames. ], batch size: 61, lr: 5.46e-03, grad_scale: 8.0 2023-09-30 08:01:16,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-30 08:01:16,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:01:16,804 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:01:18,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:01:18,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:01:19,152 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.90 vs. limit=12.0 2023-09-30 08:01:19,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:01:20,345 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=647460.0, ans=0.125 2023-09-30 08:01:22,042 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-30 08:01:25,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 08:01:25,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-30 08:01:25,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:01:27,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:01:27,802 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=647460.0, ans=0.0 2023-09-30 08:01:28,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:01:31,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:01:34,837 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.480e+02 1.890e+02 2.087e+02 2.413e+02 4.629e+02, threshold=4.174e+02, percent-clipped=1.0 2023-09-30 08:01:36,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:01:36,687 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-30 08:01:38,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-30 08:01:38,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:01:40,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:01:41,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-30 08:01:47,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-30 08:01:50,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:01:50,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-30 08:01:51,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-30 08:01:51,971 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.max_positive, batch_count=647593.3333333334, ans=0.95 2023-09-30 08:01:53,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:01:54,896 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:01:54,921 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:01:56,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-30 08:01:57,877 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:01:57,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:01:59,957 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-30 08:02:00,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:02:07,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:02:07,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-30 08:02:12,261 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 08:02:12,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 08:02:12,728 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=647660.0, ans=0.0 2023-09-30 08:02:17,075 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.37 vs. limit=15.0 2023-09-30 08:02:17,752 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-30 08:02:19,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:02:19,159 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-30 08:02:20,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:02:22,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:02:23,535 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-30 08:02:25,131 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-30 08:02:26,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-30 08:02:28,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:02:30,456 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=647726.6666666666, ans=0.0 2023-09-30 08:02:31,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:02:31,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:02:33,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:02:33,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:02:35,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:02:37,297 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-30 08:02:38,602 INFO [train.py:1039] (2/4) Epoch 19, batch 1550, loss[loss=0.1712, simple_loss=0.262, pruned_loss=0.04023, over 24370.00 frames. ], tot_loss[loss=0.1775, simple_loss=0.2533, pruned_loss=0.0509, over 4730125.90 frames. ], batch size: 74, lr: 5.45e-03, grad_scale: 8.0 2023-09-30 08:02:38,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-30 08:02:38,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:02:40,259 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-30 08:02:40,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-30 08:02:41,972 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=647793.3333333334, ans=0.1 2023-09-30 08:02:43,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:02:44,869 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:02:46,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:02:46,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:02:47,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:02:47,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:02:51,437 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-30 08:02:51,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:02:51,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 08:02:53,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 08:02:55,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-30 08:02:55,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-30 08:02:56,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:02:58,223 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-30 08:02:58,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-30 08:02:58,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-30 08:02:59,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:03:01,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:03:04,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:03:08,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-30 08:03:08,335 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-30 08:03:12,062 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=647926.6666666666, ans=0.125 2023-09-30 08:03:16,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:03:22,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:03:22,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-30 08:03:22,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:03:22,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-30 08:03:30,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 08:03:30,333 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=647993.3333333334, ans=0.125 2023-09-30 08:03:33,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:03:34,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:03:36,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:03:37,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:03:37,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-30 08:03:37,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:03:40,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:03:40,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:03:42,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-30 08:03:42,493 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-30 08:03:46,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:03:51,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-30 08:03:57,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:03:59,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:03:59,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-30 08:04:01,505 INFO [train.py:1039] (2/4) Epoch 19, batch 1600, loss[loss=0.1583, simple_loss=0.2398, pruned_loss=0.03838, over 21600.00 frames. ], tot_loss[loss=0.1776, simple_loss=0.2538, pruned_loss=0.05065, over 4731894.09 frames. ], batch size: 47, lr: 5.45e-03, grad_scale: 16.0 2023-09-30 08:04:03,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:04:04,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:04:04,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:04:04,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:04:05,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:04:08,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:04:09,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-30 08:04:11,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-30 08:04:13,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-30 08:04:16,165 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:04:17,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-30 08:04:19,709 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.864e+02 2.063e+02 2.300e+02 3.333e+02, threshold=4.126e+02, percent-clipped=0.0 2023-09-30 08:04:19,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:04:20,200 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=648193.3333333334, ans=0.0 2023-09-30 08:04:21,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:04:26,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:04:28,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=648193.3333333334, ans=0.2 2023-09-30 08:04:30,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-30 08:04:31,130 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=648193.3333333334, ans=0.0 2023-09-30 08:04:33,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:04:34,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-30 08:04:34,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:04:36,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-30 08:04:40,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-30 08:04:47,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:04:49,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-30 08:04:49,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:04:51,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:04:51,281 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:04:52,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-30 08:04:55,451 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.67 vs. limit=6.0 2023-09-30 08:04:56,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 08:04:56,387 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:04:57,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:04:59,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:05:00,007 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:05:01,614 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:05:04,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:05:04,637 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:05:11,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:05:13,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:05:16,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-30 08:05:16,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:05:16,708 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-30 08:05:23,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:05:23,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:05:25,220 INFO [train.py:1039] (2/4) Epoch 19, batch 1650, loss[loss=0.1935, simple_loss=0.268, pruned_loss=0.0595, over 23669.00 frames. ], tot_loss[loss=0.1784, simple_loss=0.2546, pruned_loss=0.05106, over 4740313.21 frames. ], batch size: 85, lr: 5.45e-03, grad_scale: 16.0 2023-09-30 08:05:25,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:05:25,359 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-30 08:05:25,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-30 08:05:25,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-30 08:05:26,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-30 08:05:30,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:05:31,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:05:31,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:05:31,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-30 08:05:33,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:05:36,854 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-30 08:05:39,943 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:05:39,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:05:39,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:05:39,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:05:42,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-30 08:05:42,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-30 08:05:46,098 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:05:48,739 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 08:05:50,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-30 08:05:58,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-30 08:06:01,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:06:03,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-30 08:06:06,580 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=648593.3333333334, ans=0.125 2023-09-30 08:06:07,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:06:09,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:06:11,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:06:11,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:06:12,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:06:12,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:06:15,439 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=648660.0, ans=0.2 2023-09-30 08:06:16,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:06:16,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:06:16,868 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=648660.0, ans=0.2 2023-09-30 08:06:18,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:06:18,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:06:19,151 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=648660.0, ans=0.0 2023-09-30 08:06:20,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:06:20,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:06:21,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:06:23,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-30 08:06:25,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:06:25,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-30 08:06:25,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-30 08:06:26,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-30 08:06:26,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:06:28,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:06:29,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:06:29,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:06:29,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-30 08:06:34,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:06:37,842 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:06:37,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:06:38,598 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.36 vs. limit=15.0 2023-09-30 08:06:40,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-30 08:06:46,271 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=648793.3333333334, ans=0.125 2023-09-30 08:06:47,420 INFO [train.py:1039] (2/4) Epoch 19, batch 1700, loss[loss=0.1873, simple_loss=0.2648, pruned_loss=0.05491, over 24095.00 frames. ], tot_loss[loss=0.1781, simple_loss=0.2534, pruned_loss=0.0514, over 4734871.61 frames. ], batch size: 86, lr: 5.45e-03, grad_scale: 16.0 2023-09-30 08:06:47,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:06:47,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:06:47,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-30 08:06:48,479 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.01 vs. limit=22.5 2023-09-30 08:06:49,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:06:49,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:06:49,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:06:52,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:06:52,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:06:52,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-30 08:06:56,509 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 08:07:04,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:07:05,479 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.448e+02 1.834e+02 2.122e+02 2.379e+02 4.054e+02, threshold=4.245e+02, percent-clipped=0.0 2023-09-30 08:07:05,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:07:11,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-30 08:07:12,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:07:12,771 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:07:13,292 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.58 vs. limit=22.5 2023-09-30 08:07:14,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:07:14,911 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.22 vs. limit=6.0 2023-09-30 08:07:17,540 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-30 08:07:19,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:07:19,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:07:22,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-30 08:07:23,111 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=648926.6666666666, ans=0.1 2023-09-30 08:07:24,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-30 08:07:24,811 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=648926.6666666666, ans=0.1 2023-09-30 08:07:26,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-30 08:07:26,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-30 08:07:28,266 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:07:29,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-30 08:07:29,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:07:40,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:07:40,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:07:41,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:07:44,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-30 08:07:44,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-30 08:07:44,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:07:47,186 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:07:47,187 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-30 08:07:48,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:07:48,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:07:48,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:07:48,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:07:51,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:07:51,759 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:07:54,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:07:54,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:07:54,161 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:07:59,998 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:08:00,160 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-30 08:08:02,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:08:03,878 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:08:06,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-30 08:08:09,789 INFO [train.py:1039] (2/4) Epoch 19, batch 1750, loss[loss=0.1761, simple_loss=0.2476, pruned_loss=0.05231, over 23356.00 frames. ], tot_loss[loss=0.1778, simple_loss=0.2525, pruned_loss=0.05156, over 4716076.99 frames. ], batch size: 105, lr: 5.45e-03, grad_scale: 16.0 2023-09-30 08:08:11,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:08:14,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:08:14,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-30 08:08:14,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-30 08:08:16,619 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:08:19,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:08:19,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:08:24,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-30 08:08:26,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:08:26,522 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=649193.3333333334, ans=0.07 2023-09-30 08:08:27,936 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=649193.3333333334, ans=0.04949747468305833 2023-09-30 08:08:29,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-30 08:08:29,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:08:31,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:08:34,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 08:08:36,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-30 08:08:39,569 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:08:39,628 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-30 08:08:50,528 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:08:52,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:08:52,250 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:08:55,351 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:08:55,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:08:58,343 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:09:00,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:09:02,383 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:09:02,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:09:04,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-30 08:09:06,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:09:08,066 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=649326.6666666666, ans=0.1 2023-09-30 08:09:09,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-30 08:09:11,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:09:11,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:09:12,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:09:16,220 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=649393.3333333334, ans=0.125 2023-09-30 08:09:17,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 08:09:18,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-30 08:09:18,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:09:22,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:09:25,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:09:27,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:09:29,045 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:09:30,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-30 08:09:30,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:09:31,821 INFO [train.py:1039] (2/4) Epoch 19, batch 1800, loss[loss=0.1528, simple_loss=0.2301, pruned_loss=0.03773, over 24585.00 frames. ], tot_loss[loss=0.1771, simple_loss=0.252, pruned_loss=0.05112, over 4714126.45 frames. ], batch size: 60, lr: 5.45e-03, grad_scale: 16.0 2023-09-30 08:09:31,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-30 08:09:31,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:09:32,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:09:32,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:09:32,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-30 08:09:32,468 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=649460.0, ans=0.0 2023-09-30 08:09:37,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:09:38,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:09:40,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 08:09:43,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:09:45,975 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=649460.0, ans=0.125 2023-09-30 08:09:47,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 08:09:47,273 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:09:47,639 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=649526.6666666666, ans=0.1 2023-09-30 08:09:50,069 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.459e+02 1.833e+02 2.073e+02 2.384e+02 3.418e+02, threshold=4.146e+02, percent-clipped=0.0 2023-09-30 08:09:50,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:09:53,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:09:53,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:09:54,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:09:58,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:09:58,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-30 08:09:58,744 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:10:03,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:10:06,590 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-30 08:10:09,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-30 08:10:10,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-30 08:10:10,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:10:10,446 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=649593.3333333334, ans=0.125 2023-09-30 08:10:11,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:10:11,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:10:11,652 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:10:11,962 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=649593.3333333334, ans=0.125 2023-09-30 08:10:20,246 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-30 08:10:21,806 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:10:23,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:10:24,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-30 08:10:24,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-30 08:10:26,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:10:26,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:10:28,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 08:10:33,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-30 08:10:35,247 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:10:39,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:10:41,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-30 08:10:41,295 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:10:42,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:10:42,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:10:44,917 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-30 08:10:47,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:10:47,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:10:51,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-30 08:10:51,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:10:53,176 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:10:54,645 INFO [train.py:1039] (2/4) Epoch 19, batch 1850, loss[loss=0.16, simple_loss=0.2353, pruned_loss=0.04238, over 21792.00 frames. ], tot_loss[loss=0.1779, simple_loss=0.2531, pruned_loss=0.05129, over 4719925.86 frames. ], batch size: 48, lr: 5.45e-03, grad_scale: 16.0 2023-09-30 08:10:55,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-30 08:10:55,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:10:56,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:10:56,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:10:58,593 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=649793.3333333334, ans=0.0 2023-09-30 08:10:59,845 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:10:59,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:11:01,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:11:01,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:11:03,401 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=649793.3333333334, ans=0.2 2023-09-30 08:11:11,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:11:11,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-30 08:11:15,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-30 08:11:17,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-30 08:11:22,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:11:22,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-30 08:11:22,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 08:11:32,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:11:34,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-30 08:11:38,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:11:38,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:11:44,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-30 08:11:45,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:11:46,015 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 08:11:47,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:11:49,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:11:50,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:11:54,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:11:54,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:11:56,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 08:11:56,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:11:57,555 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:11:59,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:12:02,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-30 08:12:02,756 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:12:07,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:12:07,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 08:12:07,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-30 08:12:07,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-30 08:12:09,562 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-30 08:12:11,054 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-30 08:12:12,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 08:12:12,637 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:12:12,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:12:12,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:12:12,802 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-30 08:12:12,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:12:12,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:12:14,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-30 08:12:16,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 08:12:16,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:12:16,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-30 08:12:18,009 INFO [train.py:1039] (2/4) Epoch 19, batch 1900, loss[loss=0.2459, simple_loss=0.3047, pruned_loss=0.09356, over 19434.00 frames. ], tot_loss[loss=0.1786, simple_loss=0.2538, pruned_loss=0.05163, over 4716737.90 frames. ], batch size: 388, lr: 5.44e-03, grad_scale: 16.0 2023-09-30 08:12:19,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:12:19,648 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-30 08:12:19,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:12:22,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:12:29,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:12:32,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:12:34,300 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-30 08:12:35,700 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.498e+02 1.831e+02 2.039e+02 2.265e+02 4.223e+02, threshold=4.078e+02, percent-clipped=1.0 2023-09-30 08:12:35,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-30 08:12:36,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:12:36,387 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=650193.3333333334, ans=0.0 2023-09-30 08:12:37,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:12:37,554 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-30 08:12:37,606 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-30 08:12:40,030 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=650193.3333333334, ans=0.125 2023-09-30 08:12:42,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-30 08:12:43,658 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.32 vs. limit=15.0 2023-09-30 08:12:44,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:12:45,040 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.90 vs. limit=15.0 2023-09-30 08:12:47,999 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.66 vs. limit=6.0 2023-09-30 08:12:48,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-30 08:12:52,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-30 08:13:00,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-30 08:13:05,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-30 08:13:05,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:13:05,662 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-30 08:13:05,679 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-30 08:13:05,723 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-30 08:13:07,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-30 08:13:07,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:13:10,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-30 08:13:14,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:13:16,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:13:16,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-30 08:13:20,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:13:23,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-30 08:13:23,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-30 08:13:30,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:13:30,560 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:13:30,581 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:13:32,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:13:33,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 08:13:33,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-30 08:13:34,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:13:38,068 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:13:38,071 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:13:40,086 INFO [train.py:1039] (2/4) Epoch 19, batch 1950, loss[loss=0.1929, simple_loss=0.2626, pruned_loss=0.06161, over 23601.00 frames. ], tot_loss[loss=0.1787, simple_loss=0.2543, pruned_loss=0.05158, over 4723988.99 frames. ], batch size: 256, lr: 5.44e-03, grad_scale: 8.0 2023-09-30 08:13:41,714 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:13:41,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:13:41,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-30 08:13:43,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:13:48,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:13:49,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:13:49,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:13:50,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 08:13:53,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-30 08:13:55,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 08:13:55,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:13:55,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:13:58,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:13:58,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:13:58,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:14:02,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:14:02,493 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=650526.6666666666, ans=0.04949747468305833 2023-09-30 08:14:03,616 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:14:03,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 08:14:03,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:14:03,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:14:06,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:14:11,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:14:11,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:14:11,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-30 08:14:11,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-30 08:14:12,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 08:14:13,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:14:13,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:14:18,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:14:20,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:14:25,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 08:14:28,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:14:28,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-30 08:14:30,388 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-30 08:14:30,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:14:33,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:14:35,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:14:35,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-30 08:14:39,400 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.79 vs. limit=12.0 2023-09-30 08:14:41,944 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:14:43,475 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:14:45,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:14:49,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:14:49,266 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=650726.6666666666, ans=0.125 2023-09-30 08:14:52,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:14:53,925 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:14:54,033 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-30 08:14:54,041 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 08:14:55,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:14:57,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-30 08:14:57,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:15:03,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-30 08:15:04,373 INFO [train.py:1039] (2/4) Epoch 19, batch 2000, loss[loss=0.1555, simple_loss=0.235, pruned_loss=0.03797, over 24296.00 frames. ], tot_loss[loss=0.1796, simple_loss=0.2552, pruned_loss=0.05202, over 4728426.75 frames. ], batch size: 56, lr: 5.44e-03, grad_scale: 16.0 2023-09-30 08:15:04,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:15:04,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:15:04,697 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=650793.3333333334, ans=0.1 2023-09-30 08:15:05,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:15:07,574 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:15:09,390 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=650793.3333333334, ans=0.125 2023-09-30 08:15:10,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-30 08:15:12,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-30 08:15:16,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:15:18,300 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-30 08:15:18,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 08:15:18,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:15:18,815 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=650860.0, ans=0.05 2023-09-30 08:15:22,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:15:23,986 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 2.096e+02 2.439e+02 2.971e+02 4.515e+02, threshold=4.878e+02, percent-clipped=2.0 2023-09-30 08:15:24,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-30 08:15:25,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:15:27,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:15:28,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:15:30,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-30 08:15:30,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 08:15:32,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-30 08:15:32,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:15:36,927 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:15:39,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-30 08:15:39,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:15:39,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:15:42,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:15:42,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-30 08:15:44,170 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=650926.6666666666, ans=0.125 2023-09-30 08:15:45,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-30 08:15:45,327 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:15:45,340 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:15:51,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:15:51,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:15:51,472 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 08:15:52,040 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.73 vs. limit=15.0 2023-09-30 08:15:52,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:15:56,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:15:58,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:15:58,082 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 08:15:58,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:16:00,121 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:16:04,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:16:04,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-30 08:16:06,389 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=650993.3333333334, ans=0.125 2023-09-30 08:16:09,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 08:16:09,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:16:13,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:16:13,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:16:18,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:16:21,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:16:21,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:16:21,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 08:16:21,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:16:23,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:16:25,870 INFO [train.py:1039] (2/4) Epoch 19, batch 2050, loss[loss=0.1829, simple_loss=0.2611, pruned_loss=0.05231, over 23280.00 frames. ], tot_loss[loss=0.1789, simple_loss=0.2542, pruned_loss=0.05182, over 4720813.13 frames. ], batch size: 105, lr: 5.44e-03, grad_scale: 16.0 2023-09-30 08:16:25,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:16:29,374 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=651126.6666666666, ans=0.1 2023-09-30 08:16:30,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:16:30,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:16:34,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:16:38,092 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:16:40,077 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:16:40,164 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:16:43,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-30 08:16:43,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:16:44,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:16:44,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-30 08:16:53,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:16:53,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:16:56,624 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-30 08:16:59,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:17:01,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-30 08:17:01,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:17:06,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:17:06,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:17:08,275 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-30 08:17:08,348 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:17:08,671 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=651260.0, ans=0.125 2023-09-30 08:17:10,473 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:17:12,063 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:17:12,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:17:15,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:17:16,324 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.40 vs. limit=22.5 2023-09-30 08:17:17,200 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:17:20,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:17:22,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:17:26,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 08:17:31,651 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:17:33,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-30 08:17:40,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:17:40,253 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:17:43,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:17:44,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-30 08:17:45,132 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=651393.3333333334, ans=0.0 2023-09-30 08:17:45,653 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.41 vs. limit=15.0 2023-09-30 08:17:50,111 INFO [train.py:1039] (2/4) Epoch 19, batch 2100, loss[loss=0.1851, simple_loss=0.2484, pruned_loss=0.06089, over 23637.00 frames. ], tot_loss[loss=0.1779, simple_loss=0.2531, pruned_loss=0.05135, over 4711240.60 frames. ], batch size: 256, lr: 5.44e-03, grad_scale: 16.0 2023-09-30 08:17:50,306 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-30 08:17:50,307 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:17:50,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:17:50,541 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=651460.0, ans=0.125 2023-09-30 08:17:51,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 08:17:53,324 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:17:53,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-30 08:17:53,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-30 08:17:54,950 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 08:17:57,269 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=651460.0, ans=0.125 2023-09-30 08:17:58,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:17:58,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:18:00,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:18:01,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:18:01,715 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-30 08:18:01,913 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=651460.0, ans=0.125 2023-09-30 08:18:01,949 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=651460.0, ans=0.0 2023-09-30 08:18:03,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:18:04,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-30 08:18:04,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-30 08:18:06,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:18:07,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:18:07,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-30 08:18:07,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 08:18:09,218 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 1.858e+02 2.144e+02 2.529e+02 4.189e+02, threshold=4.288e+02, percent-clipped=0.0 2023-09-30 08:18:09,697 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=651526.6666666666, ans=0.0 2023-09-30 08:18:13,142 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-30 08:18:13,143 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 08:18:16,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:18:16,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:18:16,616 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=651526.6666666666, ans=0.125 2023-09-30 08:18:20,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:18:20,378 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer_na.min_abs, batch_count=651526.6666666666, ans=0.02 2023-09-30 08:18:21,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-30 08:18:21,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:18:21,805 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 08:18:24,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-30 08:18:26,230 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:18:26,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-30 08:18:26,308 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-30 08:18:28,414 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-30 08:18:30,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-30 08:18:32,975 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:18:36,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 08:18:36,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 08:18:37,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:18:39,286 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:18:39,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-30 08:18:39,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:18:39,632 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:18:40,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:18:40,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:18:40,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-30 08:18:42,483 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-30 08:18:43,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-30 08:18:47,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:18:52,245 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:18:52,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-30 08:18:58,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:19:01,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:19:03,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:19:03,258 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:19:03,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-30 08:19:03,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 08:19:04,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:19:04,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-30 08:19:05,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:19:05,109 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:19:05,993 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.34 vs. limit=15.0 2023-09-30 08:19:08,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-30 08:19:09,731 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-30 08:19:09,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:19:11,596 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=651793.3333333334, ans=0.0 2023-09-30 08:19:12,641 INFO [train.py:1039] (2/4) Epoch 19, batch 2150, loss[loss=0.1875, simple_loss=0.2529, pruned_loss=0.06104, over 22786.00 frames. ], tot_loss[loss=0.1769, simple_loss=0.2519, pruned_loss=0.05101, over 4709456.57 frames. ], batch size: 322, lr: 5.44e-03, grad_scale: 16.0 2023-09-30 08:19:12,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:19:12,727 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:19:12,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:19:12,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:19:16,190 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=651793.3333333334, ans=0.1 2023-09-30 08:19:19,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 08:19:22,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:19:22,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:19:24,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:19:24,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:19:25,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:19:27,823 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:19:27,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:19:27,916 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:19:32,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:19:32,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-30 08:19:39,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:19:41,004 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:19:41,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:19:42,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:19:42,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:19:42,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-30 08:19:43,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:19:44,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:19:44,124 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:19:45,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-30 08:19:47,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:19:48,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:19:48,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:19:49,182 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=651926.6666666666, ans=0.125 2023-09-30 08:19:50,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 08:19:52,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:19:55,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:19:55,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:19:57,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:19:57,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-30 08:19:57,164 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-30 08:20:01,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:20:01,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:20:03,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:20:04,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 08:20:04,542 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=651993.3333333334, ans=0.125 2023-09-30 08:20:05,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:20:07,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:20:07,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-30 08:20:08,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-30 08:20:09,425 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.43 vs. limit=15.0 2023-09-30 08:20:10,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:20:10,775 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-30 08:20:10,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:20:11,213 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=651993.3333333334, ans=0.125 2023-09-30 08:20:12,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:20:12,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-30 08:20:12,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:20:12,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-30 08:20:13,808 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-30 08:20:13,808 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-30 08:20:13,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-30 08:20:16,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:20:16,900 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:20:16,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:20:18,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:20:20,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 08:20:21,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:20:21,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:20:29,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:20:31,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-30 08:20:34,259 INFO [train.py:1039] (2/4) Epoch 19, batch 2200, loss[loss=0.1763, simple_loss=0.2486, pruned_loss=0.05194, over 23575.00 frames. ], tot_loss[loss=0.1768, simple_loss=0.2522, pruned_loss=0.05075, over 4714243.34 frames. ], batch size: 149, lr: 5.44e-03, grad_scale: 16.0 2023-09-30 08:20:34,605 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=652126.6666666666, ans=0.0 2023-09-30 08:20:35,879 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:20:39,569 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=652126.6666666666, ans=0.125 2023-09-30 08:20:42,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:20:42,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:20:44,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:20:44,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-30 08:20:46,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:20:48,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:20:48,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-30 08:20:53,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-30 08:20:53,416 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=652193.3333333334, ans=0.125 2023-09-30 08:20:54,310 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.914e+02 2.227e+02 2.792e+02 4.256e+02, threshold=4.455e+02, percent-clipped=0.0 2023-09-30 08:20:55,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 08:21:01,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-30 08:21:02,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:21:03,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:21:04,536 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:21:09,140 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:21:09,202 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-30 08:21:12,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-30 08:21:13,206 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=652260.0, ans=0.0 2023-09-30 08:21:14,429 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:21:14,528 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-30 08:21:19,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:21:19,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:21:20,090 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=652260.0, ans=0.125 2023-09-30 08:21:22,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:21:22,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:21:24,882 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=652326.6666666666, ans=0.125 2023-09-30 08:21:26,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-30 08:21:27,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:21:28,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-30 08:21:32,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:21:32,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-30 08:21:32,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:21:34,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:21:34,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:21:35,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:21:35,779 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:21:36,110 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=652326.6666666666, ans=0.125 2023-09-30 08:21:37,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-30 08:21:37,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:21:40,369 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 08:21:40,854 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=652393.3333333334, ans=0.2 2023-09-30 08:21:42,093 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 08:21:42,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:21:45,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-30 08:21:47,411 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-30 08:21:48,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:21:49,058 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-30 08:21:51,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-30 08:21:52,935 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-30 08:21:54,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:21:54,590 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-30 08:21:56,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:21:57,584 INFO [train.py:1039] (2/4) Epoch 19, batch 2250, loss[loss=0.1878, simple_loss=0.2572, pruned_loss=0.05919, over 23763.00 frames. ], tot_loss[loss=0.1778, simple_loss=0.2534, pruned_loss=0.05114, over 4711635.47 frames. ], batch size: 212, lr: 5.43e-03, grad_scale: 16.0 2023-09-30 08:21:57,942 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=652460.0, ans=0.0 2023-09-30 08:21:59,391 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-30 08:22:01,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:22:02,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:22:02,933 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=652460.0, ans=0.0 2023-09-30 08:22:07,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:22:09,531 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:22:11,461 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=652460.0, ans=0.125 2023-09-30 08:22:12,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:22:12,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 08:22:14,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:22:15,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-30 08:22:16,018 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:22:16,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:22:17,138 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=652526.6666666666, ans=0.1 2023-09-30 08:22:19,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-30 08:22:20,999 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:22:21,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:22:22,617 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 08:22:28,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:22:30,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 08:22:30,207 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=652593.3333333334, ans=0.0 2023-09-30 08:22:31,383 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-30 08:22:31,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-30 08:22:33,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:22:34,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:22:39,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:22:41,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:22:43,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:22:43,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:22:46,452 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=652660.0, ans=0.1 2023-09-30 08:22:47,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:22:50,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:22:55,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:22:58,964 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-30 08:23:05,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 08:23:06,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-30 08:23:06,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:23:11,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 08:23:13,403 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten.whitening_limit, batch_count=652726.6666666666, ans=15.0 2023-09-30 08:23:16,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-30 08:23:16,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-30 08:23:16,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:23:16,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:23:19,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-30 08:23:21,029 INFO [train.py:1039] (2/4) Epoch 19, batch 2300, loss[loss=0.1768, simple_loss=0.2574, pruned_loss=0.04809, over 24468.00 frames. ], tot_loss[loss=0.179, simple_loss=0.2547, pruned_loss=0.05162, over 4711104.91 frames. ], batch size: 63, lr: 5.43e-03, grad_scale: 16.0 2023-09-30 08:23:22,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:23:22,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:23:28,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:23:29,008 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:23:32,492 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-30 08:23:34,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:23:40,217 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.428e+02 1.880e+02 2.084e+02 2.491e+02 4.260e+02, threshold=4.169e+02, percent-clipped=0.0 2023-09-30 08:23:42,412 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:23:42,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-30 08:23:42,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:23:42,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:23:42,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-30 08:23:44,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:23:44,961 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=652860.0, ans=0.125 2023-09-30 08:23:47,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:23:47,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:23:51,240 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:23:55,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:23:58,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:24:03,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:24:03,628 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:24:07,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:24:07,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:24:09,026 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=652993.3333333334, ans=0.2 2023-09-30 08:24:10,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:24:11,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 08:24:11,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:24:11,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-30 08:24:18,333 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 08:24:18,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:24:18,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:24:18,444 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:24:18,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:24:20,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 08:24:20,097 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-30 08:24:20,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-30 08:24:21,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:24:21,573 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:24:21,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-30 08:24:24,585 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.46 vs. limit=15.0 2023-09-30 08:24:29,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:24:32,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:24:36,798 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:24:36,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:24:36,981 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=653060.0, ans=0.0 2023-09-30 08:24:38,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-30 08:24:38,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 08:24:38,622 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=653060.0, ans=0.1 2023-09-30 08:24:39,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:24:39,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 08:24:41,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-30 08:24:42,271 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.79 vs. limit=6.0 2023-09-30 08:24:42,930 INFO [train.py:1039] (2/4) Epoch 19, batch 2350, loss[loss=0.2458, simple_loss=0.3025, pruned_loss=0.09455, over 19824.00 frames. ], tot_loss[loss=0.1801, simple_loss=0.2555, pruned_loss=0.05235, over 4705039.78 frames. ], batch size: 388, lr: 5.43e-03, grad_scale: 16.0 2023-09-30 08:24:46,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:24:46,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-30 08:24:55,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-30 08:24:56,567 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.61 vs. limit=22.5 2023-09-30 08:24:57,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:25:00,459 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:25:00,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:25:01,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:25:01,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:25:03,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-30 08:25:07,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:25:12,003 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=653193.3333333334, ans=0.2 2023-09-30 08:25:13,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-30 08:25:14,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:25:16,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:25:16,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:25:19,684 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:25:21,253 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-30 08:25:22,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:25:22,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:25:23,002 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:25:23,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:25:28,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:25:30,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-30 08:25:32,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:25:35,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:25:35,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:25:38,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-30 08:25:38,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:25:42,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-30 08:25:43,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:25:48,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-30 08:25:54,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-30 08:25:55,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:25:55,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-30 08:25:55,570 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-30 08:25:55,599 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-30 08:25:57,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-30 08:25:59,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:26:04,802 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:26:06,065 INFO [train.py:1039] (2/4) Epoch 19, batch 2400, loss[loss=0.1828, simple_loss=0.2466, pruned_loss=0.05953, over 23751.00 frames. ], tot_loss[loss=0.1801, simple_loss=0.2553, pruned_loss=0.0524, over 4706042.56 frames. ], batch size: 212, lr: 5.43e-03, grad_scale: 32.0 2023-09-30 08:26:10,036 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:26:13,027 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:26:13,104 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-30 08:26:13,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-30 08:26:19,852 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 08:26:19,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:26:22,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-30 08:26:22,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:26:22,987 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:26:24,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-30 08:26:25,937 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.493e+02 1.937e+02 2.110e+02 2.328e+02 3.835e+02, threshold=4.219e+02, percent-clipped=0.0 2023-09-30 08:26:26,481 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=653526.6666666666, ans=0.125 2023-09-30 08:26:29,133 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:26:32,594 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-30 08:26:37,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-30 08:26:43,039 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-30 08:26:44,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:26:47,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:26:53,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:26:54,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-30 08:26:54,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:27:02,339 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:27:04,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:27:07,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:27:08,870 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:27:08,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-30 08:27:08,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:27:08,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:27:11,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:27:11,063 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 08:27:17,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:27:18,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 08:27:18,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-30 08:27:20,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-30 08:27:22,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:27:23,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:27:23,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-30 08:27:23,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-30 08:27:23,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-30 08:27:23,820 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-30 08:27:25,452 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-30 08:27:26,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:27:29,767 INFO [train.py:1039] (2/4) Epoch 19, batch 2450, loss[loss=0.1721, simple_loss=0.2584, pruned_loss=0.04296, over 24433.00 frames. ], tot_loss[loss=0.1794, simple_loss=0.2542, pruned_loss=0.05229, over 4700449.01 frames. ], batch size: 69, lr: 5.43e-03, grad_scale: 32.0 2023-09-30 08:27:29,835 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:27:29,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:27:31,394 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-30 08:27:31,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:27:32,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-30 08:27:35,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:27:35,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:27:36,268 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=653793.3333333334, ans=0.125 2023-09-30 08:27:39,167 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:27:39,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:27:40,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-30 08:27:42,439 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=653793.3333333334, ans=0.2 2023-09-30 08:27:45,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:27:45,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:27:51,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:27:51,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:27:51,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:27:52,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-30 08:27:55,250 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.11 vs. limit=15.0 2023-09-30 08:27:57,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:27:59,273 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 08:27:59,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:28:03,352 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=653926.6666666666, ans=0.125 2023-09-30 08:28:04,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-30 08:28:04,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:28:06,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:28:07,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:28:09,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-30 08:28:10,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:28:18,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:28:20,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:28:20,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:28:20,528 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:28:20,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:28:24,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:28:24,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-30 08:28:28,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:28:28,209 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:28:31,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:28:31,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:28:31,749 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=653993.3333333334, ans=0.0 2023-09-30 08:28:37,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:28:37,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-30 08:28:39,579 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:28:39,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:28:39,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-30 08:28:41,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:28:41,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:28:47,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:28:48,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=654060.0, ans=0.125 2023-09-30 08:28:50,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:28:50,202 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:28:50,472 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=654126.6666666666, ans=0.125 2023-09-30 08:28:51,555 INFO [train.py:1039] (2/4) Epoch 19, batch 2500, loss[loss=0.1896, simple_loss=0.2715, pruned_loss=0.05384, over 23935.00 frames. ], tot_loss[loss=0.1789, simple_loss=0.2538, pruned_loss=0.05206, over 4705922.93 frames. ], batch size: 86, lr: 5.43e-03, grad_scale: 32.0 2023-09-30 08:28:53,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-30 08:28:55,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:29:01,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:29:01,920 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=654126.6666666666, ans=0.1 2023-09-30 08:29:11,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 08:29:11,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:29:12,731 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.785e+02 1.970e+02 2.171e+02 3.825e+02, threshold=3.939e+02, percent-clipped=0.0 2023-09-30 08:29:12,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:29:12,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-30 08:29:20,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 08:29:22,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:29:22,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-30 08:29:22,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 08:29:22,303 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-30 08:29:25,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:29:25,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:29:25,304 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-30 08:29:25,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:29:26,116 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.20 vs. limit=6.0 2023-09-30 08:29:26,845 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-30 08:29:26,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:29:32,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:29:32,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:29:36,264 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 08:29:36,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-30 08:29:36,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:29:38,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:29:41,284 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:29:47,786 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:29:52,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:29:56,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-30 08:29:59,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-30 08:29:59,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:30:00,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-30 08:30:01,831 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=654393.3333333334, ans=0.125 2023-09-30 08:30:03,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:30:03,750 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 08:30:03,927 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-30 08:30:03,928 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-30 08:30:03,936 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-30 08:30:09,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:30:11,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-30 08:30:11,010 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-30 08:30:11,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:30:12,563 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-30 08:30:15,571 INFO [train.py:1039] (2/4) Epoch 19, batch 2550, loss[loss=0.1611, simple_loss=0.245, pruned_loss=0.03865, over 24288.00 frames. ], tot_loss[loss=0.1781, simple_loss=0.2534, pruned_loss=0.05145, over 4701287.11 frames. ], batch size: 61, lr: 5.43e-03, grad_scale: 32.0 2023-09-30 08:30:15,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-30 08:30:17,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:30:20,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:30:20,307 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:30:23,997 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:30:25,406 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-30 08:30:25,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-30 08:30:30,101 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-30 08:30:31,643 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:30:33,229 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:30:34,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:30:36,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 08:30:37,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 08:30:37,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:30:37,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:30:39,324 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:30:41,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-30 08:30:41,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-30 08:30:41,383 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:30:41,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-30 08:30:51,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:30:58,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:30:58,940 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:30:58,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:30:59,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 08:31:05,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:31:08,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 08:31:08,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:31:10,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:31:10,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-30 08:31:10,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-30 08:31:15,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:31:15,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:31:18,979 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=654660.0, ans=0.0 2023-09-30 08:31:23,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:31:23,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-30 08:31:23,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:31:23,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:31:25,255 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:31:25,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 08:31:26,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:31:33,514 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.77 vs. limit=6.0 2023-09-30 08:31:34,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:31:35,855 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:31:37,203 INFO [train.py:1039] (2/4) Epoch 19, batch 2600, loss[loss=0.173, simple_loss=0.2605, pruned_loss=0.04275, over 24425.00 frames. ], tot_loss[loss=0.1799, simple_loss=0.2547, pruned_loss=0.05255, over 4681867.86 frames. ], batch size: 69, lr: 5.42e-03, grad_scale: 32.0 2023-09-30 08:31:40,326 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-30 08:31:43,946 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-30 08:31:43,988 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:31:44,038 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-30 08:31:44,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-30 08:31:45,521 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-30 08:31:48,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:31:48,625 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-30 08:31:50,677 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-30 08:31:50,825 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-30 08:31:54,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:31:54,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-30 08:31:56,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-30 08:31:57,364 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 1.901e+02 2.112e+02 2.544e+02 3.828e+02, threshold=4.224e+02, percent-clipped=0.0 2023-09-30 08:31:58,261 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-30 08:31:59,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-30 08:32:02,608 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-30 08:32:03,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-30 08:32:05,906 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=654860.0, ans=0.0 2023-09-30 08:32:10,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:32:11,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:32:11,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:32:11,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-30 08:32:14,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:32:19,940 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-30 08:32:26,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:32:26,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:32:28,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-30 08:32:28,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:32:28,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:32:29,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-30 08:32:31,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-30 08:32:33,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:32:34,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:32:37,925 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-30 08:32:39,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:32:39,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:32:44,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:32:45,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:32:45,570 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-30 08:32:45,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:32:47,241 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:32:48,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:32:49,035 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=655060.0, ans=0.07 2023-09-30 08:32:54,163 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:32:55,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-30 08:32:56,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:32:58,345 INFO [train.py:1039] (2/4) Epoch 19, batch 2650, loss[loss=0.1629, simple_loss=0.2426, pruned_loss=0.04162, over 24341.00 frames. ], tot_loss[loss=0.1798, simple_loss=0.2546, pruned_loss=0.05249, over 4697459.05 frames. ], batch size: 61, lr: 5.42e-03, grad_scale: 32.0 2023-09-30 08:32:58,536 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 08:33:00,849 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=655126.6666666666, ans=0.125 2023-09-30 08:33:02,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-30 08:33:02,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:33:04,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 08:33:06,509 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-30 08:33:06,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:33:08,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:33:12,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 08:33:14,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:33:15,799 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:33:17,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-30 08:33:17,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 08:33:17,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:33:20,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-30 08:33:23,364 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-30 08:33:25,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:33:27,255 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-30 08:33:27,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:33:28,778 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-30 08:33:30,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:33:30,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-30 08:33:30,678 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:33:32,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:33:36,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-30 08:33:37,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-30 08:33:39,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-30 08:33:42,807 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-30 08:33:42,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:33:43,062 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=655260.0, ans=0.0 2023-09-30 08:33:44,318 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:33:44,371 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-30 08:33:45,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:33:47,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:33:49,099 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=655326.6666666666, ans=0.2 2023-09-30 08:33:50,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:33:51,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:33:53,322 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:33:53,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-30 08:33:54,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:33:57,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:33:58,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:34:00,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:34:01,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:34:02,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-30 08:34:06,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:34:06,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:34:06,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:34:08,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-30 08:34:12,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:34:13,685 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:34:15,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:34:15,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:34:17,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-30 08:34:17,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:34:20,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:34:20,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-30 08:34:21,738 INFO [train.py:1039] (2/4) Epoch 19, batch 2700, loss[loss=0.1927, simple_loss=0.2532, pruned_loss=0.06613, over 22491.00 frames. ], tot_loss[loss=0.1801, simple_loss=0.2553, pruned_loss=0.05249, over 4694646.91 frames. ], batch size: 322, lr: 5.42e-03, grad_scale: 32.0 2023-09-30 08:34:21,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:34:23,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 08:34:25,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:34:25,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:34:26,481 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:34:28,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:34:28,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:34:28,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:34:28,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-30 08:34:29,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-30 08:34:30,928 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:34:32,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-30 08:34:33,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 08:34:35,911 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:34:39,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:34:41,036 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.525e+02 1.869e+02 2.051e+02 2.484e+02 4.492e+02, threshold=4.101e+02, percent-clipped=1.0 2023-09-30 08:34:41,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-30 08:34:41,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:34:46,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:34:46,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:34:53,140 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:34:53,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:34:53,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:34:53,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:34:57,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:34:59,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:35:00,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-30 08:35:00,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:35:05,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:35:05,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:35:07,668 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.67 vs. limit=15.0 2023-09-30 08:35:14,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:35:16,080 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:35:21,237 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:35:21,240 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:35:26,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:35:26,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:35:27,375 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.89 vs. limit=10.0 2023-09-30 08:35:27,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:35:28,163 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:35:29,719 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:35:29,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:35:34,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:35:34,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:35:34,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:35:37,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-30 08:35:40,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:35:40,810 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:35:40,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-30 08:35:42,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-30 08:35:42,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:35:43,944 INFO [train.py:1039] (2/4) Epoch 19, batch 2750, loss[loss=0.188, simple_loss=0.2712, pruned_loss=0.05238, over 23715.00 frames. ], tot_loss[loss=0.1798, simple_loss=0.2546, pruned_loss=0.05252, over 4685236.76 frames. ], batch size: 85, lr: 5.42e-03, grad_scale: 16.0 2023-09-30 08:35:45,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:35:47,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:35:49,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:35:49,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-30 08:35:50,274 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=655793.3333333334, ans=0.0 2023-09-30 08:35:51,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:35:51,790 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=655793.3333333334, ans=0.1 2023-09-30 08:35:53,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:35:54,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 08:35:54,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:35:54,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:35:54,548 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-30 08:35:54,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:35:54,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:35:59,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-30 08:36:01,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:36:01,881 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=655860.0, ans=0.0 2023-09-30 08:36:02,987 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:36:03,085 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:36:04,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-30 08:36:04,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:36:06,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:36:07,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:36:08,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:36:12,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 08:36:12,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 08:36:12,454 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=655860.0, ans=0.1 2023-09-30 08:36:13,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:36:15,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:36:16,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 08:36:24,240 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=655926.6666666666, ans=0.1 2023-09-30 08:36:25,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:36:27,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 08:36:27,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:36:35,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:36:35,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-30 08:36:35,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 08:36:43,061 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:36:44,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:36:44,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-30 08:36:46,748 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.21 vs. limit=10.0 2023-09-30 08:36:47,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:36:49,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-30 08:36:52,669 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=656060.0, ans=0.125 2023-09-30 08:36:53,945 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-30 08:36:57,561 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:36:59,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-30 08:36:59,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:37:01,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:37:01,299 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-30 08:37:03,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:37:07,167 INFO [train.py:1039] (2/4) Epoch 19, batch 2800, loss[loss=0.1781, simple_loss=0.258, pruned_loss=0.04909, over 24386.00 frames. ], tot_loss[loss=0.1784, simple_loss=0.2531, pruned_loss=0.05185, over 4685175.11 frames. ], batch size: 77, lr: 5.42e-03, grad_scale: 32.0 2023-09-30 08:37:07,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-30 08:37:07,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:37:08,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:37:08,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-30 08:37:08,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:37:10,235 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:37:11,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:37:11,964 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-30 08:37:11,965 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-30 08:37:15,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:37:16,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 08:37:16,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:37:20,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:37:20,434 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=656126.6666666666, ans=0.1 2023-09-30 08:37:21,631 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-30 08:37:23,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-30 08:37:23,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-30 08:37:24,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:37:26,429 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:37:26,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:37:28,349 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.513e+02 1.835e+02 2.025e+02 2.355e+02 3.473e+02, threshold=4.050e+02, percent-clipped=0.0 2023-09-30 08:37:30,846 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.66 vs. limit=15.0 2023-09-30 08:37:32,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:37:32,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:37:32,232 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-30 08:37:32,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:37:41,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:37:42,554 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:37:45,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:37:47,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:37:47,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:37:52,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:37:52,068 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-30 08:37:52,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:37:52,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:37:52,293 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:37:58,402 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:37:58,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:38:02,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:38:04,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:38:04,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:38:04,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 08:38:06,468 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 08:38:07,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 08:38:08,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:38:08,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-30 08:38:08,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:38:10,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:38:10,185 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:38:11,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-30 08:38:11,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:38:11,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:38:13,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:38:14,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-30 08:38:16,945 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=656393.3333333334, ans=0.1 2023-09-30 08:38:19,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:38:19,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 08:38:21,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:38:24,025 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:38:27,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:38:27,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:38:28,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:38:30,238 INFO [train.py:1039] (2/4) Epoch 19, batch 2850, loss[loss=0.1824, simple_loss=0.2509, pruned_loss=0.05699, over 23668.00 frames. ], tot_loss[loss=0.178, simple_loss=0.2519, pruned_loss=0.05202, over 4676167.97 frames. ], batch size: 232, lr: 5.42e-03, grad_scale: 32.0 2023-09-30 08:38:31,841 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:38:31,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:38:36,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-30 08:38:36,214 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-30 08:38:44,229 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-30 08:38:44,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:38:45,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-30 08:38:45,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:38:48,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-30 08:38:50,408 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-30 08:38:51,903 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:38:56,869 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=656526.6666666666, ans=0.2 2023-09-30 08:39:04,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:39:05,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:39:05,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:39:07,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 08:39:07,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 08:39:07,135 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:39:09,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 08:39:09,593 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=656593.3333333334, ans=0.1 2023-09-30 08:39:10,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-30 08:39:14,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-30 08:39:14,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:39:15,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:39:16,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:39:19,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:39:19,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:39:20,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:39:22,547 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:39:22,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:39:24,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:39:24,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:39:24,594 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=656660.0, ans=0.0 2023-09-30 08:39:25,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:39:31,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:39:32,329 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=656660.0, ans=0.125 2023-09-30 08:39:33,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-30 08:39:33,434 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-30 08:39:35,031 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 08:39:35,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:39:35,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-30 08:39:36,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:39:38,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:39:38,087 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:39:39,480 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:39:39,481 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-30 08:39:39,550 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-30 08:39:39,555 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:39:39,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:39:46,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-30 08:39:46,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:39:48,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:39:49,015 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=656726.6666666666, ans=0.125 2023-09-30 08:39:49,068 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:39:50,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-30 08:39:51,775 INFO [train.py:1039] (2/4) Epoch 19, batch 2900, loss[loss=0.1848, simple_loss=0.2706, pruned_loss=0.04956, over 24653.00 frames. ], tot_loss[loss=0.1783, simple_loss=0.2528, pruned_loss=0.05192, over 4693113.86 frames. ], batch size: 73, lr: 5.42e-03, grad_scale: 16.0 2023-09-30 08:39:53,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:39:54,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-30 08:39:55,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-30 08:39:56,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:39:56,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-30 08:39:59,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:40:01,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:40:01,265 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=656793.3333333334, ans=0.0 2023-09-30 08:40:01,374 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=656793.3333333334, ans=0.2 2023-09-30 08:40:04,199 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:40:05,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:40:08,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-30 08:40:08,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-30 08:40:09,681 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.04 vs. limit=15.0 2023-09-30 08:40:10,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-30 08:40:11,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:40:12,088 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=656860.0, ans=0.125 2023-09-30 08:40:13,073 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.614e+02 1.848e+02 2.073e+02 2.444e+02 4.000e+02, threshold=4.146e+02, percent-clipped=0.0 2023-09-30 08:40:14,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-30 08:40:16,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-30 08:40:19,808 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:40:19,825 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-30 08:40:19,864 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:40:20,021 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=656860.0, ans=0.125 2023-09-30 08:40:23,493 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:40:23,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-30 08:40:27,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:40:29,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:40:32,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:40:35,436 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:40:37,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-30 08:40:37,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-30 08:40:37,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:40:41,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 08:40:44,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-30 08:40:46,268 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:40:51,496 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:41:00,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:41:01,846 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-30 08:41:01,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-30 08:41:05,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:41:05,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-30 08:41:05,174 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:41:05,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-30 08:41:12,554 INFO [train.py:1039] (2/4) Epoch 19, batch 2950, loss[loss=0.1756, simple_loss=0.2483, pruned_loss=0.0515, over 23468.00 frames. ], tot_loss[loss=0.178, simple_loss=0.2531, pruned_loss=0.05149, over 4703726.36 frames. ], batch size: 134, lr: 5.42e-03, grad_scale: 16.0 2023-09-30 08:41:12,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:41:15,737 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-30 08:41:17,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:41:17,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:41:18,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:41:20,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:41:21,934 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-30 08:41:23,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-30 08:41:23,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 08:41:23,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:41:30,879 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:41:33,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:41:35,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:41:35,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:41:39,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:41:41,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:41:41,413 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=657193.3333333334, ans=0.0 2023-09-30 08:41:42,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:41:44,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:41:44,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:41:48,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-30 08:41:53,471 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-30 08:41:53,515 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-30 08:41:53,737 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=657260.0, ans=0.2 2023-09-30 08:41:54,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:41:56,478 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-30 08:41:58,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-30 08:41:58,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:41:58,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:41:58,335 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-30 08:41:58,343 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-30 08:41:58,543 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=657260.0, ans=0.125 2023-09-30 08:42:01,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-30 08:42:03,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:42:03,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:42:05,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:42:07,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:42:07,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:42:07,193 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-30 08:42:07,254 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:42:07,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-30 08:42:12,648 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=657326.6666666666, ans=0.2 2023-09-30 08:42:15,925 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:42:17,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:42:18,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-30 08:42:18,944 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:42:20,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-30 08:42:22,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:42:23,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:42:25,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:42:26,722 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:42:26,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 08:42:27,202 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:42:28,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:42:29,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:42:29,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-30 08:42:29,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-30 08:42:31,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:42:31,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:42:32,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:42:34,277 INFO [train.py:1039] (2/4) Epoch 19, batch 3000, loss[loss=0.1715, simple_loss=0.2546, pruned_loss=0.04418, over 24421.00 frames. ], tot_loss[loss=0.179, simple_loss=0.2542, pruned_loss=0.05193, over 4700105.56 frames. ], batch size: 69, lr: 5.41e-03, grad_scale: 16.0 2023-09-30 08:42:34,278 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-30 08:42:48,932 INFO [train.py:1071] (2/4) Epoch 19, validation: loss=0.3515, simple_loss=0.275, pruned_loss=0.214, over 1125622.00 frames. 2023-09-30 08:42:48,933 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21076MB 2023-09-30 08:42:49,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-30 08:42:50,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:42:52,234 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:42:52,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-30 08:42:55,295 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-30 08:42:55,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-30 08:42:57,027 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:42:57,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:42:57,340 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=657460.0, ans=0.0 2023-09-30 08:42:58,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-30 08:42:59,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:43:01,567 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=657460.0, ans=0.1 2023-09-30 08:43:04,600 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=657526.6666666666, ans=0.125 2023-09-30 08:43:07,291 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 08:43:11,618 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.513e+02 1.829e+02 2.117e+02 2.474e+02 3.888e+02, threshold=4.234e+02, percent-clipped=0.0 2023-09-30 08:43:16,746 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:43:19,292 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=657526.6666666666, ans=0.0 2023-09-30 08:43:25,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-30 08:43:25,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:43:27,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 08:43:27,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:43:28,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:43:30,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:43:30,532 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-30 08:43:33,575 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-30 08:43:35,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:43:35,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 08:43:37,046 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:43:37,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:43:37,809 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.47 vs. limit=15.0 2023-09-30 08:43:38,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:43:38,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:43:43,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 08:43:43,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:43:43,773 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:43:45,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:43:48,989 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-30 08:43:50,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:43:50,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:43:51,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:43:55,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:43:55,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:43:59,205 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-30 08:43:59,258 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-30 08:43:59,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:43:59,365 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-30 08:44:00,737 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 08:44:02,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-30 08:44:04,007 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-30 08:44:04,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 08:44:04,273 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=657726.6666666666, ans=0.1 2023-09-30 08:44:05,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-30 08:44:05,686 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-30 08:44:05,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 08:44:07,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:44:07,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:44:07,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-30 08:44:07,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:44:09,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:44:12,053 INFO [train.py:1039] (2/4) Epoch 19, batch 3050, loss[loss=0.1857, simple_loss=0.2553, pruned_loss=0.05809, over 23854.00 frames. ], tot_loss[loss=0.1804, simple_loss=0.2554, pruned_loss=0.05269, over 4691197.15 frames. ], batch size: 195, lr: 5.41e-03, grad_scale: 16.0 2023-09-30 08:44:13,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-30 08:44:15,061 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:44:16,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:44:16,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:44:20,531 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:44:22,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-30 08:44:32,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-30 08:44:32,299 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-30 08:44:32,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:44:36,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:44:41,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:44:42,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:44:43,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:44:46,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:44:46,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-30 08:44:46,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:44:48,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:44:48,364 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:44:48,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:44:48,685 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=657926.6666666666, ans=0.1 2023-09-30 08:44:48,784 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=657926.6666666666, ans=0.0 2023-09-30 08:44:50,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:44:51,737 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=657926.6666666666, ans=0.1 2023-09-30 08:44:53,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:44:53,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-30 08:44:53,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:44:53,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 08:44:58,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:44:58,924 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 08:44:59,014 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:45:00,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:45:06,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:45:06,606 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:45:16,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:45:16,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:45:16,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:45:19,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:45:19,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 08:45:19,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:45:21,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-30 08:45:22,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:45:22,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:45:23,247 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=6.03 vs. limit=12.0 2023-09-30 08:45:24,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-30 08:45:27,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:45:33,007 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:45:33,300 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=658126.6666666666, ans=0.0 2023-09-30 08:45:34,283 INFO [train.py:1039] (2/4) Epoch 19, batch 3100, loss[loss=0.1931, simple_loss=0.2539, pruned_loss=0.06617, over 19587.00 frames. ], tot_loss[loss=0.1796, simple_loss=0.2544, pruned_loss=0.05243, over 4694207.70 frames. ], batch size: 388, lr: 5.41e-03, grad_scale: 16.0 2023-09-30 08:45:35,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:45:36,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 08:45:39,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-30 08:45:39,953 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=658126.6666666666, ans=0.0 2023-09-30 08:45:42,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-30 08:45:42,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-30 08:45:44,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:45:44,370 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=658126.6666666666, ans=0.1 2023-09-30 08:45:47,696 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:45:47,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:45:52,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-30 08:45:55,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:45:56,627 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.486e+02 1.813e+02 2.094e+02 2.454e+02 3.292e+02, threshold=4.189e+02, percent-clipped=0.0 2023-09-30 08:46:01,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-30 08:46:05,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 08:46:05,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:46:07,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:46:07,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:46:09,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-30 08:46:10,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:46:10,889 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-30 08:46:10,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:46:12,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:46:12,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-30 08:46:14,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:46:20,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:46:20,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-30 08:46:22,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-30 08:46:23,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:46:25,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:46:26,803 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:46:26,821 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:46:26,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:46:28,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-30 08:46:28,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:46:28,785 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=658326.6666666666, ans=0.0 2023-09-30 08:46:30,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:46:30,134 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:46:30,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:46:30,147 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 08:46:34,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:46:36,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-30 08:46:39,240 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=658393.3333333334, ans=0.125 2023-09-30 08:46:41,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:46:42,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-30 08:46:42,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:46:43,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:46:43,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-30 08:46:54,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-30 08:46:55,399 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=9.11 vs. limit=15.0 2023-09-30 08:46:56,197 INFO [train.py:1039] (2/4) Epoch 19, batch 3150, loss[loss=0.1626, simple_loss=0.2258, pruned_loss=0.04969, over 23947.00 frames. ], tot_loss[loss=0.1786, simple_loss=0.2533, pruned_loss=0.05196, over 4690301.17 frames. ], batch size: 195, lr: 5.41e-03, grad_scale: 16.0 2023-09-30 08:46:58,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:46:59,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:47:01,656 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:47:01,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:47:01,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-30 08:47:03,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:47:03,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-30 08:47:05,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-30 08:47:06,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:47:10,839 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-30 08:47:13,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-30 08:47:13,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:47:16,072 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-30 08:47:16,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-30 08:47:18,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-30 08:47:19,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-30 08:47:19,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-30 08:47:19,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:47:19,051 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:47:20,592 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:47:22,226 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-30 08:47:23,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:47:25,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:47:25,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:47:25,602 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=658526.6666666666, ans=0.025 2023-09-30 08:47:26,920 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-30 08:47:30,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-30 08:47:31,556 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:47:33,633 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-30 08:47:33,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:47:35,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-30 08:47:36,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-30 08:47:38,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:47:38,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 08:47:39,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 08:47:39,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:47:39,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:47:39,820 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=658593.3333333334, ans=0.1 2023-09-30 08:47:44,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-30 08:47:44,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-30 08:47:45,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-30 08:47:47,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 08:47:47,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:47:49,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:47:49,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:47:49,402 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-30 08:47:49,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:47:51,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-30 08:47:52,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:47:52,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-30 08:47:54,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-30 08:47:55,880 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:47:55,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:47:57,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-30 08:48:00,165 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 08:48:00,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:48:03,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:48:04,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:48:06,905 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:48:11,569 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:48:11,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:48:14,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-30 08:48:15,361 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.05 vs. limit=12.0 2023-09-30 08:48:20,483 INFO [train.py:1039] (2/4) Epoch 19, batch 3200, loss[loss=0.1628, simple_loss=0.2318, pruned_loss=0.0469, over 23713.00 frames. ], tot_loss[loss=0.1772, simple_loss=0.2519, pruned_loss=0.05125, over 4687988.90 frames. ], batch size: 232, lr: 5.41e-03, grad_scale: 32.0 2023-09-30 08:48:20,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:48:20,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-30 08:48:24,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:48:24,404 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:48:24,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-30 08:48:25,156 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.16 vs. limit=22.5 2023-09-30 08:48:27,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:48:32,108 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-30 08:48:34,097 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=658793.3333333334, ans=0.125 2023-09-30 08:48:35,328 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:48:43,537 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.911e+02 2.162e+02 2.460e+02 4.180e+02, threshold=4.324e+02, percent-clipped=0.0 2023-09-30 08:48:43,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:48:48,351 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=658860.0, ans=0.125 2023-09-30 08:48:51,345 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=658860.0, ans=0.1 2023-09-30 08:48:56,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-30 08:48:57,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:49:00,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-30 08:49:02,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 08:49:04,156 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=658926.6666666666, ans=0.125 2023-09-30 08:49:05,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:49:05,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 08:49:06,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:49:09,258 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.32 vs. limit=15.0 2023-09-30 08:49:11,507 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-30 08:49:13,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-30 08:49:13,514 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=658993.3333333334, ans=0.1 2023-09-30 08:49:14,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-30 08:49:16,462 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-30 08:49:20,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-30 08:49:26,946 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:49:26,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:49:27,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:49:28,536 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-30 08:49:28,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 08:49:33,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:49:35,305 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-30 08:49:36,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-30 08:49:36,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-30 08:49:38,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-30 08:49:41,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:49:42,867 INFO [train.py:1039] (2/4) Epoch 19, batch 3250, loss[loss=0.1713, simple_loss=0.2389, pruned_loss=0.0519, over 23793.00 frames. ], tot_loss[loss=0.1769, simple_loss=0.2519, pruned_loss=0.05092, over 4702252.61 frames. ], batch size: 164, lr: 5.41e-03, grad_scale: 32.0 2023-09-30 08:49:43,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-30 08:49:44,466 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-30 08:49:44,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:49:44,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:49:46,076 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-30 08:49:50,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 08:49:54,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:50:04,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:50:04,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-30 08:50:05,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:50:05,821 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:50:05,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:50:06,442 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.00 vs. limit=15.0 2023-09-30 08:50:07,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:50:07,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 08:50:10,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:50:10,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:50:11,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:50:12,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:50:12,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:50:12,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:50:15,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:50:17,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:50:18,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:50:18,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:50:20,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:50:20,436 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:50:20,452 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:50:22,107 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=659260.0, ans=10.0 2023-09-30 08:50:26,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-30 08:50:26,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:50:26,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:50:28,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:50:30,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-30 08:50:37,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:50:38,339 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.14 vs. limit=10.0 2023-09-30 08:50:45,718 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:50:47,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:50:47,188 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-30 08:50:47,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:50:47,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 08:50:47,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:50:48,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-30 08:50:50,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-30 08:50:50,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:50:51,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:50:53,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:50:53,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-30 08:50:53,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:50:58,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:50:58,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:51:01,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-30 08:51:01,455 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:51:03,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:51:03,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-30 08:51:05,722 INFO [train.py:1039] (2/4) Epoch 19, batch 3300, loss[loss=0.186, simple_loss=0.2527, pruned_loss=0.05967, over 23813.00 frames. ], tot_loss[loss=0.1775, simple_loss=0.2528, pruned_loss=0.05108, over 4712085.96 frames. ], batch size: 164, lr: 5.41e-03, grad_scale: 32.0 2023-09-30 08:51:07,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:51:07,404 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-30 08:51:07,887 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.44 vs. limit=12.0 2023-09-30 08:51:08,929 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-30 08:51:10,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-30 08:51:10,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:51:16,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:51:18,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:51:19,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:51:19,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 08:51:22,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 08:51:25,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:51:25,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:51:25,552 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=659526.6666666666, ans=0.0 2023-09-30 08:51:27,225 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=659526.6666666666, ans=0.1 2023-09-30 08:51:28,244 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.450e+02 1.763e+02 1.973e+02 2.210e+02 4.562e+02, threshold=3.946e+02, percent-clipped=1.0 2023-09-30 08:51:29,909 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-30 08:51:29,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:51:30,050 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:51:30,637 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.71 vs. limit=12.0 2023-09-30 08:51:32,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:51:32,903 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-30 08:51:35,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:51:37,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 08:51:37,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 08:51:37,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:51:38,686 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-30 08:51:43,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:51:43,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-30 08:51:44,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:51:44,784 WARNING [train.py:1197] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-30 08:51:46,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-30 08:51:46,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:51:47,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:51:50,090 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-30 08:51:51,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-30 08:51:51,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:51:55,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-30 08:51:55,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:51:59,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:52:01,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:52:02,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:52:02,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:52:02,874 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:52:02,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-30 08:52:05,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:52:06,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:52:06,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:52:07,759 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-30 08:52:09,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-30 08:52:11,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-30 08:52:11,547 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:52:11,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:52:11,820 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=659726.6666666666, ans=0.125 2023-09-30 08:52:13,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:52:13,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:52:13,486 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=659726.6666666666, ans=0.125 2023-09-30 08:52:15,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 08:52:16,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:52:16,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-30 08:52:16,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:52:17,189 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=659726.6666666666, ans=0.0 2023-09-30 08:52:19,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 08:52:22,442 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=659726.6666666666, ans=0.1 2023-09-30 08:52:23,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-30 08:52:23,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:52:25,004 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:52:27,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:52:28,529 INFO [train.py:1039] (2/4) Epoch 19, batch 3350, loss[loss=0.1742, simple_loss=0.2612, pruned_loss=0.0436, over 24438.00 frames. ], tot_loss[loss=0.1784, simple_loss=0.2537, pruned_loss=0.05158, over 4717118.54 frames. ], batch size: 69, lr: 5.40e-03, grad_scale: 32.0 2023-09-30 08:52:28,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:52:28,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:52:30,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:52:30,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:52:33,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:52:33,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:52:34,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:52:36,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:52:37,071 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=659793.3333333334, ans=0.0 2023-09-30 08:52:38,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:52:39,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:52:41,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:52:43,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-30 08:52:44,935 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-30 08:52:46,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:52:48,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-30 08:52:48,656 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-30 08:52:50,197 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:52:50,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:52:51,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:52:51,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-30 08:52:54,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:52:54,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:52:56,004 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=659860.0, ans=0.1 2023-09-30 08:52:57,068 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:52:58,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:52:58,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:53:00,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:53:01,849 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_positive, batch_count=659926.6666666666, ans=0.05 2023-09-30 08:53:03,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:53:04,095 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=659926.6666666666, ans=0.2 2023-09-30 08:53:06,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:53:07,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:53:11,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:53:12,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:53:14,613 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:53:14,752 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=659926.6666666666, ans=0.1 2023-09-30 08:53:16,004 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:53:18,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:53:21,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-30 08:53:21,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 08:53:21,587 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-30 08:53:21,636 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:53:23,170 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-30 08:53:24,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:53:28,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:53:34,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:53:34,949 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-30 08:53:35,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 08:53:37,898 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:53:39,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:53:44,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:53:46,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-30 08:53:46,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 08:53:47,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:53:49,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:53:49,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-30 08:53:49,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:53:49,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-30 08:53:51,445 INFO [train.py:1039] (2/4) Epoch 19, batch 3400, loss[loss=0.2012, simple_loss=0.2818, pruned_loss=0.06024, over 23960.00 frames. ], tot_loss[loss=0.1788, simple_loss=0.2546, pruned_loss=0.05152, over 4728017.98 frames. ], batch size: 86, lr: 5.40e-03, grad_scale: 32.0 2023-09-30 08:53:51,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:53:51,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:53:53,132 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-30 08:53:54,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:53:54,673 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-30 08:53:58,339 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.09 vs. limit=6.0 2023-09-30 08:54:01,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-30 08:54:01,305 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-30 08:54:01,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:54:06,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:54:06,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 08:54:06,488 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:54:07,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-30 08:54:14,448 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.902e+02 2.101e+02 2.445e+02 3.700e+02, threshold=4.201e+02, percent-clipped=0.0 2023-09-30 08:54:14,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:54:16,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-30 08:54:20,763 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:54:23,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:54:23,718 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:54:23,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-30 08:54:27,791 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=660260.0, ans=0.2 2023-09-30 08:54:31,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:54:37,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-30 08:54:42,321 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:54:42,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:54:44,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-30 08:54:44,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:54:45,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:54:46,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:54:48,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:54:50,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:54:53,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 08:54:53,886 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:55:00,595 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:55:03,528 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-30 08:55:08,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 08:55:11,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-30 08:55:13,192 INFO [train.py:1039] (2/4) Epoch 19, batch 3450, loss[loss=0.1658, simple_loss=0.2527, pruned_loss=0.03945, over 24618.00 frames. ], tot_loss[loss=0.1784, simple_loss=0.2541, pruned_loss=0.05137, over 4719431.52 frames. ], batch size: 68, lr: 5.40e-03, grad_scale: 32.0 2023-09-30 08:55:13,639 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=660460.0, ans=0.1 2023-09-30 08:55:16,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-30 08:55:18,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:55:19,937 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:55:19,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-30 08:55:21,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:55:25,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-30 08:55:28,568 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=660526.6666666666, ans=0.2 2023-09-30 08:55:29,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:55:31,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:55:31,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-30 08:55:31,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:55:35,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:55:41,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-30 08:55:43,792 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.85 vs. limit=10.0 2023-09-30 08:55:48,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-30 08:55:48,160 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 08:55:48,228 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:55:51,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:55:53,683 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:55:56,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-30 08:55:56,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:55:57,233 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=660593.3333333334, ans=0.1 2023-09-30 08:55:58,827 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=660593.3333333334, ans=0.0 2023-09-30 08:56:01,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:56:01,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:56:02,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-30 08:56:04,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:56:06,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-30 08:56:06,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:56:08,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:56:11,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:56:14,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-30 08:56:18,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:56:22,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:56:24,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:56:28,360 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:56:29,064 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.50 vs. limit=10.0 2023-09-30 08:56:32,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:56:32,921 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:56:34,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:56:34,426 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:56:35,710 INFO [train.py:1039] (2/4) Epoch 19, batch 3500, loss[loss=0.1757, simple_loss=0.237, pruned_loss=0.05718, over 23633.00 frames. ], tot_loss[loss=0.1778, simple_loss=0.2525, pruned_loss=0.05155, over 4694863.03 frames. ], batch size: 256, lr: 5.40e-03, grad_scale: 32.0 2023-09-30 08:56:37,657 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=660793.3333333334, ans=0.125 2023-09-30 08:56:38,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:56:42,700 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:56:42,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-30 08:56:45,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 08:56:46,098 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=660793.3333333334, ans=0.2 2023-09-30 08:56:47,635 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=660793.3333333334, ans=0.035 2023-09-30 08:56:48,923 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-30 08:56:51,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:56:53,256 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-30 08:56:56,986 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:56:58,822 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.797e+02 1.956e+02 2.209e+02 3.007e+02, threshold=3.913e+02, percent-clipped=0.0 2023-09-30 08:56:59,030 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:56:59,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:57:01,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:57:01,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-30 08:57:01,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:57:01,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:57:01,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-30 08:57:04,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:57:05,250 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.15 vs. limit=15.0 2023-09-30 08:57:05,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-30 08:57:07,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:57:12,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:57:12,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-30 08:57:13,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:57:15,175 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:57:17,457 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=660926.6666666666, ans=0.125 2023-09-30 08:57:18,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:57:20,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:57:21,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:57:21,767 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:57:23,353 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-30 08:57:23,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-30 08:57:25,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-30 08:57:25,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:57:26,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:57:28,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:57:28,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:57:28,975 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=660993.3333333334, ans=0.1 2023-09-30 08:57:31,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 08:57:33,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:57:34,951 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=660993.3333333334, ans=0.1 2023-09-30 08:57:37,875 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:57:39,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-30 08:57:40,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-30 08:57:40,823 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:57:42,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:57:42,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:57:45,633 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:57:49,295 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-30 08:57:50,734 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:57:52,331 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:57:53,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-30 08:57:56,686 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-30 08:57:56,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:57:58,252 INFO [train.py:1039] (2/4) Epoch 19, batch 3550, loss[loss=0.175, simple_loss=0.2427, pruned_loss=0.0537, over 23401.00 frames. ], tot_loss[loss=0.1773, simple_loss=0.2518, pruned_loss=0.05139, over 4705963.05 frames. ], batch size: 285, lr: 5.40e-03, grad_scale: 32.0 2023-09-30 08:57:58,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:57:58,491 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:58:00,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:58:00,242 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=661126.6666666666, ans=0.0 2023-09-30 08:58:03,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:58:14,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:58:15,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 08:58:20,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:58:20,446 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:58:22,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:58:22,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:58:23,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 08:58:25,681 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:58:27,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:58:27,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:58:27,171 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-30 08:58:27,409 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=661193.3333333334, ans=0.0 2023-09-30 08:58:28,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 08:58:34,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:58:34,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:58:36,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:58:36,386 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:58:38,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:58:38,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-30 08:58:38,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:58:40,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:58:41,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 08:58:42,446 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.37 vs. limit=12.0 2023-09-30 08:58:48,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:58:49,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:58:49,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:58:50,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-30 08:58:52,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:58:52,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-30 08:58:52,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:58:55,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:58:55,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:58:59,344 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=661326.6666666666, ans=0.95 2023-09-30 08:59:00,431 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-30 08:59:00,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:59:06,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:59:06,990 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-30 08:59:07,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:59:11,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:59:12,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-30 08:59:21,006 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-30 08:59:21,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:59:21,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:59:22,680 INFO [train.py:1039] (2/4) Epoch 19, batch 3600, loss[loss=0.1696, simple_loss=0.2536, pruned_loss=0.04277, over 24484.00 frames. ], tot_loss[loss=0.1774, simple_loss=0.2518, pruned_loss=0.05149, over 4688000.93 frames. ], batch size: 66, lr: 5.40e-03, grad_scale: 32.0 2023-09-30 08:59:22,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:59:24,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:59:24,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:59:28,284 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.83 vs. limit=22.5 2023-09-30 08:59:29,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:59:31,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:59:32,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:59:32,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:59:34,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:59:34,083 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-30 08:59:35,804 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 08:59:37,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:59:37,532 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=661526.6666666666, ans=0.0 2023-09-30 08:59:39,473 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.21 vs. limit=6.0 2023-09-30 08:59:41,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:59:43,967 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:59:45,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 08:59:45,584 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:59:45,613 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-30 08:59:46,963 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.820e+02 2.003e+02 2.240e+02 3.370e+02, threshold=4.007e+02, percent-clipped=0.0 2023-09-30 08:59:47,155 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:59:50,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:59:50,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:59:53,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:59:55,482 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:59:56,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:59:57,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-30 09:00:02,682 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=661593.3333333334, ans=0.025 2023-09-30 09:00:05,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:00:06,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 09:00:06,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-30 09:00:12,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:00:16,692 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=661660.0, ans=0.0 2023-09-30 09:00:19,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:00:22,905 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:00:24,752 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=661660.0, ans=0.0 2023-09-30 09:00:27,735 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=661726.6666666666, ans=0.05 2023-09-30 09:00:28,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-30 09:00:28,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:00:28,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-30 09:00:31,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-30 09:00:32,752 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-30 09:00:34,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:00:34,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:00:35,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-30 09:00:37,927 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:00:37,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:00:37,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:00:38,949 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.80 vs. limit=15.0 2023-09-30 09:00:39,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-30 09:00:39,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-30 09:00:44,168 INFO [train.py:1039] (2/4) Epoch 19, batch 3650, loss[loss=0.1628, simple_loss=0.2392, pruned_loss=0.04325, over 19252.00 frames. ], tot_loss[loss=0.1786, simple_loss=0.2532, pruned_loss=0.05194, over 4690522.82 frames. ], batch size: 42, lr: 5.40e-03, grad_scale: 32.0 2023-09-30 09:00:44,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:00:45,725 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-30 09:00:49,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-30 09:00:52,610 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:00:57,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-30 09:00:59,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-30 09:01:02,377 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:01:02,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:01:02,653 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=661860.0, ans=0.2 2023-09-30 09:01:03,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:01:05,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-30 09:01:05,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:01:07,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-30 09:01:07,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:01:07,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:01:09,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-30 09:01:11,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 09:01:12,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:01:12,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:01:14,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:01:17,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-30 09:01:19,119 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-30 09:01:20,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:01:22,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-30 09:01:23,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:01:25,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:01:29,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 09:01:32,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:01:32,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-30 09:01:34,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-30 09:01:34,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:01:35,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:01:40,377 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:01:42,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:01:42,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:01:44,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 09:01:46,087 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:01:46,177 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:01:50,990 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-30 09:01:55,398 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:01:55,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:01:56,960 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-30 09:01:58,959 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:02:00,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-30 09:02:00,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:02:03,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-30 09:02:03,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:02:05,938 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.55 vs. limit=15.0 2023-09-30 09:02:06,514 INFO [train.py:1039] (2/4) Epoch 19, batch 3700, loss[loss=0.1692, simple_loss=0.2541, pruned_loss=0.04208, over 24640.00 frames. ], tot_loss[loss=0.1783, simple_loss=0.2539, pruned_loss=0.05139, over 4721294.61 frames. ], batch size: 73, lr: 5.39e-03, grad_scale: 32.0 2023-09-30 09:02:06,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 09:02:10,208 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:02:10,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:02:10,546 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=662126.6666666666, ans=0.2 2023-09-30 09:02:13,343 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:02:13,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-30 09:02:13,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:02:14,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 09:02:14,822 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 09:02:16,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 09:02:22,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:02:22,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:02:22,380 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=662193.3333333334, ans=0.125 2023-09-30 09:02:22,477 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=662193.3333333334, ans=0.125 2023-09-30 09:02:23,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:02:23,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:02:25,152 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 09:02:28,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:02:28,296 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-30 09:02:31,268 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.504e+02 1.890e+02 2.038e+02 2.335e+02 3.154e+02, threshold=4.075e+02, percent-clipped=0.0 2023-09-30 09:02:32,122 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.35 vs. limit=12.0 2023-09-30 09:02:32,553 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=6.17 vs. limit=10.0 2023-09-30 09:02:38,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:02:38,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 09:02:39,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:02:39,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-30 09:02:39,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:02:43,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:02:45,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-30 09:02:46,530 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:02:48,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:02:51,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:02:51,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 09:02:55,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 09:02:58,666 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:02:58,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-30 09:03:00,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:03:00,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-30 09:03:05,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:03:06,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:03:08,156 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=662326.6666666666, ans=0.0 2023-09-30 09:03:09,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:03:10,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-30 09:03:11,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:03:11,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-30 09:03:11,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:03:13,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:03:18,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:03:18,544 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=662393.3333333334, ans=0.125 2023-09-30 09:03:19,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-30 09:03:21,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-30 09:03:22,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:03:22,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:03:24,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:03:24,753 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=662393.3333333334, ans=0.125 2023-09-30 09:03:25,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:03:29,193 INFO [train.py:1039] (2/4) Epoch 19, batch 3750, loss[loss=0.1728, simple_loss=0.2627, pruned_loss=0.04144, over 24446.00 frames. ], tot_loss[loss=0.1793, simple_loss=0.2549, pruned_loss=0.05182, over 4719724.22 frames. ], batch size: 69, lr: 5.39e-03, grad_scale: 32.0 2023-09-30 09:03:29,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:03:31,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:03:32,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:03:32,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-30 09:03:34,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 09:03:36,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-30 09:03:37,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-30 09:03:39,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:03:40,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:03:40,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:03:42,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:03:42,539 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=662460.0, ans=0.2 2023-09-30 09:03:47,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:03:51,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:03:52,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 09:03:52,797 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=662526.6666666666, ans=0.1 2023-09-30 09:03:55,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:03:58,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:03:58,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-30 09:04:00,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:04:00,438 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:04:01,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:04:01,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:04:05,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-30 09:04:09,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-30 09:04:10,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:04:10,257 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=662593.3333333334, ans=0.95 2023-09-30 09:04:11,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:04:13,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:04:16,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:04:18,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-30 09:04:21,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-30 09:04:22,977 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:04:25,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:04:28,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:04:28,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:04:30,780 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=662660.0, ans=0.2 2023-09-30 09:04:33,296 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 09:04:35,872 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=662726.6666666666, ans=0.125 2023-09-30 09:04:37,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 09:04:39,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-30 09:04:42,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:04:42,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:04:44,709 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.32 vs. limit=10.0 2023-09-30 09:04:45,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-30 09:04:45,821 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer_na.min_abs, batch_count=662726.6666666666, ans=0.02 2023-09-30 09:04:51,950 INFO [train.py:1039] (2/4) Epoch 19, batch 3800, loss[loss=0.182, simple_loss=0.2689, pruned_loss=0.04754, over 24677.00 frames. ], tot_loss[loss=0.1801, simple_loss=0.2555, pruned_loss=0.0524, over 4710231.13 frames. ], batch size: 73, lr: 5.39e-03, grad_scale: 16.0 2023-09-30 09:04:55,135 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:04:55,482 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=662793.3333333334, ans=0.2 2023-09-30 09:05:01,077 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.92 vs. limit=22.5 2023-09-30 09:05:01,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:05:01,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 09:05:03,263 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-30 09:05:04,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:05:04,962 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:05:06,481 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-30 09:05:08,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 09:05:08,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:05:10,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:05:11,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:05:13,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:05:13,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:05:15,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-30 09:05:18,091 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.822e+02 1.936e+02 2.155e+02 2.834e+02, threshold=3.873e+02, percent-clipped=0.0 2023-09-30 09:05:19,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-30 09:05:21,341 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:05:23,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:05:24,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:05:26,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:05:28,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-30 09:05:28,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:05:30,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:05:31,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:05:36,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 09:05:36,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-30 09:05:39,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:05:40,160 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.81 vs. limit=22.5 2023-09-30 09:05:41,439 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=662993.3333333334, ans=0.125 2023-09-30 09:05:42,998 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=662993.3333333334, ans=0.0 2023-09-30 09:05:46,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:05:53,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:05:56,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-30 09:05:56,700 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=663060.0, ans=0.0 2023-09-30 09:05:57,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-30 09:05:59,275 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:05:59,541 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=663060.0, ans=0.125 2023-09-30 09:06:00,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:06:00,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:06:01,745 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.61 vs. limit=15.0 2023-09-30 09:06:04,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-30 09:06:07,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-30 09:06:07,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-30 09:06:07,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:06:09,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:06:13,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:06:14,484 INFO [train.py:1039] (2/4) Epoch 19, batch 3850, loss[loss=0.1773, simple_loss=0.2363, pruned_loss=0.05912, over 23470.00 frames. ], tot_loss[loss=0.1793, simple_loss=0.2545, pruned_loss=0.05201, over 4705935.00 frames. ], batch size: 285, lr: 5.39e-03, grad_scale: 16.0 2023-09-30 09:06:14,645 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 09:06:19,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:06:21,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-30 09:06:21,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:06:23,413 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:06:26,513 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 09:06:26,858 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=663126.6666666666, ans=0.1 2023-09-30 09:06:28,230 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:06:31,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-30 09:06:32,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-30 09:06:34,626 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=663193.3333333334, ans=0.0 2023-09-30 09:06:39,602 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:06:41,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:06:43,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:06:44,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:06:46,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:06:47,925 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:06:50,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:06:50,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:06:50,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:06:50,513 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=663260.0, ans=0.125 2023-09-30 09:06:51,838 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=663260.0, ans=0.125 2023-09-30 09:06:53,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:06:53,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:06:54,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:06:54,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-30 09:06:54,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-30 09:06:56,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:06:56,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:06:59,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:07:01,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:07:02,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-30 09:07:05,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-30 09:07:07,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:07:07,667 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=663326.6666666666, ans=0.1 2023-09-30 09:07:08,847 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-30 09:07:12,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-30 09:07:12,653 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=663326.6666666666, ans=0.125 2023-09-30 09:07:15,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:07:17,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:07:22,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:07:22,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-30 09:07:25,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-30 09:07:27,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:07:28,195 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.61 vs. limit=15.0 2023-09-30 09:07:29,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:07:31,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 09:07:31,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 09:07:31,643 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:07:33,198 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:07:33,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:07:33,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-30 09:07:34,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:07:36,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-30 09:07:36,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:07:36,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:07:37,755 INFO [train.py:1039] (2/4) Epoch 19, batch 3900, loss[loss=0.1657, simple_loss=0.2317, pruned_loss=0.04982, over 22726.00 frames. ], tot_loss[loss=0.1781, simple_loss=0.253, pruned_loss=0.05162, over 4700787.54 frames. ], batch size: 322, lr: 5.39e-03, grad_scale: 16.0 2023-09-30 09:07:37,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:07:39,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:07:40,893 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:07:41,184 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=663460.0, ans=0.125 2023-09-30 09:07:42,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:07:42,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:07:43,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:07:43,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-30 09:07:43,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:07:47,707 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:07:49,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 09:07:51,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:07:52,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:07:54,697 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=663526.6666666666, ans=0.0 2023-09-30 09:07:55,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 09:07:55,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:07:57,368 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-30 09:07:59,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-30 09:07:59,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:08:02,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-30 09:08:02,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:08:02,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-30 09:08:03,990 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.880e+02 2.094e+02 2.304e+02 3.533e+02, threshold=4.187e+02, percent-clipped=0.0 2023-09-30 09:08:04,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-30 09:08:10,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:08:10,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:08:12,368 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:08:12,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:08:17,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:08:18,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:08:22,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:08:22,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:08:23,304 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.71 vs. limit=6.0 2023-09-30 09:08:23,932 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:08:29,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:08:29,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:08:36,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 09:08:37,761 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:08:44,856 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=663726.6666666666, ans=0.0 2023-09-30 09:08:47,871 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=663726.6666666666, ans=0.0 2023-09-30 09:08:49,246 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:08:53,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:08:53,727 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-30 09:08:53,798 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-30 09:08:53,818 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:08:55,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-30 09:08:57,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:08:58,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-30 09:09:00,551 INFO [train.py:1039] (2/4) Epoch 19, batch 3950, loss[loss=0.1835, simple_loss=0.2607, pruned_loss=0.05318, over 23627.00 frames. ], tot_loss[loss=0.1768, simple_loss=0.2524, pruned_loss=0.05062, over 4711367.65 frames. ], batch size: 85, lr: 5.39e-03, grad_scale: 16.0 2023-09-30 09:09:03,081 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:09:04,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:09:05,932 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-30 09:09:06,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:09:08,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:09:09,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:09:15,965 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-30 09:09:17,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:09:17,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-30 09:09:19,447 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-30 09:09:19,487 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:09:21,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:09:22,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:09:22,666 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:09:24,368 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-30 09:09:27,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:09:27,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:09:27,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:09:27,948 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=663860.0, ans=0.0 2023-09-30 09:09:29,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:09:29,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:09:39,922 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=663926.6666666666, ans=0.1 2023-09-30 09:09:43,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:09:43,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:09:43,841 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=663926.6666666666, ans=0.035 2023-09-30 09:09:47,427 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.38 vs. limit=15.0 2023-09-30 09:09:51,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-30 09:09:58,377 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-30 09:09:58,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-30 09:09:58,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:10:00,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:10:06,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-30 09:10:06,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-30 09:10:07,563 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten.whitening_limit, batch_count=664060.0, ans=15.0 2023-09-30 09:10:08,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:10:08,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:10:08,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-30 09:10:08,713 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=664060.0, ans=0.2 2023-09-30 09:10:14,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:10:16,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:10:19,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-30 09:10:24,394 INFO [train.py:1039] (2/4) Epoch 19, batch 4000, loss[loss=0.1965, simple_loss=0.2593, pruned_loss=0.06683, over 23860.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.2524, pruned_loss=0.05052, over 4727563.22 frames. ], batch size: 212, lr: 5.39e-03, grad_scale: 32.0 2023-09-30 09:10:24,726 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=664126.6666666666, ans=0.125 2023-09-30 09:10:31,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:10:35,009 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.70 vs. limit=22.5 2023-09-30 09:10:36,013 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=664126.6666666666, ans=0.125 2023-09-30 09:10:38,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:10:42,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:10:42,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:10:44,142 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:10:44,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-30 09:10:45,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-30 09:10:45,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-30 09:10:45,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:10:45,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-30 09:10:48,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:10:51,342 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.842e+02 2.148e+02 2.341e+02 3.331e+02, threshold=4.296e+02, percent-clipped=0.0 2023-09-30 09:10:53,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:10:53,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:10:53,129 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:10:53,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:10:53,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-30 09:10:56,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:10:57,695 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-30 09:10:57,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:10:59,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:11:02,392 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-30 09:11:03,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 09:11:03,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:11:09,328 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-30 09:11:09,389 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:11:11,174 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=664260.0, ans=0.125 2023-09-30 09:11:12,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:11:14,352 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-30 09:11:14,955 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.41 vs. limit=15.0 2023-09-30 09:11:15,888 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:11:17,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-30 09:11:17,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:11:18,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:11:18,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:11:20,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:11:20,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-30 09:11:20,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:11:24,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-30 09:11:24,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:11:27,576 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-30 09:11:29,660 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=664393.3333333334, ans=0.125 2023-09-30 09:11:30,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:11:32,759 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=664393.3333333334, ans=0.125 2023-09-30 09:11:34,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 09:11:37,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:11:37,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:11:37,383 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=664393.3333333334, ans=0.2 2023-09-30 09:11:38,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:11:40,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:11:45,261 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:11:46,664 INFO [train.py:1039] (2/4) Epoch 19, batch 4050, loss[loss=0.1743, simple_loss=0.261, pruned_loss=0.04375, over 24551.00 frames. ], tot_loss[loss=0.1778, simple_loss=0.2533, pruned_loss=0.05119, over 4710507.80 frames. ], batch size: 71, lr: 5.39e-03, grad_scale: 32.0 2023-09-30 09:11:48,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-30 09:11:50,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-30 09:11:51,940 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:11:51,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:11:53,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:11:55,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-30 09:11:56,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:12:00,266 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.min_positive, batch_count=664460.0, ans=0.025 2023-09-30 09:12:02,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:12:03,750 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:12:05,175 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 09:12:06,196 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.26 vs. limit=10.0 2023-09-30 09:12:06,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:12:06,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:12:12,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:12:14,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-30 09:12:17,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 09:12:19,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-30 09:12:19,418 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-30 09:12:22,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:12:27,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-30 09:12:27,837 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=664593.3333333334, ans=0.0 2023-09-30 09:12:29,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:12:30,539 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2.whitening_limit, batch_count=664593.3333333334, ans=15.0 2023-09-30 09:12:32,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:12:38,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:12:38,183 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:12:38,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:12:42,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:12:43,449 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.56 vs. limit=10.0 2023-09-30 09:12:44,891 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.01 vs. limit=15.0 2023-09-30 09:12:45,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-30 09:12:47,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 09:12:47,581 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:12:49,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-30 09:12:49,496 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=664660.0, ans=0.0 2023-09-30 09:12:55,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:13:02,168 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=664726.6666666666, ans=0.0 2023-09-30 09:13:03,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-30 09:13:05,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:13:05,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:13:07,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-30 09:13:07,718 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-30 09:13:07,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:13:08,489 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.42 vs. limit=10.0 2023-09-30 09:13:08,700 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.27 vs. limit=15.0 2023-09-30 09:13:09,185 INFO [train.py:1039] (2/4) Epoch 19, batch 4100, loss[loss=0.1866, simple_loss=0.2489, pruned_loss=0.06218, over 23606.00 frames. ], tot_loss[loss=0.1787, simple_loss=0.254, pruned_loss=0.0517, over 4715905.84 frames. ], batch size: 256, lr: 5.38e-03, grad_scale: 16.0 2023-09-30 09:13:09,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:13:11,370 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:13:11,394 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:13:16,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-30 09:13:16,448 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-30 09:13:19,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-30 09:13:20,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-30 09:13:20,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:13:21,040 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:13:22,417 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:13:22,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:13:22,570 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-30 09:13:27,631 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:13:29,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:13:29,148 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:13:29,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:13:33,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 09:13:33,990 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:13:34,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:13:34,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-30 09:13:36,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:13:36,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:13:36,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:13:36,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:13:38,155 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.904e+02 2.164e+02 2.650e+02 3.755e+02, threshold=4.328e+02, percent-clipped=0.0 2023-09-30 09:13:38,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-30 09:13:41,368 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:13:43,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-30 09:13:45,162 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:13:48,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:13:48,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-30 09:13:48,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:13:50,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:13:50,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:13:51,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-30 09:13:51,949 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=664926.6666666666, ans=0.07 2023-09-30 09:13:53,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:13:54,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:13:57,732 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-30 09:13:57,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:13:57,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-30 09:14:01,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:14:07,774 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:14:10,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:14:12,276 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:14:19,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:14:19,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:14:22,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:14:22,211 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=665060.0, ans=0.2 2023-09-30 09:14:23,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:14:28,217 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-30 09:14:29,671 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 09:14:31,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:14:31,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:14:33,073 INFO [train.py:1039] (2/4) Epoch 19, batch 4150, loss[loss=0.1517, simple_loss=0.2342, pruned_loss=0.03466, over 24362.00 frames. ], tot_loss[loss=0.179, simple_loss=0.2542, pruned_loss=0.0519, over 4721073.27 frames. ], batch size: 61, lr: 5.38e-03, grad_scale: 16.0 2023-09-30 09:14:34,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-30 09:14:34,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:14:36,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-30 09:14:37,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-30 09:14:37,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-30 09:14:39,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:14:43,687 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=665126.6666666666, ans=0.125 2023-09-30 09:14:44,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:14:45,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:14:50,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:14:52,305 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:14:52,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-30 09:14:54,177 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=665193.3333333334, ans=0.125 2023-09-30 09:14:55,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 09:14:55,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:14:56,879 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-30 09:15:01,689 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=1.044e-02 2023-09-30 09:15:02,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:15:08,130 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:15:08,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-30 09:15:11,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-30 09:15:11,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:15:12,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-30 09:15:12,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:15:12,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:15:15,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:15:15,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:15:16,240 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=665260.0, ans=0.1 2023-09-30 09:15:22,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-30 09:15:25,996 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-30 09:15:28,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:15:28,271 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-30 09:15:29,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:15:31,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-30 09:15:34,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:15:34,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:15:35,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:15:37,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-30 09:15:37,523 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:15:38,891 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-30 09:15:39,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 09:15:41,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-30 09:15:42,717 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:15:42,724 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 09:15:42,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 09:15:44,235 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-30 09:15:44,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:15:45,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 09:15:45,820 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:15:48,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:15:50,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-30 09:15:50,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-30 09:15:55,462 INFO [train.py:1039] (2/4) Epoch 19, batch 4200, loss[loss=0.1922, simple_loss=0.2679, pruned_loss=0.0582, over 24326.00 frames. ], tot_loss[loss=0.1786, simple_loss=0.254, pruned_loss=0.05159, over 4728324.54 frames. ], batch size: 77, lr: 5.38e-03, grad_scale: 16.0 2023-09-30 09:15:55,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:15:55,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-30 09:15:58,682 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 09:16:00,914 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:16:02,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:16:02,440 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:16:02,442 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:16:05,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-30 09:16:08,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-30 09:16:08,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:16:11,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:16:15,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:16:17,241 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=665526.6666666666, ans=0.125 2023-09-30 09:16:18,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-30 09:16:19,940 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:16:19,990 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:16:20,260 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=665526.6666666666, ans=0.07 2023-09-30 09:16:21,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-30 09:16:21,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:16:22,941 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.443e+02 1.959e+02 2.201e+02 2.604e+02 4.093e+02, threshold=4.401e+02, percent-clipped=0.0 2023-09-30 09:16:23,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:16:23,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:16:23,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:16:25,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:16:27,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-30 09:16:27,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:16:32,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-30 09:16:34,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:16:37,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:16:38,140 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.55 vs. limit=15.0 2023-09-30 09:16:38,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:16:40,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:16:40,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-30 09:16:40,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:16:42,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:16:48,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-30 09:16:49,640 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:16:55,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-30 09:16:58,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-30 09:17:01,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:17:07,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 09:17:08,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:17:10,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-30 09:17:16,368 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-30 09:17:17,658 INFO [train.py:1039] (2/4) Epoch 19, batch 4250, loss[loss=0.1764, simple_loss=0.2464, pruned_loss=0.05321, over 23338.00 frames. ], tot_loss[loss=0.178, simple_loss=0.2529, pruned_loss=0.05157, over 4722253.30 frames. ], batch size: 119, lr: 5.38e-03, grad_scale: 16.0 2023-09-30 09:17:19,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:17:19,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-30 09:17:21,465 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=665793.3333333334, ans=0.125 2023-09-30 09:17:22,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:17:27,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-30 09:17:29,376 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-30 09:17:29,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:17:32,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:17:36,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:17:40,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:17:40,572 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:17:43,004 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.whiten.whitening_limit, batch_count=665860.0, ans=15.0 2023-09-30 09:17:43,578 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:17:43,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:17:46,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:17:46,899 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=665860.0, ans=0.0 2023-09-30 09:17:48,235 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:17:49,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:17:50,075 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=665926.6666666666, ans=0.5 2023-09-30 09:17:51,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:17:51,805 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=665926.6666666666, ans=0.1 2023-09-30 09:17:52,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:17:54,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-30 09:17:58,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-30 09:17:58,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:17:58,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:17:59,002 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:18:00,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:18:00,568 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:18:01,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:18:05,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-30 09:18:07,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:18:07,748 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:18:11,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:18:13,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:18:13,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-30 09:18:14,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:18:14,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-30 09:18:16,914 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-30 09:18:19,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-30 09:18:21,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:18:21,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:18:21,672 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=665993.3333333334, ans=0.05 2023-09-30 09:18:22,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-30 09:18:24,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 09:18:24,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-30 09:18:27,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:18:28,488 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.19 vs. limit=15.0 2023-09-30 09:18:30,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:18:33,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:18:35,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:18:35,459 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:18:37,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:18:38,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:18:38,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-30 09:18:41,237 INFO [train.py:1039] (2/4) Epoch 19, batch 4300, loss[loss=0.1568, simple_loss=0.2349, pruned_loss=0.03933, over 24558.00 frames. ], tot_loss[loss=0.177, simple_loss=0.2517, pruned_loss=0.05118, over 4720829.14 frames. ], batch size: 60, lr: 5.38e-03, grad_scale: 16.0 2023-09-30 09:18:41,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:18:45,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:18:46,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:18:51,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:18:58,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:18:58,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-30 09:18:59,622 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:19:01,296 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:19:01,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:19:01,355 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-30 09:19:04,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 09:19:07,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 09:19:08,727 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.820e+02 2.101e+02 2.491e+02 4.654e+02, threshold=4.202e+02, percent-clipped=1.0 2023-09-30 09:19:09,053 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-30 09:19:09,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:19:09,108 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-30 09:19:12,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 09:19:14,362 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:19:17,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:19:19,055 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:19:19,210 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:19:20,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:19:22,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:19:24,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-30 09:19:24,321 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-30 09:19:27,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:19:30,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:19:30,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 09:19:30,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:19:30,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:19:31,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-30 09:19:31,758 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-30 09:19:31,858 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-30 09:19:31,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:19:32,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-30 09:19:32,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-30 09:19:36,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:19:38,165 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-30 09:19:39,823 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=666326.6666666666, ans=0.2 2023-09-30 09:19:40,960 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:19:42,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:19:42,581 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:19:46,229 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-30 09:19:46,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 09:19:46,347 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:19:47,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:19:47,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:19:49,278 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:19:52,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:19:53,028 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=666393.3333333334, ans=0.035 2023-09-30 09:19:55,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:19:57,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:19:57,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:20:02,594 INFO [train.py:1039] (2/4) Epoch 19, batch 4350, loss[loss=0.1816, simple_loss=0.2483, pruned_loss=0.05746, over 23333.00 frames. ], tot_loss[loss=0.1778, simple_loss=0.2527, pruned_loss=0.05148, over 4715377.37 frames. ], batch size: 119, lr: 5.38e-03, grad_scale: 16.0 2023-09-30 09:20:02,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-30 09:20:02,836 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-30 09:20:10,162 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:20:12,477 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.83 vs. limit=6.0 2023-09-30 09:20:13,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:20:15,869 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.08 vs. limit=15.0 2023-09-30 09:20:16,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:20:16,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:20:20,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:20:24,704 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:20:28,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:20:28,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:20:30,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:20:33,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:20:35,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-30 09:20:36,973 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=666593.3333333334, ans=0.125 2023-09-30 09:20:39,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-30 09:20:41,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:20:42,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:20:43,725 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.97 vs. limit=15.0 2023-09-30 09:20:47,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:20:50,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-30 09:20:55,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:20:57,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 09:21:03,369 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-30 09:21:04,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:21:04,784 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-30 09:21:06,163 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-30 09:21:06,290 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-30 09:21:07,691 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:21:07,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:21:09,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:21:09,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:21:09,417 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:21:09,485 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:21:13,180 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-30 09:21:13,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:21:13,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:21:13,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:21:14,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-30 09:21:16,293 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-30 09:21:16,300 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-30 09:21:16,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-30 09:21:16,591 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_ff3.min_abs, batch_count=666726.6666666666, ans=0.2 2023-09-30 09:21:20,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:21:20,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:21:22,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:21:23,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:21:23,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-30 09:21:25,444 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-30 09:21:25,455 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:21:26,824 INFO [train.py:1039] (2/4) Epoch 19, batch 4400, loss[loss=0.1646, simple_loss=0.2512, pruned_loss=0.03894, over 24437.00 frames. ], tot_loss[loss=0.1785, simple_loss=0.2537, pruned_loss=0.05165, over 4717599.99 frames. ], batch size: 69, lr: 5.38e-03, grad_scale: 16.0 2023-09-30 09:21:29,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:21:30,008 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:21:31,518 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:21:35,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-30 09:21:35,675 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-30 09:21:37,060 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-30 09:21:37,101 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-30 09:21:37,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 09:21:37,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:21:40,238 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-30 09:21:43,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:21:45,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:21:45,221 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-30 09:21:48,471 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=666860.0, ans=0.125 2023-09-30 09:21:49,692 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:21:49,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-30 09:21:49,753 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-30 09:21:53,197 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=666860.0, ans=0.125 2023-09-30 09:21:53,728 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.65 vs. limit=15.0 2023-09-30 09:21:54,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-30 09:21:54,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-30 09:21:54,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-30 09:21:54,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:21:54,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=666860.0, ans=0.0 2023-09-30 09:21:55,664 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.851e+02 2.031e+02 2.280e+02 3.356e+02, threshold=4.061e+02, percent-clipped=0.0 2023-09-30 09:21:55,923 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:21:57,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:21:57,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:22:00,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-30 09:22:00,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-30 09:22:00,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:22:02,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:22:02,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:22:03,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:22:05,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:22:05,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-30 09:22:06,777 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-30 09:22:10,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:22:11,690 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.31 vs. limit=15.0 2023-09-30 09:22:16,323 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:22:17,908 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-30 09:22:20,257 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=666993.3333333334, ans=0.125 2023-09-30 09:22:22,996 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:22:24,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:22:27,924 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=666993.3333333334, ans=0.125 2023-09-30 09:22:29,114 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:22:29,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-30 09:22:29,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:22:29,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:22:29,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:22:30,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-30 09:22:35,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-30 09:22:37,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-30 09:22:37,293 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=667060.0, ans=0.125 2023-09-30 09:22:38,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-30 09:22:38,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:22:38,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-30 09:22:40,706 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:22:45,818 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:22:48,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-30 09:22:49,467 INFO [train.py:1039] (2/4) Epoch 19, batch 4450, loss[loss=0.1791, simple_loss=0.269, pruned_loss=0.04459, over 24532.00 frames. ], tot_loss[loss=0.1796, simple_loss=0.2548, pruned_loss=0.05215, over 4708932.48 frames. ], batch size: 71, lr: 5.37e-03, grad_scale: 16.0 2023-09-30 09:22:51,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:22:53,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:22:53,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:23:02,274 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:23:02,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:23:05,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:23:07,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:23:10,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:23:10,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:23:11,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-30 09:23:11,600 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:23:11,713 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:23:11,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:23:11,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:23:15,433 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 09:23:21,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:23:21,514 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=667260.0, ans=0.2 2023-09-30 09:23:22,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:23:24,760 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:23:24,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:23:26,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:23:29,924 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.24 vs. limit=22.5 2023-09-30 09:23:30,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 09:23:30,989 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-30 09:23:31,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-30 09:23:31,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:23:31,258 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=667260.0, ans=0.1 2023-09-30 09:23:34,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:23:35,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-30 09:23:40,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-30 09:23:44,622 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:23:44,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-30 09:23:44,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:23:44,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:23:44,789 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:23:46,163 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:23:48,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:23:52,851 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-30 09:23:54,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-30 09:23:55,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 09:23:59,496 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:23:59,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:24:01,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:24:01,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 09:24:02,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-30 09:24:04,788 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=667393.3333333334, ans=0.125 2023-09-30 09:24:06,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-30 09:24:07,729 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=667393.3333333334, ans=0.125 2023-09-30 09:24:08,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:24:10,376 INFO [train.py:1039] (2/4) Epoch 19, batch 4500, loss[loss=0.1918, simple_loss=0.2619, pruned_loss=0.06085, over 23393.00 frames. ], tot_loss[loss=0.1798, simple_loss=0.2549, pruned_loss=0.05238, over 4704392.73 frames. ], batch size: 105, lr: 5.37e-03, grad_scale: 16.0 2023-09-30 09:24:12,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:24:13,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-30 09:24:13,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-30 09:24:16,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:24:21,967 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:24:23,488 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:24:23,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 09:24:25,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:24:25,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:24:25,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:24:29,200 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=667526.6666666666, ans=0.0 2023-09-30 09:24:39,994 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.864e+02 2.108e+02 2.361e+02 3.088e+02, threshold=4.216e+02, percent-clipped=0.0 2023-09-30 09:24:40,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:24:41,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:24:43,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:24:43,436 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:24:44,932 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:24:51,176 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 09:24:54,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:24:59,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 09:25:02,678 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:25:02,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-30 09:25:04,297 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:25:04,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:25:08,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:25:08,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:25:11,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:25:11,387 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-30 09:25:11,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 09:25:11,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:25:17,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:25:17,553 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:25:20,565 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:25:22,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-30 09:25:22,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:25:23,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-30 09:25:26,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-30 09:25:26,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-30 09:25:31,106 INFO [train.py:1039] (2/4) Epoch 19, batch 4550, loss[loss=0.1711, simple_loss=0.2584, pruned_loss=0.04188, over 24285.00 frames. ], tot_loss[loss=0.1786, simple_loss=0.254, pruned_loss=0.05165, over 4716111.79 frames. ], batch size: 74, lr: 5.37e-03, grad_scale: 16.0 2023-09-30 09:25:31,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-30 09:25:33,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-30 09:25:35,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:25:38,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:25:40,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:25:42,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:25:45,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:25:47,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:25:47,421 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=667860.0, ans=0.1 2023-09-30 09:25:48,776 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:25:48,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:25:48,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:25:51,684 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:25:51,746 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:25:54,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:25:56,593 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-30 09:25:56,887 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=667860.0, ans=0.125 2023-09-30 09:25:58,060 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-30 09:25:58,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:25:59,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-30 09:26:05,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-30 09:26:07,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:26:10,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-30 09:26:13,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 09:26:15,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:26:15,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:26:15,662 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-30 09:26:18,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-30 09:26:21,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:26:24,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:26:24,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:26:26,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:26:27,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-30 09:26:27,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-30 09:26:28,026 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:26:29,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-30 09:26:32,435 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-30 09:26:32,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:26:34,052 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:26:35,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:26:35,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:26:35,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:26:38,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 09:26:38,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-30 09:26:40,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:26:40,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 09:26:42,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-30 09:26:42,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:26:42,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-30 09:26:45,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:26:45,458 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:26:47,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:26:47,320 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:26:48,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:26:49,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-30 09:26:50,543 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:26:53,375 INFO [train.py:1039] (2/4) Epoch 19, batch 4600, loss[loss=0.1634, simple_loss=0.2443, pruned_loss=0.04121, over 24464.00 frames. ], tot_loss[loss=0.1782, simple_loss=0.2537, pruned_loss=0.05132, over 4718639.29 frames. ], batch size: 63, lr: 5.37e-03, grad_scale: 16.0 2023-09-30 09:26:53,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-30 09:26:55,933 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.26 vs. limit=15.0 2023-09-30 09:26:56,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:26:56,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:26:58,472 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:26:58,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:26:59,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:27:01,497 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-30 09:27:03,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:27:06,379 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=668126.6666666666, ans=0.125 2023-09-30 09:27:07,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:27:09,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:27:10,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:27:10,184 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=668193.3333333334, ans=0.125 2023-09-30 09:27:18,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-30 09:27:18,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:27:22,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:27:25,422 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.849e+02 2.100e+02 2.657e+02 4.568e+02, threshold=4.200e+02, percent-clipped=2.0 2023-09-30 09:27:27,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:27:27,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:27:32,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-30 09:27:32,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 09:27:33,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:27:38,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:27:38,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:27:40,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:27:45,078 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-30 09:27:47,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-30 09:27:51,093 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=668326.6666666666, ans=0.1 2023-09-30 09:27:52,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:27:53,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:27:55,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:27:55,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 09:27:56,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:27:58,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-30 09:27:58,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:27:58,433 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=668393.3333333334, ans=0.05 2023-09-30 09:28:00,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:28:01,772 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:28:01,880 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:28:01,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:28:03,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-30 09:28:03,851 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=668393.3333333334, ans=0.125 2023-09-30 09:28:04,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-30 09:28:04,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-30 09:28:04,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:28:06,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:28:07,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:28:08,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:28:16,063 INFO [train.py:1039] (2/4) Epoch 19, batch 4650, loss[loss=0.1952, simple_loss=0.2627, pruned_loss=0.06387, over 23878.00 frames. ], tot_loss[loss=0.1777, simple_loss=0.253, pruned_loss=0.05116, over 4716647.59 frames. ], batch size: 212, lr: 5.37e-03, grad_scale: 8.0 2023-09-30 09:28:19,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:28:19,540 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=668460.0, ans=0.07 2023-09-30 09:28:21,794 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=668460.0, ans=0.125 2023-09-30 09:28:22,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:28:22,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:28:24,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:28:24,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:28:24,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:28:24,377 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:28:29,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-30 09:28:31,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:28:34,767 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-30 09:28:34,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:28:35,099 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=668526.6666666666, ans=0.1 2023-09-30 09:28:36,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-30 09:28:36,361 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:28:36,443 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-30 09:28:37,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-30 09:28:37,764 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:28:39,221 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:28:42,358 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:28:43,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:28:43,912 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-30 09:28:44,324 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=668526.6666666666, ans=0.0 2023-09-30 09:28:46,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:28:48,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-30 09:28:52,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:28:52,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:28:53,548 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-30 09:28:55,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:28:57,493 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=668593.3333333334, ans=0.125 2023-09-30 09:28:58,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:29:02,372 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:29:07,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:29:10,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:29:10,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:29:11,451 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.61 vs. limit=6.0 2023-09-30 09:29:12,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:29:15,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-30 09:29:16,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-30 09:29:16,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 09:29:16,728 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-30 09:29:18,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:29:20,282 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=668726.6666666666, ans=0.125 2023-09-30 09:29:21,868 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=668726.6666666666, ans=0.125 2023-09-30 09:29:25,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:29:25,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:29:26,476 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-30 09:29:26,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:29:26,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:29:26,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:29:30,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:29:31,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:29:31,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:29:34,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:29:38,513 INFO [train.py:1039] (2/4) Epoch 19, batch 4700, loss[loss=0.2021, simple_loss=0.2677, pruned_loss=0.0682, over 23796.00 frames. ], tot_loss[loss=0.1783, simple_loss=0.2539, pruned_loss=0.05134, over 4721817.92 frames. ], batch size: 195, lr: 5.37e-03, grad_scale: 8.0 2023-09-30 09:29:38,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:29:38,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:29:38,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 09:29:38,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-30 09:29:40,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-30 09:29:41,788 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-30 09:29:51,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:29:51,411 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:29:52,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:29:52,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:29:55,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 09:29:59,677 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=668860.0, ans=0.0 2023-09-30 09:30:00,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-30 09:30:02,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-30 09:30:03,874 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:30:05,895 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:30:05,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:30:09,361 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.848e+02 1.982e+02 2.168e+02 3.287e+02, threshold=3.963e+02, percent-clipped=0.0 2023-09-30 09:30:11,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:30:16,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 09:30:17,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 09:30:21,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:30:23,200 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.46 vs. limit=12.0 2023-09-30 09:30:27,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-30 09:30:28,846 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:30:31,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:30:32,335 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=668993.3333333334, ans=0.1 2023-09-30 09:30:35,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-30 09:30:37,204 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:30:41,014 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:30:41,293 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=668993.3333333334, ans=0.125 2023-09-30 09:30:42,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-30 09:30:44,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:30:44,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:30:47,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:30:48,965 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:30:49,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-30 09:30:49,123 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-30 09:30:50,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:30:52,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:30:52,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:30:52,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-30 09:30:52,902 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.84 vs. limit=15.0 2023-09-30 09:30:53,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:30:58,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-30 09:31:00,089 INFO [train.py:1039] (2/4) Epoch 19, batch 4750, loss[loss=0.1816, simple_loss=0.2698, pruned_loss=0.04668, over 24325.00 frames. ], tot_loss[loss=0.1788, simple_loss=0.2547, pruned_loss=0.05142, over 4715920.29 frames. ], batch size: 74, lr: 5.37e-03, grad_scale: 8.0 2023-09-30 09:31:01,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:31:03,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:31:06,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:31:06,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:31:09,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-30 09:31:10,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:31:15,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-30 09:31:15,995 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.01 vs. limit=22.5 2023-09-30 09:31:16,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:31:16,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:31:16,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:31:24,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-30 09:31:28,238 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:31:31,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-30 09:31:31,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:31:34,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:31:34,959 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:31:34,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:31:36,508 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-30 09:31:36,513 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-30 09:31:36,854 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=669260.0, ans=0.2 2023-09-30 09:31:43,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-30 09:31:46,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:31:49,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:31:51,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 09:31:51,530 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-30 09:31:51,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:31:53,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:31:56,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:31:58,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-30 09:31:58,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-30 09:31:58,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:31:58,664 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:31:58,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:31:59,488 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.19 vs. limit=15.0 2023-09-30 09:32:00,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 09:32:01,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-30 09:32:04,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-30 09:32:09,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:32:11,488 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:32:11,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-30 09:32:11,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:32:13,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:32:16,117 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:32:16,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:32:18,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 09:32:21,242 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:32:21,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-30 09:32:21,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-30 09:32:23,297 INFO [train.py:1039] (2/4) Epoch 19, batch 4800, loss[loss=0.1971, simple_loss=0.2638, pruned_loss=0.06519, over 23503.00 frames. ], tot_loss[loss=0.1795, simple_loss=0.2553, pruned_loss=0.05185, over 4716562.86 frames. ], batch size: 149, lr: 5.37e-03, grad_scale: 16.0 2023-09-30 09:32:23,474 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-30 09:32:26,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-30 09:32:26,479 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:32:28,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-30 09:32:30,311 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.96 vs. limit=15.0 2023-09-30 09:32:32,497 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.91 vs. limit=12.0 2023-09-30 09:32:34,772 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:32:36,311 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:32:40,895 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:32:42,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:32:42,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:32:42,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-30 09:32:44,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:32:44,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:32:44,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:32:51,295 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:32:52,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:32:52,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:32:54,165 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.524e+02 1.880e+02 2.098e+02 2.403e+02 3.356e+02, threshold=4.196e+02, percent-clipped=0.0 2023-09-30 09:32:54,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:32:55,816 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 09:32:55,838 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:32:56,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:32:58,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:32:59,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:33:01,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:33:01,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-30 09:33:04,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 09:33:06,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:33:07,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-30 09:33:07,796 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-30 09:33:09,259 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:33:09,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:33:09,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:33:09,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:33:09,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:33:12,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:33:13,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:33:17,661 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:33:20,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:33:20,911 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=669660.0, ans=0.125 2023-09-30 09:33:23,558 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:33:28,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-30 09:33:28,881 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:33:30,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:33:30,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 09:33:30,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:33:34,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:33:35,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:33:35,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:33:35,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:33:37,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:33:37,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:33:42,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:33:43,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:33:43,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:33:44,306 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=669793.3333333334, ans=0.125 2023-09-30 09:33:45,378 INFO [train.py:1039] (2/4) Epoch 19, batch 4850, loss[loss=0.1627, simple_loss=0.2474, pruned_loss=0.03901, over 24470.00 frames. ], tot_loss[loss=0.1792, simple_loss=0.2551, pruned_loss=0.05166, over 4722134.38 frames. ], batch size: 63, lr: 5.36e-03, grad_scale: 16.0 2023-09-30 09:33:45,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-30 09:33:47,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-30 09:33:47,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:33:47,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:33:47,190 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:33:47,192 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:33:49,062 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=669793.3333333334, ans=0.0 2023-09-30 09:33:50,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:34:00,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-30 09:34:00,892 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:34:05,660 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:34:07,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 09:34:07,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:34:11,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:34:13,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:34:14,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:34:14,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-30 09:34:19,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:34:20,815 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:34:20,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 09:34:22,229 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 09:34:22,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-30 09:34:25,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:34:25,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:34:28,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:34:29,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-30 09:34:30,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-30 09:34:31,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 09:34:39,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:34:39,535 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-30 09:34:42,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:34:42,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:34:44,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-30 09:34:46,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-30 09:34:46,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:34:47,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-30 09:34:47,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:34:48,566 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.77 vs. limit=10.0 2023-09-30 09:34:49,214 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:34:49,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-30 09:34:58,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:34:58,732 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=670060.0, ans=0.125 2023-09-30 09:35:05,107 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:35:05,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:35:08,132 INFO [train.py:1039] (2/4) Epoch 19, batch 4900, loss[loss=0.1726, simple_loss=0.2423, pruned_loss=0.05148, over 23484.00 frames. ], tot_loss[loss=0.1793, simple_loss=0.2547, pruned_loss=0.05197, over 4701012.04 frames. ], batch size: 134, lr: 5.36e-03, grad_scale: 16.0 2023-09-30 09:35:11,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-30 09:35:11,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:35:16,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:35:18,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:35:18,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:35:21,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-30 09:35:24,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-30 09:35:29,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-30 09:35:31,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-30 09:35:31,307 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-30 09:35:31,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:35:31,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:35:31,416 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:35:31,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:35:33,002 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-30 09:35:33,304 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=670193.3333333334, ans=0.125 2023-09-30 09:35:38,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-30 09:35:38,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 09:35:39,979 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.500e+02 2.060e+02 2.332e+02 2.793e+02 4.381e+02, threshold=4.664e+02, percent-clipped=2.0 2023-09-30 09:35:41,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-30 09:35:41,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-30 09:35:43,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:35:45,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:35:46,815 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:35:46,840 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-30 09:35:49,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:35:49,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:35:49,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-30 09:35:49,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-30 09:35:55,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-30 09:35:56,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:35:58,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:35:58,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:35:59,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:35:59,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 09:35:59,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:36:01,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-30 09:36:03,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:36:04,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-30 09:36:04,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:36:07,168 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=8.08 vs. limit=15.0 2023-09-30 09:36:07,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-30 09:36:09,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:36:11,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-30 09:36:11,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-30 09:36:20,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:36:21,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:36:21,891 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=670393.3333333334, ans=0.125 2023-09-30 09:36:23,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-30 09:36:23,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 09:36:23,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:36:25,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:36:29,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:36:29,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:36:31,098 INFO [train.py:1039] (2/4) Epoch 19, batch 4950, loss[loss=0.1474, simple_loss=0.1972, pruned_loss=0.04879, over 18866.00 frames. ], tot_loss[loss=0.1784, simple_loss=0.2531, pruned_loss=0.05179, over 4704489.52 frames. ], batch size: 388, lr: 5.36e-03, grad_scale: 16.0 2023-09-30 09:36:31,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:36:31,189 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-30 09:36:31,628 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=670460.0, ans=0.0 2023-09-30 09:36:32,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 09:36:34,478 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:36:35,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 09:36:37,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-30 09:36:37,560 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-30 09:36:37,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-30 09:36:39,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-30 09:36:39,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:36:39,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:36:39,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:36:39,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:36:39,508 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=670460.0, ans=0.125 2023-09-30 09:36:40,901 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:36:42,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:36:44,396 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:36:46,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:36:50,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:36:50,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:36:54,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 09:36:58,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:36:59,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:37:01,515 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:37:02,956 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:37:04,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:37:05,984 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-30 09:37:07,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-30 09:37:10,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:37:11,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:37:12,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:37:13,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:37:13,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:37:15,046 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-30 09:37:15,528 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=670593.3333333334, ans=0.0 2023-09-30 09:37:16,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:37:18,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-30 09:37:20,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:37:22,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:37:22,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:37:24,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-30 09:37:24,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:37:26,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:37:30,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:37:32,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:37:32,882 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:37:34,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:37:34,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:37:34,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:37:35,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:37:37,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:37:37,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:37:39,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-30 09:37:39,427 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=670726.6666666666, ans=0.125 2023-09-30 09:37:43,694 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:37:49,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-30 09:37:49,694 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-30 09:37:53,429 INFO [train.py:1039] (2/4) Epoch 19, batch 5000, loss[loss=0.155, simple_loss=0.2274, pruned_loss=0.04131, over 24420.00 frames. ], tot_loss[loss=0.178, simple_loss=0.2526, pruned_loss=0.05173, over 4690098.08 frames. ], batch size: 58, lr: 5.36e-03, grad_scale: 8.0 2023-09-30 09:37:57,412 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:37:57,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-30 09:37:58,981 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-30 09:38:00,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-30 09:38:02,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:38:04,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-30 09:38:04,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:38:04,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 09:38:05,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-30 09:38:07,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:38:08,801 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:38:08,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-30 09:38:08,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:38:10,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:38:10,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-30 09:38:11,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-30 09:38:12,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:38:13,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-30 09:38:13,548 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 09:38:14,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:38:15,066 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 09:38:15,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-30 09:38:15,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-30 09:38:18,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-30 09:38:18,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:38:18,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:38:19,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-30 09:38:19,776 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-30 09:38:21,561 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=670860.0, ans=0.0 2023-09-30 09:38:22,751 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:38:22,888 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:38:24,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-30 09:38:26,582 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.806e+02 2.052e+02 2.286e+02 3.432e+02, threshold=4.103e+02, percent-clipped=0.0 2023-09-30 09:38:26,781 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-30 09:38:28,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:38:29,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:38:33,679 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=670926.6666666666, ans=0.125 2023-09-30 09:38:34,771 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-30 09:38:38,613 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:38:38,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:38:38,775 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:38:43,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-30 09:38:43,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:38:43,376 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:38:43,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:38:46,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-30 09:38:46,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:38:49,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:38:51,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:38:55,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-30 09:38:59,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:39:05,252 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=671060.0, ans=0.125 2023-09-30 09:39:07,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:39:10,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:39:10,103 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:39:10,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:39:10,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 09:39:10,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-30 09:39:11,704 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:39:14,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:39:16,094 INFO [train.py:1039] (2/4) Epoch 19, batch 5050, loss[loss=0.1864, simple_loss=0.2712, pruned_loss=0.05085, over 23946.00 frames. ], tot_loss[loss=0.1785, simple_loss=0.2535, pruned_loss=0.05176, over 4704027.75 frames. ], batch size: 80, lr: 5.36e-03, grad_scale: 8.0 2023-09-30 09:39:16,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-30 09:39:16,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:39:17,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:39:19,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:39:19,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-30 09:39:20,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:39:20,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:39:22,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 09:39:24,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:39:25,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-30 09:39:37,487 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.13 vs. limit=22.5 2023-09-30 09:39:38,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-30 09:39:39,212 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.00 vs. limit=15.0 2023-09-30 09:39:39,711 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-30 09:39:39,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:39:41,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-30 09:39:41,302 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:39:44,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:39:44,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:39:46,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:39:46,208 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-30 09:39:46,536 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=671193.3333333334, ans=0.125 2023-09-30 09:39:47,696 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-30 09:39:47,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:39:49,672 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=671260.0, ans=0.0 2023-09-30 09:39:50,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:39:51,375 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=671260.0, ans=0.125 2023-09-30 09:39:54,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:39:54,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-30 09:39:55,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:40:00,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-30 09:40:01,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:40:01,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:40:01,792 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=671260.0, ans=0.5 2023-09-30 09:40:03,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:40:03,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:40:03,970 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.02 vs. limit=15.0 2023-09-30 09:40:06,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:40:07,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:40:09,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:40:09,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:40:09,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:40:09,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-30 09:40:11,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:40:13,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:40:19,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:40:19,158 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-30 09:40:19,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-30 09:40:20,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:40:22,107 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:40:22,141 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-30 09:40:23,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:40:23,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-30 09:40:23,927 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:40:28,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:40:28,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:40:28,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-30 09:40:30,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-30 09:40:31,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:40:31,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:40:33,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:40:34,120 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.86 vs. limit=15.0 2023-09-30 09:40:36,313 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-30 09:40:37,728 INFO [train.py:1039] (2/4) Epoch 19, batch 5100, loss[loss=0.1769, simple_loss=0.2624, pruned_loss=0.04571, over 24355.00 frames. ], tot_loss[loss=0.1795, simple_loss=0.2547, pruned_loss=0.05215, over 4705601.38 frames. ], batch size: 77, lr: 5.36e-03, grad_scale: 8.0 2023-09-30 09:40:37,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:40:40,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-30 09:40:41,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-30 09:40:43,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:40:45,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:40:48,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:40:48,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-30 09:40:50,329 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-30 09:40:54,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:40:54,162 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:40:54,411 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=671526.6666666666, ans=0.2 2023-09-30 09:40:58,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:41:01,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-30 09:41:03,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:41:04,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:41:04,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-30 09:41:05,393 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.84 vs. limit=15.0 2023-09-30 09:41:07,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:41:07,896 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:41:07,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-30 09:41:09,887 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=671593.3333333334, ans=0.125 2023-09-30 09:41:10,884 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.887e+02 2.088e+02 2.426e+02 5.296e+02, threshold=4.177e+02, percent-clipped=1.0 2023-09-30 09:41:11,037 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-30 09:41:11,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:41:12,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-30 09:41:12,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-30 09:41:17,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:41:26,523 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:41:30,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-30 09:41:30,210 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-30 09:41:30,223 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-30 09:41:33,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-30 09:41:33,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:41:34,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-30 09:41:36,696 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=671660.0, ans=0.125 2023-09-30 09:41:36,722 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=671660.0, ans=0.125 2023-09-30 09:41:39,397 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-30 09:41:40,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 09:41:42,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:41:45,685 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-30 09:41:47,330 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-30 09:41:47,407 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-30 09:41:49,735 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=671726.6666666666, ans=0.015 2023-09-30 09:41:53,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:41:54,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:41:54,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:41:54,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:41:54,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 09:41:55,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:41:57,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-30 09:41:57,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-30 09:41:57,532 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-30 09:41:59,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:41:59,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-30 09:42:00,981 INFO [train.py:1039] (2/4) Epoch 19, batch 5150, loss[loss=0.1832, simple_loss=0.2635, pruned_loss=0.05145, over 24414.00 frames. ], tot_loss[loss=0.1804, simple_loss=0.2556, pruned_loss=0.05264, over 4715539.32 frames. ], batch size: 77, lr: 5.36e-03, grad_scale: 8.0 2023-09-30 09:42:01,037 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:42:01,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 09:42:03,186 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:42:04,793 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:42:10,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 09:42:10,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-30 09:42:10,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:42:12,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:42:14,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-30 09:42:14,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:42:14,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:42:15,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:42:15,574 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:42:17,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-30 09:42:18,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:42:18,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:42:21,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 09:42:25,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-30 09:42:25,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:42:29,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:42:34,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-30 09:42:39,293 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:42:43,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:42:45,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:42:51,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:42:53,267 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:42:54,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-30 09:42:59,441 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:42:59,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:42:59,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:43:01,407 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=671993.3333333334, ans=0.0 2023-09-30 09:43:03,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:43:03,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:43:04,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-30 09:43:10,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:43:12,455 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 09:43:15,481 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:43:15,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:43:15,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-30 09:43:16,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-30 09:43:16,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:43:17,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:43:20,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:43:21,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:43:23,309 INFO [train.py:1039] (2/4) Epoch 19, batch 5200, loss[loss=0.1835, simple_loss=0.2662, pruned_loss=0.05042, over 24015.00 frames. ], tot_loss[loss=0.1792, simple_loss=0.2548, pruned_loss=0.05186, over 4713691.51 frames. ], batch size: 80, lr: 5.35e-03, grad_scale: 16.0 2023-09-30 09:43:23,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:43:23,777 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=672126.6666666666, ans=0.0 2023-09-30 09:43:29,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-30 09:43:30,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:43:30,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:43:33,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:43:36,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:43:36,954 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.17 vs. limit=15.0 2023-09-30 09:43:37,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:43:39,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-30 09:43:42,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 09:43:44,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:43:48,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-30 09:43:49,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-30 09:43:51,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:43:51,462 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-30 09:43:52,990 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-30 09:43:54,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-30 09:43:56,149 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.506e+02 1.839e+02 2.019e+02 2.213e+02 3.440e+02, threshold=4.038e+02, percent-clipped=0.0 2023-09-30 09:43:56,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:43:56,310 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-30 09:43:56,333 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:43:57,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:43:58,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:43:59,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-30 09:44:00,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:44:02,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:44:05,781 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-30 09:44:05,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-30 09:44:07,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-30 09:44:12,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-30 09:44:14,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 09:44:21,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-30 09:44:21,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:44:23,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-30 09:44:23,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:44:23,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-30 09:44:23,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:44:23,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 09:44:28,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:44:28,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:44:30,440 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=672393.3333333334, ans=0.1 2023-09-30 09:44:31,899 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=672393.3333333334, ans=0.2 2023-09-30 09:44:33,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:44:33,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:44:33,180 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:44:38,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:44:39,617 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-30 09:44:39,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:44:41,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:44:41,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:44:41,672 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=672393.3333333334, ans=0.125 2023-09-30 09:44:42,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-30 09:44:42,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:44:45,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:44:46,780 INFO [train.py:1039] (2/4) Epoch 19, batch 5250, loss[loss=0.1582, simple_loss=0.2071, pruned_loss=0.0546, over 19025.00 frames. ], tot_loss[loss=0.1792, simple_loss=0.2541, pruned_loss=0.05215, over 4699193.48 frames. ], batch size: 388, lr: 5.35e-03, grad_scale: 16.0 2023-09-30 09:44:48,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:44:48,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:44:50,849 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:44:56,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:44:56,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:44:58,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:45:01,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 09:45:02,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-30 09:45:02,911 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:45:04,455 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:45:33,277 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=672660.0, ans=0.125 2023-09-30 09:45:40,296 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=672660.0, ans=0.125 2023-09-30 09:46:02,585 INFO [train.py:1039] (2/4) Epoch 19, batch 5300, loss[loss=0.1838, simple_loss=0.2528, pruned_loss=0.05742, over 23322.00 frames. ], tot_loss[loss=0.1788, simple_loss=0.2535, pruned_loss=0.05203, over 4705923.30 frames. ], batch size: 119, lr: 5.35e-03, grad_scale: 16.0 2023-09-30 09:46:08,583 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=672793.3333333334, ans=0.1 2023-09-30 09:46:16,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:46:16,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-30 09:46:16,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-30 09:46:16,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:46:16,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:46:16,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:46:16,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:46:16,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:46:16,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:46:17,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:46:17,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-30 09:46:17,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:46:17,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-30 09:46:17,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-30 09:46:17,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-30 09:46:18,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-30 09:46:18,114 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-30 09:46:18,660 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-30 09:46:18,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:46:19,341 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:46:19,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:46:19,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:46:19,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:46:20,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:46:20,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:46:20,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:46:20,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:46:20,503 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:46:20,511 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:46:20,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:46:20,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:46:21,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-30 09:46:21,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:46:22,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:46:22,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-30 09:46:22,692 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-30 09:46:22,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-30 09:46:22,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:46:22,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-30 09:46:23,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-30 09:46:23,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-30 09:46:23,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:46:24,151 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:46:24,309 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-30 09:46:24,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-30 09:46:24,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-30 09:46:24,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:46:24,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-30 09:46:24,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-30 09:46:24,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-30 09:46:25,208 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-30 09:46:34,338 INFO [train.py:1039] (2/4) Epoch 20, batch 0, loss[loss=0.1648, simple_loss=0.2455, pruned_loss=0.04204, over 24303.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.2455, pruned_loss=0.04204, over 24303.00 frames. ], batch size: 61, lr: 5.21e-03, grad_scale: 32.0 2023-09-30 09:46:34,339 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-30 09:46:47,938 INFO [train.py:1071] (2/4) Epoch 20, validation: loss=0.2867, simple_loss=0.2695, pruned_loss=0.152, over 1125622.00 frames. 2023-09-30 09:46:47,940 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21076MB 2023-09-30 09:46:49,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-30 09:46:49,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:46:52,612 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:46:55,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:46:57,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:46:57,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:46:57,540 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=672866.6666666666, ans=0.0 2023-09-30 09:46:58,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-30 09:47:01,567 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.839e+02 2.043e+02 2.275e+02 5.407e+02, threshold=4.087e+02, percent-clipped=3.0 2023-09-30 09:47:01,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-30 09:47:05,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:47:05,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:47:05,526 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=672933.3333333334, ans=0.2 2023-09-30 09:47:08,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:47:10,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:47:10,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:47:10,117 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:47:13,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-30 09:47:15,362 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:47:24,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:47:24,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:47:25,857 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-30 09:47:26,526 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.66 vs. limit=15.0 2023-09-30 09:47:30,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:47:30,372 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:47:31,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:47:35,095 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:47:39,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:47:42,058 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=673066.6666666666, ans=0.035 2023-09-30 09:47:46,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-30 09:47:50,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-30 09:47:50,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:47:50,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:47:51,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:47:53,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:47:56,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-30 09:47:56,870 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=673133.3333333334, ans=0.125 2023-09-30 09:47:58,473 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=673133.3333333334, ans=0.1 2023-09-30 09:47:59,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:48:01,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:48:04,513 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:48:08,951 INFO [train.py:1039] (2/4) Epoch 20, batch 50, loss[loss=0.1857, simple_loss=0.272, pruned_loss=0.04967, over 24397.00 frames. ], tot_loss[loss=0.181, simple_loss=0.2587, pruned_loss=0.0517, over 1061845.20 frames. ], batch size: 77, lr: 5.21e-03, grad_scale: 16.0 2023-09-30 09:48:09,051 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-30 09:48:10,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:48:13,097 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.28 vs. limit=15.0 2023-09-30 09:48:13,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:48:16,210 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.37 vs. limit=15.0 2023-09-30 09:48:16,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:48:16,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-30 09:48:17,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 09:48:17,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:48:19,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:48:23,020 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:48:23,262 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=673200.0, ans=0.125 2023-09-30 09:48:24,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:48:29,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-30 09:48:29,150 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:48:36,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-30 09:48:38,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-30 09:48:39,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-30 09:48:41,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:48:43,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:48:43,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:48:44,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:48:45,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-30 09:48:45,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 09:48:45,972 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:48:53,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:48:56,981 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:48:57,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 09:48:57,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-30 09:48:57,378 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=673400.0, ans=0.0 2023-09-30 09:49:00,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:49:00,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:49:00,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-30 09:49:00,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:49:03,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-30 09:49:10,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:49:10,714 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:49:12,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:49:12,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:49:12,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-30 09:49:15,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-30 09:49:15,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-30 09:49:17,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:49:17,077 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-30 09:49:18,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:49:18,797 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=673466.6666666666, ans=0.125 2023-09-30 09:49:19,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:49:20,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-30 09:49:21,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-30 09:49:21,739 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-30 09:49:23,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:49:23,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:49:25,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-30 09:49:25,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-30 09:49:27,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:49:28,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:49:29,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-30 09:49:29,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:49:31,409 INFO [train.py:1039] (2/4) Epoch 20, batch 100, loss[loss=0.1775, simple_loss=0.2631, pruned_loss=0.046, over 24312.00 frames. ], tot_loss[loss=0.178, simple_loss=0.2553, pruned_loss=0.05033, over 1881893.41 frames. ], batch size: 74, lr: 5.21e-03, grad_scale: 16.0 2023-09-30 09:49:33,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:49:35,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:49:37,308 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=673533.3333333334, ans=0.0 2023-09-30 09:49:40,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:49:42,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-30 09:49:42,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:49:43,001 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.75 vs. limit=15.0 2023-09-30 09:49:46,917 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:49:46,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:49:48,164 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.857e+02 2.032e+02 2.240e+02 3.945e+02, threshold=4.064e+02, percent-clipped=0.0 2023-09-30 09:49:48,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:49:48,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:49:48,329 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:49:48,860 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.00 vs. limit=15.0 2023-09-30 09:49:49,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-30 09:49:51,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-30 09:49:51,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:49:53,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:49:53,033 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:49:56,311 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=673600.0, ans=0.125 2023-09-30 09:49:57,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-30 09:49:59,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:50:00,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:50:02,668 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-30 09:50:04,525 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=673666.6666666666, ans=0.0 2023-09-30 09:50:05,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 09:50:08,833 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-30 09:50:08,877 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-30 09:50:11,055 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:50:11,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:50:13,127 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.30 vs. limit=15.0 2023-09-30 09:50:14,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-30 09:50:15,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:50:19,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:50:24,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:50:25,752 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-30 09:50:27,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-30 09:50:30,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-30 09:50:31,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:50:32,199 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=673733.3333333334, ans=0.125 2023-09-30 09:50:34,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:50:36,699 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=673800.0, ans=0.0 2023-09-30 09:50:37,982 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:50:40,419 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=673800.0, ans=0.09899494936611666 2023-09-30 09:50:41,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:50:43,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:50:46,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:50:48,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:50:49,086 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.78 vs. limit=15.0 2023-09-30 09:50:49,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:50:49,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:50:49,727 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:50:51,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-30 09:50:51,194 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-30 09:50:51,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:50:52,546 INFO [train.py:1039] (2/4) Epoch 20, batch 150, loss[loss=0.1663, simple_loss=0.2531, pruned_loss=0.03978, over 24678.00 frames. ], tot_loss[loss=0.178, simple_loss=0.2551, pruned_loss=0.05042, over 2527123.21 frames. ], batch size: 65, lr: 5.21e-03, grad_scale: 8.0 2023-09-30 09:50:52,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:50:54,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:50:54,702 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:50:54,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 09:50:54,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 09:50:54,825 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-30 09:50:56,691 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:50:56,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:50:58,323 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:50:59,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:50:59,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:51:02,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:51:03,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=673866.6666666666, ans=0.125 2023-09-30 09:51:03,269 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=673866.6666666666, ans=0.09899494936611666 2023-09-30 09:51:04,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:51:04,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:51:04,812 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=673866.6666666666, ans=0.1 2023-09-30 09:51:06,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:51:09,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:51:09,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:51:12,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-30 09:51:13,656 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:51:17,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-30 09:51:17,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-30 09:51:17,688 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-30 09:51:22,075 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:51:22,085 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:51:22,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:51:24,402 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:51:24,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:51:25,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:51:25,925 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:51:29,381 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-30 09:51:29,657 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=674000.0, ans=0.125 2023-09-30 09:51:30,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:51:34,837 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=674000.0, ans=0.1 2023-09-30 09:51:37,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:51:42,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:51:42,399 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-30 09:51:46,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:51:46,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:51:46,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:51:48,747 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=674066.6666666666, ans=10.0 2023-09-30 09:51:50,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:51:52,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:51:52,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:51:53,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:51:55,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-30 09:51:59,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:52:01,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:52:01,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:52:01,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:52:04,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:52:07,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 09:52:09,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:52:10,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:52:10,782 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:52:13,729 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:52:13,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-30 09:52:13,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:52:15,045 INFO [train.py:1039] (2/4) Epoch 20, batch 200, loss[loss=0.2174, simple_loss=0.2783, pruned_loss=0.07819, over 22653.00 frames. ], tot_loss[loss=0.1799, simple_loss=0.2565, pruned_loss=0.05163, over 3013808.75 frames. ], batch size: 322, lr: 5.21e-03, grad_scale: 8.0 2023-09-30 09:52:15,143 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-30 09:52:18,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:52:21,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:52:21,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:52:24,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-30 09:52:26,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:52:26,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:52:28,215 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-30 09:52:30,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-30 09:52:31,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:52:33,171 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.908e+02 2.091e+02 2.356e+02 3.035e+02, threshold=4.181e+02, percent-clipped=0.0 2023-09-30 09:52:34,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:52:37,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:52:37,102 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:52:37,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:52:50,457 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.76 vs. limit=22.5 2023-09-30 09:52:58,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:52:58,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:53:00,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:53:01,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:53:02,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 09:53:02,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 09:53:04,892 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.49 vs. limit=10.0 2023-09-30 09:53:05,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:53:05,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:53:07,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:53:07,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:53:10,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-30 09:53:10,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 09:53:10,328 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:53:12,156 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=674400.0, ans=0.1 2023-09-30 09:53:13,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:53:21,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:53:22,229 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=674466.6666666666, ans=0.0 2023-09-30 09:53:27,980 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:53:29,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:53:29,795 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=674466.6666666666, ans=0.125 2023-09-30 09:53:34,262 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:53:34,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=674533.3333333334, ans=0.125 2023-09-30 09:53:36,189 INFO [train.py:1039] (2/4) Epoch 20, batch 250, loss[loss=0.1709, simple_loss=0.2555, pruned_loss=0.04311, over 24467.00 frames. ], tot_loss[loss=0.1802, simple_loss=0.2562, pruned_loss=0.05209, over 3389364.05 frames. ], batch size: 63, lr: 5.21e-03, grad_scale: 8.0 2023-09-30 09:53:37,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-30 09:53:37,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:53:37,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:53:37,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:53:39,911 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:53:41,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-30 09:53:42,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:53:42,831 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-30 09:53:46,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:53:48,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:53:49,584 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:53:49,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:53:51,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:53:51,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:53:54,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:53:57,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:53:58,842 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.02 vs. limit=15.0 2023-09-30 09:54:09,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:54:10,878 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:54:10,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:54:16,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-30 09:54:18,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-30 09:54:19,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:54:19,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:54:21,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 09:54:21,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:54:21,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:54:23,186 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=674666.6666666666, ans=0.1 2023-09-30 09:54:24,503 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:54:28,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-30 09:54:28,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:54:29,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:54:29,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:54:29,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:54:29,885 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=674733.3333333334, ans=0.1 2023-09-30 09:54:31,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:54:31,595 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=674733.3333333334, ans=0.125 2023-09-30 09:54:32,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:54:32,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:54:34,453 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:54:36,145 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=674733.3333333334, ans=0.0 2023-09-30 09:54:37,289 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:54:37,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:54:41,283 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-30 09:54:45,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:54:50,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:54:54,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:54:56,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:54:57,193 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.29 vs. limit=15.0 2023-09-30 09:54:59,549 INFO [train.py:1039] (2/4) Epoch 20, batch 300, loss[loss=0.159, simple_loss=0.2145, pruned_loss=0.05169, over 19598.00 frames. ], tot_loss[loss=0.1785, simple_loss=0.2544, pruned_loss=0.05137, over 3676579.21 frames. ], batch size: 388, lr: 5.21e-03, grad_scale: 8.0 2023-09-30 09:54:59,671 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-30 09:55:01,117 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:55:01,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:55:01,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-30 09:55:03,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-30 09:55:03,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:55:04,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-30 09:55:09,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:55:10,927 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:55:13,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:55:14,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-30 09:55:16,332 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:55:16,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 09:55:17,752 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.494e+02 1.911e+02 2.106e+02 2.458e+02 4.276e+02, threshold=4.211e+02, percent-clipped=1.0 2023-09-30 09:55:17,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-30 09:55:17,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:55:21,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:55:24,965 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:55:25,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-30 09:55:29,625 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-30 09:55:29,698 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:55:32,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:55:36,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:55:36,329 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-30 09:55:36,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:55:39,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:55:40,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:55:40,943 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:55:46,836 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-30 09:55:46,850 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-30 09:55:49,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:55:52,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:55:53,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-30 09:55:55,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:55:59,427 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:56:01,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:56:01,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-30 09:56:07,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:56:07,137 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 09:56:08,793 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:56:11,081 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:56:12,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-30 09:56:12,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 09:56:12,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:56:14,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-30 09:56:14,415 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=675133.3333333334, ans=0.125 2023-09-30 09:56:17,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:56:17,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:56:19,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:56:19,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:56:20,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:56:22,138 INFO [train.py:1039] (2/4) Epoch 20, batch 350, loss[loss=0.1724, simple_loss=0.2629, pruned_loss=0.04097, over 24357.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.2521, pruned_loss=0.05059, over 3894170.33 frames. ], batch size: 74, lr: 5.20e-03, grad_scale: 4.0 2023-09-30 09:56:24,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:56:24,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 09:56:27,590 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:56:34,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:56:37,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:56:38,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:56:41,995 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-30 09:56:43,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:56:43,843 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=675266.6666666666, ans=0.0 2023-09-30 09:56:44,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-30 09:56:47,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:56:48,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-30 09:56:48,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:56:51,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-30 09:56:53,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:56:55,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:56:56,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:56:58,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:56:58,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:56:58,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:56:58,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:56:58,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:57:01,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:57:01,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:57:09,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:57:09,794 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-30 09:57:09,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:57:11,261 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:57:15,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-30 09:57:15,925 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:57:22,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:57:22,367 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:57:22,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:57:23,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-30 09:57:25,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:57:27,020 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-30 09:57:27,523 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.19 vs. limit=10.0 2023-09-30 09:57:28,539 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-30 09:57:28,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:57:28,836 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=675466.6666666666, ans=0.025 2023-09-30 09:57:31,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:57:31,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-30 09:57:35,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:57:36,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:57:38,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:57:40,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:57:40,433 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:57:42,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:57:43,400 INFO [train.py:1039] (2/4) Epoch 20, batch 400, loss[loss=0.1934, simple_loss=0.2618, pruned_loss=0.06251, over 23887.00 frames. ], tot_loss[loss=0.1763, simple_loss=0.252, pruned_loss=0.05036, over 4077295.73 frames. ], batch size: 179, lr: 5.20e-03, grad_scale: 8.0 2023-09-30 09:57:43,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:57:45,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:57:47,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-30 09:57:48,376 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:57:48,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:57:50,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:57:51,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:57:53,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:57:55,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:57:55,708 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=675533.3333333334, ans=0.125 2023-09-30 09:57:56,930 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-30 09:57:57,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-30 09:57:57,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:58:00,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-30 09:58:00,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:58:03,699 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.898e+02 2.066e+02 2.335e+02 3.981e+02, threshold=4.133e+02, percent-clipped=0.0 2023-09-30 09:58:04,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:58:04,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:58:04,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-30 09:58:05,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:58:05,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:58:05,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:58:07,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:58:09,338 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-30 09:58:10,199 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.53 vs. limit=10.0 2023-09-30 09:58:10,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-30 09:58:15,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:58:18,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:58:18,710 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=675666.6666666666, ans=0.1 2023-09-30 09:58:19,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-30 09:58:20,011 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-30 09:58:23,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:58:23,378 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=675666.6666666666, ans=0.0 2023-09-30 09:58:23,438 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=675666.6666666666, ans=0.0 2023-09-30 09:58:23,440 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=675666.6666666666, ans=0.125 2023-09-30 09:58:24,725 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:58:32,996 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-30 09:58:34,713 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-30 09:58:36,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-30 09:58:38,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:58:40,344 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=675733.3333333334, ans=0.05 2023-09-30 09:58:41,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:58:41,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-30 09:58:45,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:58:47,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 09:58:48,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:58:51,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:58:51,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-30 09:58:53,504 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-30 09:58:53,831 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=675800.0, ans=0.0 2023-09-30 09:58:54,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-30 09:58:58,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:58:58,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:59:00,705 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.44 vs. limit=22.5 2023-09-30 09:59:01,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-30 09:59:02,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 09:59:03,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:59:04,511 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-30 09:59:04,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-30 09:59:04,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:59:06,082 INFO [train.py:1039] (2/4) Epoch 20, batch 450, loss[loss=0.1678, simple_loss=0.2421, pruned_loss=0.04669, over 23465.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.2523, pruned_loss=0.05046, over 4226643.90 frames. ], batch size: 134, lr: 5.20e-03, grad_scale: 8.0 2023-09-30 09:59:06,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:59:07,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-30 09:59:07,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-30 09:59:09,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:59:10,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:59:13,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:59:23,713 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=675933.3333333334, ans=0.125 2023-09-30 09:59:25,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:59:25,127 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:59:25,900 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.47 vs. limit=15.0 2023-09-30 09:59:26,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-30 09:59:27,018 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=675933.3333333334, ans=0.125 2023-09-30 09:59:28,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-30 09:59:32,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-30 09:59:33,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:59:34,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:59:37,030 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=675933.3333333334, ans=0.0 2023-09-30 09:59:40,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:59:41,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:59:43,510 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=676000.0, ans=0.2 2023-09-30 09:59:44,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-30 09:59:44,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-30 09:59:47,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-30 09:59:49,844 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:59:49,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:59:50,302 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=676000.0, ans=0.2 2023-09-30 09:59:52,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:59:53,441 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-30 09:59:53,455 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-30 09:59:53,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:59:57,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:59:58,840 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-30 10:00:01,908 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-30 10:00:01,962 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:00:03,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-30 10:00:05,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-30 10:00:06,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:00:09,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-30 10:00:09,575 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:00:12,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-30 10:00:16,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:00:17,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-30 10:00:19,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-30 10:00:19,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:00:19,747 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer_na.min_abs, batch_count=676133.3333333334, ans=0.02 2023-09-30 10:00:26,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:00:27,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:00:29,606 INFO [train.py:1039] (2/4) Epoch 20, batch 500, loss[loss=0.2509, simple_loss=0.3102, pruned_loss=0.09584, over 19403.00 frames. ], tot_loss[loss=0.1771, simple_loss=0.2525, pruned_loss=0.05088, over 4317333.16 frames. ], batch size: 388, lr: 5.20e-03, grad_scale: 8.0 2023-09-30 10:00:29,710 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 10:00:29,746 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-30 10:00:32,359 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=676200.0, ans=0.125 2023-09-30 10:00:33,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:00:35,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:00:35,121 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:00:35,145 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-30 10:00:37,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-30 10:00:38,010 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:00:41,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 10:00:47,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 10:00:48,770 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.475e+02 1.813e+02 1.975e+02 2.252e+02 5.149e+02, threshold=3.950e+02, percent-clipped=1.0 2023-09-30 10:00:48,901 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:00:49,851 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.31 vs. limit=10.0 2023-09-30 10:00:51,080 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:00:51,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:00:52,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:01:05,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:01:05,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-30 10:01:05,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-30 10:01:05,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:01:05,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-30 10:01:05,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 10:01:10,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-30 10:01:10,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:01:10,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:01:10,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:01:12,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-30 10:01:15,461 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-30 10:01:17,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:01:18,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:01:20,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:01:21,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:01:21,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-30 10:01:23,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-30 10:01:26,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:01:28,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:01:33,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:01:34,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:01:42,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:01:46,855 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.48 vs. limit=15.0 2023-09-30 10:01:47,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-30 10:01:47,389 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:01:47,409 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:01:49,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-30 10:01:50,581 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-30 10:01:50,867 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=676533.3333333334, ans=0.125 2023-09-30 10:01:51,428 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.89 vs. limit=15.0 2023-09-30 10:01:52,010 INFO [train.py:1039] (2/4) Epoch 20, batch 550, loss[loss=0.1952, simple_loss=0.2757, pruned_loss=0.05735, over 24063.00 frames. ], tot_loss[loss=0.1789, simple_loss=0.2545, pruned_loss=0.05163, over 4409510.34 frames. ], batch size: 80, lr: 5.20e-03, grad_scale: 8.0 2023-09-30 10:01:52,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:01:58,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-30 10:01:59,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-30 10:01:59,779 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:01:59,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-30 10:02:01,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:02:01,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:02:02,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:02:04,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:02:04,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-30 10:02:06,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:02:07,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:02:09,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-30 10:02:09,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:02:13,523 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:02:13,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:02:17,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:02:17,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:02:23,074 WARNING [train.py:1197] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-30 10:02:23,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-30 10:02:24,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:02:24,983 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer_na.min_abs, batch_count=676666.6666666666, ans=0.02 2023-09-30 10:02:30,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:02:30,908 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:02:32,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-30 10:02:36,833 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:02:36,843 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-30 10:02:36,980 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:02:39,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 10:02:42,099 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:02:42,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:02:42,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:02:44,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:02:45,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-30 10:02:48,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-30 10:02:49,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:02:49,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:02:49,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:02:49,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:02:52,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:02:54,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:02:57,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:02:58,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:02:59,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 10:02:59,337 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:03:00,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 10:03:00,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:03:02,254 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-30 10:03:02,345 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:03:03,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-30 10:03:03,889 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-30 10:03:09,022 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=676800.0, ans=0.125 2023-09-30 10:03:10,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-30 10:03:13,673 INFO [train.py:1039] (2/4) Epoch 20, batch 600, loss[loss=0.1665, simple_loss=0.2513, pruned_loss=0.04082, over 24435.00 frames. ], tot_loss[loss=0.1791, simple_loss=0.2546, pruned_loss=0.0518, over 4474928.71 frames. ], batch size: 69, lr: 5.20e-03, grad_scale: 8.0 2023-09-30 10:03:13,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-30 10:03:13,988 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:03:15,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 10:03:15,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:03:20,159 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.08 vs. limit=12.0 2023-09-30 10:03:25,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:03:25,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 10:03:28,580 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-30 10:03:31,587 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:03:33,018 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.492e+02 1.840e+02 2.105e+02 2.465e+02 3.570e+02, threshold=4.210e+02, percent-clipped=0.0 2023-09-30 10:03:33,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:03:36,131 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:03:37,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-30 10:03:37,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:03:39,484 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=676933.3333333334, ans=0.125 2023-09-30 10:03:42,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-30 10:03:44,323 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=677000.0, ans=0.125 2023-09-30 10:03:46,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:03:46,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:03:48,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:03:54,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:03:54,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:03:54,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:04:00,059 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.49 vs. limit=6.0 2023-09-30 10:04:02,071 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:04:06,628 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:04:06,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:04:06,658 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:04:07,154 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=677066.6666666666, ans=0.0 2023-09-30 10:04:12,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-30 10:04:15,299 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.29 vs. limit=15.0 2023-09-30 10:04:18,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-30 10:04:18,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:04:23,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-30 10:04:25,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:04:26,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-30 10:04:27,017 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:04:28,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:04:34,469 INFO [train.py:1039] (2/4) Epoch 20, batch 650, loss[loss=0.1925, simple_loss=0.2575, pruned_loss=0.06381, over 23806.00 frames. ], tot_loss[loss=0.1788, simple_loss=0.2539, pruned_loss=0.05184, over 4523275.77 frames. ], batch size: 164, lr: 5.20e-03, grad_scale: 8.0 2023-09-30 10:04:36,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 10:04:37,701 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-30 10:04:39,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-30 10:04:42,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:04:43,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:04:44,200 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=677200.0, ans=0.125 2023-09-30 10:04:46,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-30 10:04:48,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:04:53,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 10:04:53,706 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:04:57,673 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:05:01,278 WARNING [train.py:1197] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-30 10:05:04,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:05:04,388 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:05:09,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:05:09,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 10:05:11,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:05:11,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:05:12,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 10:05:14,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:05:15,697 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:05:17,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 10:05:17,375 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-30 10:05:17,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:05:17,425 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:05:20,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:05:20,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:05:21,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:05:22,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:05:23,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-30 10:05:23,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:05:23,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-30 10:05:27,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-30 10:05:27,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:05:29,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 10:05:31,432 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-30 10:05:33,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-30 10:05:33,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:05:33,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:05:33,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:05:34,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:05:35,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:05:41,487 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:05:41,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:05:41,916 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=677466.6666666666, ans=0.07 2023-09-30 10:05:43,105 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:05:46,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:05:46,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 10:05:46,115 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:05:47,904 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=677466.6666666666, ans=0.1 2023-09-30 10:05:48,426 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.44 vs. limit=15.0 2023-09-30 10:05:52,713 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=677466.6666666666, ans=0.0 2023-09-30 10:05:53,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 10:05:53,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:05:53,889 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:05:55,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:05:56,756 INFO [train.py:1039] (2/4) Epoch 20, batch 700, loss[loss=0.1823, simple_loss=0.2656, pruned_loss=0.04951, over 24698.00 frames. ], tot_loss[loss=0.1775, simple_loss=0.2522, pruned_loss=0.05134, over 4551070.17 frames. ], batch size: 73, lr: 5.20e-03, grad_scale: 8.0 2023-09-30 10:06:00,602 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-30 10:06:02,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-30 10:06:03,103 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=677533.3333333334, ans=0.95 2023-09-30 10:06:04,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-30 10:06:04,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:06:06,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:06:08,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-30 10:06:14,054 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:06:15,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:06:17,045 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.862e+02 2.095e+02 2.460e+02 3.900e+02, threshold=4.189e+02, percent-clipped=0.0 2023-09-30 10:06:18,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:06:18,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-30 10:06:20,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:06:23,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:06:24,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 10:06:25,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:06:26,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-30 10:06:26,845 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=677600.0, ans=0.0 2023-09-30 10:06:29,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-30 10:06:35,627 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:06:35,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:06:37,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:06:41,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:06:41,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-30 10:06:45,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:06:46,199 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=677733.3333333334, ans=0.125 2023-09-30 10:06:46,203 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=677733.3333333334, ans=0.07 2023-09-30 10:06:47,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 10:06:47,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-30 10:06:47,720 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=677733.3333333334, ans=0.125 2023-09-30 10:06:50,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:06:52,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:06:55,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:07:00,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:07:00,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-30 10:07:06,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-30 10:07:06,677 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-30 10:07:10,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:07:10,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:07:11,978 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:07:13,575 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:07:13,584 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-30 10:07:18,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-30 10:07:18,323 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-30 10:07:18,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-30 10:07:19,749 INFO [train.py:1039] (2/4) Epoch 20, batch 750, loss[loss=0.1827, simple_loss=0.2517, pruned_loss=0.05687, over 23485.00 frames. ], tot_loss[loss=0.1771, simple_loss=0.2519, pruned_loss=0.05112, over 4590743.82 frames. ], batch size: 285, lr: 5.19e-03, grad_scale: 8.0 2023-09-30 10:07:21,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-30 10:07:21,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-30 10:07:21,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:07:21,559 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=677866.6666666666, ans=0.125 2023-09-30 10:07:23,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-30 10:07:23,328 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=677866.6666666666, ans=0.0 2023-09-30 10:07:24,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:07:24,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-30 10:07:27,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:07:30,609 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:07:30,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-30 10:07:30,942 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=677866.6666666666, ans=0.125 2023-09-30 10:07:32,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:07:33,885 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:07:35,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 10:07:36,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:07:40,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:07:40,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:07:40,782 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-30 10:07:43,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-30 10:07:43,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:07:45,245 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:07:46,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-30 10:07:47,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-30 10:07:47,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:07:49,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-30 10:07:49,940 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-30 10:07:50,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-30 10:07:50,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-30 10:07:50,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 10:07:53,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:07:55,182 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=678000.0, ans=0.0 2023-09-30 10:07:59,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-30 10:07:59,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:07:59,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:07:59,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=678000.0, ans=0.2 2023-09-30 10:08:02,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:08:04,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:08:04,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-30 10:08:05,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:08:05,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-30 10:08:07,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:08:07,615 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=678066.6666666666, ans=0.1 2023-09-30 10:08:11,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:08:12,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-30 10:08:12,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:08:19,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:08:22,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:08:22,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:08:24,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 10:08:28,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-30 10:08:28,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:08:28,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:08:32,141 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:08:32,434 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=678133.3333333334, ans=0.125 2023-09-30 10:08:33,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:08:36,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:08:37,986 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-30 10:08:40,976 INFO [train.py:1039] (2/4) Epoch 20, batch 800, loss[loss=0.1634, simple_loss=0.2449, pruned_loss=0.04101, over 24653.00 frames. ], tot_loss[loss=0.1773, simple_loss=0.2525, pruned_loss=0.05105, over 4625962.86 frames. ], batch size: 65, lr: 5.19e-03, grad_scale: 16.0 2023-09-30 10:08:42,986 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=678200.0, ans=0.125 2023-09-30 10:08:44,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:08:44,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:08:47,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:08:47,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:08:48,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:08:48,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:08:50,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:08:54,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:08:55,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:08:58,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-30 10:09:00,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:09:01,437 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.508e+02 1.889e+02 2.125e+02 2.539e+02 3.349e+02, threshold=4.249e+02, percent-clipped=0.0 2023-09-30 10:09:01,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:09:01,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:09:03,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:09:03,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-30 10:09:03,142 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:09:03,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-30 10:09:07,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:09:09,590 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=678266.6666666666, ans=0.125 2023-09-30 10:09:10,691 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:09:12,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:09:12,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:09:16,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:09:16,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:09:17,110 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=678333.3333333334, ans=0.125 2023-09-30 10:09:22,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=678333.3333333334, ans=0.125 2023-09-30 10:09:23,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:09:23,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:09:24,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-30 10:09:27,126 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-30 10:09:27,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-30 10:09:27,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 10:09:27,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:09:29,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:09:31,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:09:36,365 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-30 10:09:37,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-30 10:09:39,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-30 10:09:39,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 10:09:42,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:09:45,860 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:09:45,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-30 10:09:47,407 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:09:51,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-30 10:09:58,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:10:00,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:10:02,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-30 10:10:04,510 INFO [train.py:1039] (2/4) Epoch 20, batch 850, loss[loss=0.1587, simple_loss=0.233, pruned_loss=0.04216, over 24335.00 frames. ], tot_loss[loss=0.178, simple_loss=0.2529, pruned_loss=0.05154, over 4643496.65 frames. ], batch size: 61, lr: 5.19e-03, grad_scale: 16.0 2023-09-30 10:10:04,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:10:04,789 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:10:07,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-30 10:10:07,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:10:07,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:10:08,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:10:10,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:10:11,752 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:10:13,269 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-30 10:10:13,345 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-30 10:10:13,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-30 10:10:14,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:10:14,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:10:17,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:10:17,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:10:17,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 10:10:21,377 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=678600.0, ans=0.125 2023-09-30 10:10:23,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:10:23,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:10:25,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-30 10:10:27,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-30 10:10:30,815 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:10:32,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-30 10:10:38,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-30 10:10:38,533 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.22 vs. limit=22.5 2023-09-30 10:10:39,547 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-30 10:10:43,214 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-30 10:10:43,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:10:43,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:10:43,265 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 10:10:46,375 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:10:46,633 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=678666.6666666666, ans=0.0 2023-09-30 10:10:47,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:10:47,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-30 10:10:49,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:10:51,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:10:51,174 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:10:51,220 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-30 10:10:52,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:10:54,738 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=678733.3333333334, ans=0.125 2023-09-30 10:10:55,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-30 10:10:55,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-30 10:10:58,176 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=678733.3333333334, ans=0.0 2023-09-30 10:11:00,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:11:00,913 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:11:02,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:11:02,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:11:03,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:11:08,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:11:09,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-30 10:11:13,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-30 10:11:13,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:11:14,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-30 10:11:23,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-30 10:11:24,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:11:24,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-30 10:11:26,247 INFO [train.py:1039] (2/4) Epoch 20, batch 900, loss[loss=0.1437, simple_loss=0.2199, pruned_loss=0.03373, over 24311.00 frames. ], tot_loss[loss=0.1791, simple_loss=0.2537, pruned_loss=0.05222, over 4666463.95 frames. ], batch size: 56, lr: 5.19e-03, grad_scale: 16.0 2023-09-30 10:11:26,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:11:26,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:11:27,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-30 10:11:36,041 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:11:37,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:11:39,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-30 10:11:40,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:11:40,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-30 10:11:43,978 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-30 10:11:45,398 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.863e+02 2.352e+02 2.808e+02 3.950e+02, threshold=4.705e+02, percent-clipped=0.0 2023-09-30 10:11:45,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:11:45,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:11:45,609 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:11:45,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:11:55,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:11:55,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:11:55,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 10:11:59,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:12:05,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-30 10:12:05,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:12:13,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-30 10:12:13,879 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=679066.6666666666, ans=0.1 2023-09-30 10:12:15,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-30 10:12:15,169 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-30 10:12:16,678 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-30 10:12:22,119 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-30 10:12:22,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:12:22,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 10:12:22,876 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.68 vs. limit=6.0 2023-09-30 10:12:27,687 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=679066.6666666666, ans=0.125 2023-09-30 10:12:28,928 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:12:30,325 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:12:31,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-30 10:12:31,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:12:34,763 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-30 10:12:36,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:12:37,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:12:39,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:12:39,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:12:43,017 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-30 10:12:43,070 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-30 10:12:43,252 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-30 10:12:43,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-30 10:12:47,492 INFO [train.py:1039] (2/4) Epoch 20, batch 950, loss[loss=0.1838, simple_loss=0.242, pruned_loss=0.0628, over 23431.00 frames. ], tot_loss[loss=0.1795, simple_loss=0.2542, pruned_loss=0.05239, over 4678013.41 frames. ], batch size: 285, lr: 5.19e-03, grad_scale: 16.0 2023-09-30 10:12:47,674 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:12:50,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-30 10:12:55,042 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=679200.0, ans=0.2 2023-09-30 10:12:56,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:13:00,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:13:00,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:13:00,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 10:13:03,204 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-30 10:13:05,521 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.91 vs. limit=15.0 2023-09-30 10:13:06,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:13:07,891 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:13:07,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:13:08,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:13:09,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-30 10:13:11,582 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-30 10:13:13,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:13:13,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-30 10:13:14,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:13:18,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:13:18,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:13:18,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:13:19,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-30 10:13:22,621 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 10:13:24,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:13:26,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:13:32,258 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:13:32,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:13:36,156 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-30 10:13:37,727 WARNING [train.py:1197] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 10:13:37,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 10:13:39,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:13:40,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:13:40,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:13:44,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-30 10:13:46,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:13:47,709 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:13:47,797 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:13:47,823 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-30 10:13:49,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:13:49,298 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:13:49,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-30 10:13:49,547 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:13:53,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 10:13:54,170 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=679466.6666666666, ans=0.125 2023-09-30 10:13:55,985 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=679466.6666666666, ans=0.125 2023-09-30 10:13:57,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:14:00,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:14:02,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-30 10:14:02,481 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-30 10:14:07,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:14:10,704 INFO [train.py:1039] (2/4) Epoch 20, batch 1000, loss[loss=0.171, simple_loss=0.2578, pruned_loss=0.04211, over 24544.00 frames. ], tot_loss[loss=0.1792, simple_loss=0.2534, pruned_loss=0.05249, over 4682455.64 frames. ], batch size: 71, lr: 5.19e-03, grad_scale: 16.0 2023-09-30 10:14:11,178 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=679533.3333333334, ans=0.125 2023-09-30 10:14:13,645 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-30 10:14:15,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:14:19,302 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=679533.3333333334, ans=0.125 2023-09-30 10:14:20,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:14:20,531 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-30 10:14:20,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-30 10:14:20,711 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=679533.3333333334, ans=0.125 2023-09-30 10:14:25,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:14:25,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:14:28,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:14:29,654 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.560e+02 1.877e+02 1.981e+02 2.279e+02 3.470e+02, threshold=3.963e+02, percent-clipped=0.0 2023-09-30 10:14:29,920 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-30 10:14:36,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-30 10:14:37,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-30 10:14:37,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:14:38,118 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-30 10:14:39,710 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-30 10:14:41,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-30 10:14:43,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:14:45,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:14:55,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:14:55,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:14:55,632 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=679666.6666666666, ans=0.2 2023-09-30 10:14:56,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:14:56,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:14:56,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-30 10:14:56,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:14:57,052 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:14:58,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:14:58,572 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-30 10:15:01,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-30 10:15:04,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-30 10:15:04,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-30 10:15:06,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:15:15,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:15:15,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:15:15,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:15:17,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:15:17,559 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=679800.0, ans=0.025 2023-09-30 10:15:19,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-30 10:15:21,131 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:15:21,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-30 10:15:22,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-30 10:15:24,162 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:15:24,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:15:26,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:15:30,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 10:15:31,172 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:15:34,045 INFO [train.py:1039] (2/4) Epoch 20, batch 1050, loss[loss=0.1642, simple_loss=0.2528, pruned_loss=0.03781, over 24316.00 frames. ], tot_loss[loss=0.1774, simple_loss=0.2515, pruned_loss=0.05158, over 4688232.56 frames. ], batch size: 74, lr: 5.19e-03, grad_scale: 16.0 2023-09-30 10:15:35,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:15:37,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:15:40,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 10:15:41,075 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:15:44,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:15:46,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:15:48,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:15:50,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:15:52,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:15:52,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:15:53,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:15:55,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-30 10:15:55,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:15:55,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-30 10:15:58,995 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:15:59,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-30 10:15:59,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-30 10:16:04,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:16:04,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-30 10:16:04,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:16:07,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-30 10:16:07,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-30 10:16:09,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:16:11,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-30 10:16:13,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-30 10:16:14,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:16:15,819 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.46 vs. limit=6.0 2023-09-30 10:16:19,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 10:16:22,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-30 10:16:23,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:16:23,827 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.53 vs. limit=6.0 2023-09-30 10:16:24,515 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:16:29,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:16:32,932 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-30 10:16:35,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-30 10:16:35,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-30 10:16:35,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:16:35,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:16:38,119 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-30 10:16:41,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:16:42,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:16:42,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:16:44,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:16:44,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:16:46,771 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=680133.3333333334, ans=0.125 2023-09-30 10:16:48,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:16:48,175 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-30 10:16:49,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:16:49,786 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-30 10:16:49,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-30 10:16:51,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:16:55,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:16:58,139 INFO [train.py:1039] (2/4) Epoch 20, batch 1100, loss[loss=0.1753, simple_loss=0.2435, pruned_loss=0.05348, over 23778.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.2511, pruned_loss=0.05108, over 4690825.76 frames. ], batch size: 164, lr: 5.19e-03, grad_scale: 16.0 2023-09-30 10:17:01,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:17:06,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 10:17:09,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:17:10,363 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:17:10,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-30 10:17:12,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:17:15,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-30 10:17:16,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:17:17,916 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.800e+02 1.958e+02 2.202e+02 3.142e+02, threshold=3.917e+02, percent-clipped=0.0 2023-09-30 10:17:19,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:17:19,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-30 10:17:21,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 10:17:23,256 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:17:23,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:17:26,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:17:27,897 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-30 10:17:33,918 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:17:36,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-30 10:17:36,464 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-30 10:17:37,386 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.40 vs. limit=15.0 2023-09-30 10:17:37,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:17:40,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:17:41,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-30 10:17:41,609 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:17:43,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-30 10:17:43,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:17:43,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:17:44,012 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=680333.3333333334, ans=0.125 2023-09-30 10:17:45,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:17:45,109 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:17:45,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-30 10:17:49,996 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:17:50,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-30 10:17:53,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 10:17:59,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 10:18:02,814 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-30 10:18:04,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-30 10:18:05,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:18:08,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:18:08,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:18:11,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-30 10:18:12,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:18:12,606 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:18:14,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-30 10:18:14,652 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:18:16,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-30 10:18:16,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:18:16,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:18:17,124 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.57 vs. limit=22.5 2023-09-30 10:18:17,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:18:21,501 INFO [train.py:1039] (2/4) Epoch 20, batch 1150, loss[loss=0.1681, simple_loss=0.2398, pruned_loss=0.04822, over 18348.00 frames. ], tot_loss[loss=0.1763, simple_loss=0.2511, pruned_loss=0.05072, over 4687337.15 frames. ], batch size: 40, lr: 5.18e-03, grad_scale: 16.0 2023-09-30 10:18:24,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:18:28,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:18:30,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:18:30,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:18:30,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-30 10:18:30,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:18:34,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-30 10:18:36,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:18:36,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 10:18:40,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-30 10:18:43,855 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:18:47,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:18:48,909 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:18:48,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-30 10:18:48,994 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-30 10:18:50,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:18:53,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-30 10:18:55,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:18:55,900 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=680666.6666666666, ans=0.125 2023-09-30 10:18:57,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:18:57,367 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=680666.6666666666, ans=0.0 2023-09-30 10:18:57,778 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.15 vs. limit=15.0 2023-09-30 10:19:06,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:19:15,919 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:19:17,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-30 10:19:17,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:19:17,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:19:26,220 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-30 10:19:27,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:19:33,524 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-30 10:19:38,163 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:19:38,332 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:19:39,725 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-30 10:19:39,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:19:43,292 INFO [train.py:1039] (2/4) Epoch 20, batch 1200, loss[loss=0.1691, simple_loss=0.2412, pruned_loss=0.04846, over 23459.00 frames. ], tot_loss[loss=0.1763, simple_loss=0.2515, pruned_loss=0.0505, over 4695454.56 frames. ], batch size: 285, lr: 5.18e-03, grad_scale: 32.0 2023-09-30 10:19:44,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:19:49,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:19:49,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:19:51,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:19:51,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:19:52,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:19:53,057 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=680866.6666666666, ans=0.2 2023-09-30 10:19:55,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:19:57,299 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 10:19:59,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:19:59,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:20:02,633 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.868e+02 2.067e+02 2.397e+02 3.713e+02, threshold=4.134e+02, percent-clipped=0.0 2023-09-30 10:20:02,854 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-30 10:20:04,576 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-30 10:20:08,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:20:11,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:20:14,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:20:15,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:20:15,813 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-30 10:20:17,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:20:25,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-30 10:20:25,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:20:25,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-30 10:20:27,520 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:20:29,612 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=681000.0, ans=0.0 2023-09-30 10:20:30,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-30 10:20:34,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-30 10:20:34,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:20:36,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:20:38,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:20:38,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-30 10:20:39,141 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=681066.6666666666, ans=0.2 2023-09-30 10:20:41,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:20:41,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-30 10:20:43,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:20:43,406 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-30 10:20:44,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 10:20:44,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-30 10:20:44,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 10:20:48,003 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:20:48,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:20:52,503 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-30 10:20:55,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:20:57,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-30 10:21:01,140 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-30 10:21:03,981 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:21:05,359 INFO [train.py:1039] (2/4) Epoch 20, batch 1250, loss[loss=0.1938, simple_loss=0.2572, pruned_loss=0.0652, over 23635.00 frames. ], tot_loss[loss=0.1775, simple_loss=0.2528, pruned_loss=0.05116, over 4686248.40 frames. ], batch size: 232, lr: 5.18e-03, grad_scale: 16.0 2023-09-30 10:21:06,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-30 10:21:07,255 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=681200.0, ans=0.09899494936611666 2023-09-30 10:21:09,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:21:10,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:21:14,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-30 10:21:17,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:21:19,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:21:19,869 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=10.32 vs. limit=15.0 2023-09-30 10:21:20,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-30 10:21:23,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:21:25,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 10:21:29,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 10:21:30,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:21:31,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 10:21:31,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:21:33,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-30 10:21:38,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 10:21:38,577 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-30 10:21:38,597 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:21:40,212 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:21:41,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:21:43,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:21:45,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-30 10:21:51,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-30 10:21:51,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:21:53,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:21:54,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-30 10:21:54,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:21:54,652 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-30 10:21:56,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:21:56,087 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:21:58,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:22:02,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:22:02,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:22:04,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-30 10:22:05,702 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-30 10:22:05,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-30 10:22:08,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:22:09,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-30 10:22:11,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:22:15,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-30 10:22:15,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:22:16,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-30 10:22:16,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-30 10:22:18,224 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:22:18,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-30 10:22:18,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:22:19,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-30 10:22:23,917 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:22:25,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:22:27,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 10:22:28,709 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-30 10:22:30,059 INFO [train.py:1039] (2/4) Epoch 20, batch 1300, loss[loss=0.1504, simple_loss=0.2268, pruned_loss=0.037, over 24440.00 frames. ], tot_loss[loss=0.1781, simple_loss=0.2533, pruned_loss=0.05146, over 4688244.81 frames. ], batch size: 58, lr: 5.18e-03, grad_scale: 8.0 2023-09-30 10:22:31,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:22:32,193 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=681533.3333333334, ans=0.2 2023-09-30 10:22:33,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-30 10:22:37,986 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:22:41,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-30 10:22:41,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:22:44,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:22:44,655 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:22:46,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-30 10:22:52,544 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.925e+02 2.165e+02 2.491e+02 3.486e+02, threshold=4.330e+02, percent-clipped=0.0 2023-09-30 10:22:52,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:22:54,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-30 10:22:56,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-30 10:22:57,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 10:22:59,656 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=681600.0, ans=0.1 2023-09-30 10:23:03,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:23:04,534 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:23:06,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:23:06,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:23:07,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 10:23:07,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-30 10:23:07,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-30 10:23:14,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:23:16,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 10:23:17,650 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-30 10:23:17,752 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 10:23:19,572 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=681733.3333333334, ans=0.125 2023-09-30 10:23:20,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:23:23,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:23:23,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-30 10:23:25,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:23:25,759 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-30 10:23:27,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:23:31,131 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:23:31,135 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:23:34,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-30 10:23:34,364 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-30 10:23:36,307 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-30 10:23:40,964 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:23:43,986 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-30 10:23:45,633 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:23:47,904 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=681800.0, ans=0.5 2023-09-30 10:23:52,105 INFO [train.py:1039] (2/4) Epoch 20, batch 1350, loss[loss=0.1736, simple_loss=0.2316, pruned_loss=0.05781, over 23613.00 frames. ], tot_loss[loss=0.1778, simple_loss=0.2525, pruned_loss=0.05153, over 4686407.70 frames. ], batch size: 256, lr: 5.18e-03, grad_scale: 8.0 2023-09-30 10:23:52,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-30 10:23:56,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:23:57,229 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=681866.6666666666, ans=0.125 2023-09-30 10:24:00,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:24:03,995 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:24:04,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:24:05,757 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=681866.6666666666, ans=0.0 2023-09-30 10:24:06,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:24:07,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-30 10:24:10,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-30 10:24:12,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-30 10:24:15,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-30 10:24:15,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:24:18,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-30 10:24:18,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:24:19,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:24:19,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-30 10:24:21,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-30 10:24:24,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-30 10:24:27,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:24:27,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-30 10:24:40,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:24:50,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:24:50,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:24:50,182 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-30 10:24:53,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:24:54,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-30 10:24:54,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-30 10:24:54,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:24:57,758 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys.whitening_limit, batch_count=682133.3333333334, ans=6.0 2023-09-30 10:24:58,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:25:01,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-30 10:25:01,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:25:03,485 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=682133.3333333334, ans=0.2 2023-09-30 10:25:08,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-30 10:25:09,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-30 10:25:14,415 INFO [train.py:1039] (2/4) Epoch 20, batch 1400, loss[loss=0.1609, simple_loss=0.2395, pruned_loss=0.04114, over 24294.00 frames. ], tot_loss[loss=0.1765, simple_loss=0.2512, pruned_loss=0.05093, over 4686318.98 frames. ], batch size: 61, lr: 5.18e-03, grad_scale: 8.0 2023-09-30 10:25:16,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-30 10:25:18,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:25:21,441 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:25:23,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:25:29,102 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-30 10:25:31,231 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.53 vs. limit=15.0 2023-09-30 10:25:32,648 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-30 10:25:37,259 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.836e+02 1.987e+02 2.257e+02 3.606e+02, threshold=3.975e+02, percent-clipped=0.0 2023-09-30 10:25:37,821 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=682266.6666666666, ans=0.0 2023-09-30 10:25:39,399 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=682266.6666666666, ans=0.125 2023-09-30 10:25:40,773 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=682266.6666666666, ans=0.2 2023-09-30 10:25:42,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:25:42,859 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=682266.6666666666, ans=0.0 2023-09-30 10:25:44,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:25:47,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:25:47,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-30 10:25:51,276 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=682333.3333333334, ans=0.125 2023-09-30 10:25:52,478 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:25:52,604 WARNING [train.py:1197] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 10:26:03,873 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:26:03,969 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:26:05,644 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=682400.0, ans=0.0 2023-09-30 10:26:05,684 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=682400.0, ans=0.125 2023-09-30 10:26:09,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-30 10:26:09,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-30 10:26:09,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:26:10,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:26:10,965 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:26:11,310 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=682400.0, ans=0.0 2023-09-30 10:26:12,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:26:12,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:26:13,843 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:26:14,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-30 10:26:14,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:26:21,352 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=682466.6666666666, ans=0.0 2023-09-30 10:26:22,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:26:25,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:26:27,464 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=682466.6666666666, ans=0.125 2023-09-30 10:26:31,159 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-30 10:26:32,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 10:26:34,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:26:35,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 10:26:35,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:26:37,954 INFO [train.py:1039] (2/4) Epoch 20, batch 1450, loss[loss=0.1861, simple_loss=0.2536, pruned_loss=0.0593, over 23835.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.251, pruned_loss=0.05086, over 4692925.54 frames. ], batch size: 212, lr: 5.18e-03, grad_scale: 8.0 2023-09-30 10:26:38,147 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:26:40,008 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=682533.3333333334, ans=0.1 2023-09-30 10:26:41,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-30 10:26:44,252 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:26:44,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:26:44,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-30 10:26:50,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:26:51,058 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 10:26:53,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:26:53,171 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-30 10:26:54,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 10:26:56,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-30 10:26:56,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:26:56,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:26:56,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-30 10:26:59,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:26:59,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:27:00,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 10:27:00,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:27:03,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:27:04,638 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:27:06,400 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:27:07,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:27:11,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:27:11,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:27:16,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:27:16,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:27:16,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:27:17,821 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:27:17,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:27:17,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:27:23,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-30 10:27:26,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:27:31,130 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-30 10:27:32,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:27:32,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:27:34,437 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:27:36,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-30 10:27:39,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:27:39,979 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=682733.3333333334, ans=0.0 2023-09-30 10:27:41,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-30 10:27:42,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-30 10:27:43,080 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:27:46,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:27:47,713 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:27:51,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-30 10:27:51,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-30 10:27:52,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-30 10:27:54,217 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:27:54,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 10:28:01,507 INFO [train.py:1039] (2/4) Epoch 20, batch 1500, loss[loss=0.1787, simple_loss=0.2638, pruned_loss=0.04682, over 24406.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2513, pruned_loss=0.05079, over 4701299.57 frames. ], batch size: 77, lr: 5.18e-03, grad_scale: 8.0 2023-09-30 10:28:06,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-30 10:28:07,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-30 10:28:07,968 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:28:09,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:28:09,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:28:11,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:28:11,278 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=682866.6666666666, ans=0.0 2023-09-30 10:28:12,536 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-30 10:28:14,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 10:28:14,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-30 10:28:14,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:28:16,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:28:16,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:28:17,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:28:18,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=682933.3333333334, ans=0.0 2023-09-30 10:28:24,678 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.891e+02 2.112e+02 2.423e+02 4.358e+02, threshold=4.223e+02, percent-clipped=4.0 2023-09-30 10:28:24,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:28:24,873 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-30 10:28:24,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-30 10:28:25,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:28:26,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:28:29,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-30 10:28:29,854 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:28:33,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-30 10:28:35,169 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:28:36,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-30 10:28:39,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-30 10:28:40,096 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=683000.0, ans=0.125 2023-09-30 10:28:40,602 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.36 vs. limit=15.0 2023-09-30 10:28:41,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:28:42,899 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:28:42,935 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:28:44,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-30 10:28:45,900 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:28:45,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:28:47,307 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-30 10:28:47,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:28:54,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:28:54,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-30 10:28:59,219 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 10:29:02,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 10:29:06,572 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-30 10:29:06,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:29:06,666 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-30 10:29:09,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:29:12,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:29:12,110 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-30 10:29:12,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-30 10:29:16,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-30 10:29:18,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:29:21,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:29:21,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:29:21,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:29:22,789 INFO [train.py:1039] (2/4) Epoch 20, batch 1550, loss[loss=0.1917, simple_loss=0.2608, pruned_loss=0.06127, over 23269.00 frames. ], tot_loss[loss=0.1765, simple_loss=0.252, pruned_loss=0.05045, over 4714082.62 frames. ], batch size: 119, lr: 5.17e-03, grad_scale: 8.0 2023-09-30 10:29:22,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:29:22,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:29:26,479 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-30 10:29:26,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-30 10:29:26,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:29:28,021 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-30 10:29:28,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-30 10:29:31,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:29:33,491 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:29:33,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:29:33,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:29:35,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:29:35,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:29:36,956 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-30 10:29:38,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:29:38,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:29:39,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 10:29:42,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-30 10:29:43,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-30 10:29:43,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:29:43,828 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-30 10:29:45,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-30 10:29:45,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-30 10:29:47,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:29:49,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:29:52,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:29:55,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-30 10:29:55,219 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-30 10:30:05,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:30:08,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:30:08,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-30 10:30:08,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:30:10,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-30 10:30:15,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 10:30:18,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:30:20,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:30:23,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:30:23,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:30:24,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-30 10:30:25,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:30:26,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:30:27,080 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=683400.0, ans=0.0 2023-09-30 10:30:28,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:30:28,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-30 10:30:29,821 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-30 10:30:31,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:30:38,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-30 10:30:44,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:30:45,996 INFO [train.py:1039] (2/4) Epoch 20, batch 1600, loss[loss=0.2036, simple_loss=0.2739, pruned_loss=0.0667, over 23602.00 frames. ], tot_loss[loss=0.1769, simple_loss=0.2526, pruned_loss=0.05058, over 4703269.70 frames. ], batch size: 256, lr: 5.17e-03, grad_scale: 16.0 2023-09-30 10:30:46,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:30:46,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-30 10:30:47,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:30:49,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:30:49,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 10:30:49,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:30:50,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:30:51,318 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=683533.3333333334, ans=0.1 2023-09-30 10:30:54,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:30:54,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-30 10:30:54,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-30 10:30:56,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-30 10:30:58,709 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:31:01,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-30 10:31:03,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:31:04,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:31:06,340 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=683600.0, ans=0.0 2023-09-30 10:31:09,451 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.487e+02 1.873e+02 2.054e+02 2.261e+02 3.957e+02, threshold=4.107e+02, percent-clipped=0.0 2023-09-30 10:31:11,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:31:14,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-30 10:31:16,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:31:17,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-30 10:31:19,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:31:19,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-30 10:31:25,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-30 10:31:34,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:31:35,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-30 10:31:37,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:31:37,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:31:37,208 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:31:40,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-30 10:31:42,145 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=683733.3333333334, ans=0.2 2023-09-30 10:31:45,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 10:31:47,034 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:31:48,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:31:48,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:31:48,587 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:31:49,221 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.11 vs. limit=15.0 2023-09-30 10:31:52,217 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:31:53,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:31:55,200 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 10:31:55,799 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.13 vs. limit=22.5 2023-09-30 10:32:00,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:32:00,199 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=683800.0, ans=0.2 2023-09-30 10:32:02,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:32:02,236 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=683800.0, ans=0.125 2023-09-30 10:32:04,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-30 10:32:04,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:32:05,133 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-30 10:32:08,691 INFO [train.py:1039] (2/4) Epoch 20, batch 1650, loss[loss=0.1765, simple_loss=0.2469, pruned_loss=0.05304, over 23615.00 frames. ], tot_loss[loss=0.1777, simple_loss=0.2537, pruned_loss=0.05087, over 4706323.92 frames. ], batch size: 149, lr: 5.17e-03, grad_scale: 16.0 2023-09-30 10:32:09,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:32:09,224 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=683866.6666666666, ans=0.125 2023-09-30 10:32:11,548 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=10.25 vs. limit=15.0 2023-09-30 10:32:11,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:32:12,365 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=683866.6666666666, ans=0.1 2023-09-30 10:32:13,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:32:13,405 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-30 10:32:13,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-30 10:32:13,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-30 10:32:13,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-30 10:32:17,266 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten.whitening_limit, batch_count=683866.6666666666, ans=15.0 2023-09-30 10:32:18,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:32:20,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:32:20,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:32:20,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-30 10:32:21,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:32:24,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-30 10:32:27,341 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:32:27,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:32:27,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:32:27,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:32:28,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-30 10:32:28,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-30 10:32:35,797 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 10:32:37,499 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=683933.3333333334, ans=0.0 2023-09-30 10:32:38,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-30 10:32:47,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-30 10:32:47,858 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:32:48,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:32:49,353 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=684000.0, ans=0.1 2023-09-30 10:32:51,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-30 10:32:55,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:32:58,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:32:58,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:32:58,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:33:00,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:33:00,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:33:04,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:33:04,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:33:06,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:33:06,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:33:07,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:33:09,376 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=684066.6666666666, ans=0.125 2023-09-30 10:33:10,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:33:12,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:33:13,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-30 10:33:15,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:33:16,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-30 10:33:17,952 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=684133.3333333334, ans=0.1 2023-09-30 10:33:19,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-30 10:33:19,141 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-30 10:33:19,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:33:19,574 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=684133.3333333334, ans=0.05 2023-09-30 10:33:20,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:33:20,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:33:20,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:33:20,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-30 10:33:24,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:33:25,626 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:33:25,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:33:30,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-30 10:33:30,563 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=684200.0, ans=0.125 2023-09-30 10:33:32,085 INFO [train.py:1039] (2/4) Epoch 20, batch 1700, loss[loss=0.1725, simple_loss=0.2528, pruned_loss=0.04615, over 24650.00 frames. ], tot_loss[loss=0.1781, simple_loss=0.2538, pruned_loss=0.05124, over 4692718.12 frames. ], batch size: 65, lr: 5.17e-03, grad_scale: 16.0 2023-09-30 10:33:33,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:33:33,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:33:33,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-30 10:33:36,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:33:36,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:33:36,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:33:38,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:33:38,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:33:39,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-30 10:33:41,688 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 10:33:51,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:33:54,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:33:55,786 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.477e+02 1.837e+02 2.036e+02 2.305e+02 3.271e+02, threshold=4.073e+02, percent-clipped=0.0 2023-09-30 10:33:59,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-30 10:33:59,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:34:00,601 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:34:00,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:34:05,048 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-30 10:34:05,317 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:34:05,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:34:08,091 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.60 vs. limit=15.0 2023-09-30 10:34:09,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-30 10:34:11,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-30 10:34:12,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-30 10:34:12,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-30 10:34:14,231 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:34:15,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-30 10:34:17,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:34:24,853 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.68 vs. limit=22.5 2023-09-30 10:34:27,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:34:27,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:34:28,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:34:30,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-30 10:34:30,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-30 10:34:31,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:34:33,396 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:34:33,397 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-30 10:34:34,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:34:34,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:34:36,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:34:36,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:34:36,762 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=684466.6666666666, ans=0.2 2023-09-30 10:34:39,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:34:39,813 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:34:41,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:34:43,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:34:43,445 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:34:46,115 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=684466.6666666666, ans=0.125 2023-09-30 10:34:47,320 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:34:48,866 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-30 10:34:49,233 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=684466.6666666666, ans=0.1 2023-09-30 10:34:50,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:34:50,871 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=684466.6666666666, ans=0.07 2023-09-30 10:34:51,947 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:34:53,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-30 10:34:54,025 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:34:55,008 INFO [train.py:1039] (2/4) Epoch 20, batch 1750, loss[loss=0.1728, simple_loss=0.2483, pruned_loss=0.0486, over 23252.00 frames. ], tot_loss[loss=0.1768, simple_loss=0.2527, pruned_loss=0.05048, over 4713382.15 frames. ], batch size: 105, lr: 5.17e-03, grad_scale: 16.0 2023-09-30 10:34:55,573 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=684533.3333333334, ans=0.2 2023-09-30 10:35:00,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:35:01,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:35:03,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-30 10:35:03,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-30 10:35:03,533 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:35:06,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:35:06,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:35:09,905 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=684600.0, ans=0.125 2023-09-30 10:35:11,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-30 10:35:13,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:35:15,767 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=684600.0, ans=0.125 2023-09-30 10:35:15,772 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=684600.0, ans=0.125 2023-09-30 10:35:17,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-30 10:35:17,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:35:19,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:35:20,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 10:35:22,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-30 10:35:24,135 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:35:25,490 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-30 10:35:27,751 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.03 vs. limit=12.0 2023-09-30 10:35:31,216 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=684666.6666666666, ans=0.125 2023-09-30 10:35:32,402 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:35:35,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:35:35,378 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:35:38,395 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:35:38,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:35:40,165 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:35:43,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:35:43,929 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=684733.3333333334, ans=0.125 2023-09-30 10:35:46,471 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:35:46,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:35:48,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-30 10:35:50,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:35:52,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-30 10:35:54,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:35:55,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:35:55,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:35:57,679 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=684733.3333333334, ans=0.1 2023-09-30 10:35:59,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 10:36:00,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-30 10:36:01,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:36:01,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:36:02,935 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=684800.0, ans=0.125 2023-09-30 10:36:05,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:36:08,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:36:08,937 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=684800.0, ans=0.1 2023-09-30 10:36:10,206 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:36:11,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-30 10:36:11,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:36:13,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-30 10:36:13,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:36:13,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:36:14,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:36:16,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-30 10:36:16,836 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=684866.6666666666, ans=0.2 2023-09-30 10:36:18,214 INFO [train.py:1039] (2/4) Epoch 20, batch 1800, loss[loss=0.1905, simple_loss=0.2411, pruned_loss=0.06993, over 19325.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2505, pruned_loss=0.0499, over 4690846.96 frames. ], batch size: 389, lr: 5.17e-03, grad_scale: 16.0 2023-09-30 10:36:18,429 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:36:20,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:36:21,520 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=17.74 vs. limit=22.5 2023-09-30 10:36:22,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 10:36:24,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:36:27,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 10:36:30,439 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:36:32,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:36:34,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:36:35,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:36:37,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:36:37,481 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:36:38,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-30 10:36:39,003 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:36:41,811 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.858e+02 2.042e+02 2.295e+02 4.168e+02, threshold=4.085e+02, percent-clipped=1.0 2023-09-30 10:36:42,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:36:46,605 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-30 10:36:48,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-30 10:36:48,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-30 10:36:49,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:36:51,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:36:51,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:36:53,833 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:37:02,091 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-30 10:37:02,454 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=685000.0, ans=0.125 2023-09-30 10:37:03,578 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:37:05,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:37:07,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-30 10:37:07,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-30 10:37:08,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-30 10:37:10,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:37:11,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:37:15,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-30 10:37:19,829 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=685066.6666666666, ans=0.0 2023-09-30 10:37:23,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:37:24,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-30 10:37:25,520 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:37:25,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:37:25,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:37:27,092 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-30 10:37:30,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:37:30,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:37:33,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-30 10:37:33,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:37:35,543 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=685133.3333333334, ans=0.125 2023-09-30 10:37:36,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:37:36,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:37:36,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:37:38,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:37:39,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:37:41,844 INFO [train.py:1039] (2/4) Epoch 20, batch 1850, loss[loss=0.1896, simple_loss=0.272, pruned_loss=0.05357, over 24095.00 frames. ], tot_loss[loss=0.1765, simple_loss=0.2516, pruned_loss=0.05073, over 4697994.17 frames. ], batch size: 80, lr: 5.17e-03, grad_scale: 16.0 2023-09-30 10:37:42,054 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:37:42,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:37:45,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:37:45,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:37:47,386 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=685200.0, ans=0.0 2023-09-30 10:37:51,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:37:51,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-30 10:37:54,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-30 10:37:59,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-30 10:37:59,437 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=685266.6666666666, ans=0.125 2023-09-30 10:37:59,478 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=685266.6666666666, ans=0.0 2023-09-30 10:38:04,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:38:04,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-30 10:38:04,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 10:38:14,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:38:17,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-30 10:38:19,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:38:20,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:38:25,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-30 10:38:25,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:38:25,744 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 10:38:27,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:38:28,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:38:30,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:38:32,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-30 10:38:32,558 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=685400.0, ans=0.125 2023-09-30 10:38:33,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:38:33,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 10:38:33,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:38:37,180 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:38:37,560 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=685400.0, ans=0.0 2023-09-30 10:38:39,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:38:41,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-30 10:38:43,100 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:38:47,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:38:49,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:38:49,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-30 10:38:49,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-30 10:38:49,511 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=685466.6666666666, ans=0.125 2023-09-30 10:38:51,446 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-30 10:38:52,862 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-30 10:38:54,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 10:38:54,502 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:38:54,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:38:56,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:38:57,439 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-30 10:38:57,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:38:57,536 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:38:59,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-30 10:39:00,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 10:39:00,864 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:39:02,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:39:02,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-30 10:39:03,555 INFO [train.py:1039] (2/4) Epoch 20, batch 1900, loss[loss=0.1977, simple_loss=0.2764, pruned_loss=0.05952, over 24382.00 frames. ], tot_loss[loss=0.1763, simple_loss=0.2523, pruned_loss=0.05018, over 4708240.45 frames. ], batch size: 77, lr: 5.17e-03, grad_scale: 16.0 2023-09-30 10:39:05,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:39:05,103 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-30 10:39:05,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:39:06,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:39:08,562 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=685533.3333333334, ans=0.09899494936611666 2023-09-30 10:39:13,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:39:16,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:39:17,121 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-30 10:39:18,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-30 10:39:18,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:39:20,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:39:20,381 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-30 10:39:20,438 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-30 10:39:24,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-30 10:39:25,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:39:26,956 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=685600.0, ans=0.125 2023-09-30 10:39:27,715 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.492e+02 1.870e+02 2.135e+02 2.444e+02 3.596e+02, threshold=4.270e+02, percent-clipped=0.0 2023-09-30 10:39:29,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-30 10:39:32,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-30 10:39:39,099 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=685666.6666666666, ans=0.125 2023-09-30 10:39:41,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-30 10:39:45,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-30 10:39:45,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:39:45,833 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-30 10:39:45,840 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-30 10:39:47,212 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-30 10:39:47,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-30 10:39:47,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:39:52,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-30 10:39:56,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:39:59,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:39:59,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-30 10:40:01,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:40:03,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-30 10:40:03,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-30 10:40:10,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:40:10,742 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:40:10,772 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:40:12,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:40:13,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 10:40:15,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-30 10:40:15,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:40:18,431 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:40:18,433 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:40:22,299 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:40:22,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:40:23,717 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-30 10:40:23,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:40:28,013 INFO [train.py:1039] (2/4) Epoch 20, batch 1950, loss[loss=0.1806, simple_loss=0.2701, pruned_loss=0.04551, over 24288.00 frames. ], tot_loss[loss=0.1768, simple_loss=0.2528, pruned_loss=0.05036, over 4703713.80 frames. ], batch size: 74, lr: 5.16e-03, grad_scale: 16.0 2023-09-30 10:40:28,202 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 10:40:29,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:40:29,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:40:29,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 10:40:32,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-30 10:40:34,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 10:40:34,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:40:35,208 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=685866.6666666666, ans=0.125 2023-09-30 10:40:36,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:40:39,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:40:39,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:40:39,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:40:42,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:40:45,502 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 10:40:45,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 10:40:45,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:40:45,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:40:48,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:40:51,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:40:51,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:40:51,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-30 10:40:51,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-30 10:40:53,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 10:40:54,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:40:54,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:40:57,369 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=685933.3333333334, ans=0.0 2023-09-30 10:40:58,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:41:02,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:41:02,911 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=686000.0, ans=0.0 2023-09-30 10:41:05,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 10:41:06,093 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=686000.0, ans=0.0 2023-09-30 10:41:07,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:41:07,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-30 10:41:09,561 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-30 10:41:09,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:41:14,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:41:15,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:41:15,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:41:25,162 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:41:25,289 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:41:28,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:41:32,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:41:36,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:41:36,381 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:41:37,897 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-30 10:41:37,905 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 10:41:38,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:41:39,591 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-30 10:41:41,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:41:46,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:41:47,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:41:47,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:41:50,519 INFO [train.py:1039] (2/4) Epoch 20, batch 2000, loss[loss=0.1805, simple_loss=0.2656, pruned_loss=0.04774, over 23988.00 frames. ], tot_loss[loss=0.1774, simple_loss=0.2532, pruned_loss=0.05082, over 4698951.90 frames. ], batch size: 80, lr: 5.16e-03, grad_scale: 32.0 2023-09-30 10:41:50,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:41:53,629 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:41:55,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-30 10:41:56,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-30 10:41:59,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:42:02,789 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-30 10:42:04,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 10:42:04,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:42:06,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:42:06,823 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=686266.6666666666, ans=0.1 2023-09-30 10:42:08,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-30 10:42:09,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:42:10,434 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.25 vs. limit=15.0 2023-09-30 10:42:11,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:42:11,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:42:13,099 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 1.882e+02 2.052e+02 2.299e+02 3.277e+02, threshold=4.104e+02, percent-clipped=0.0 2023-09-30 10:42:13,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-30 10:42:13,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 10:42:16,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-30 10:42:16,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:42:20,000 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:42:21,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-30 10:42:21,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:42:23,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:42:24,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:42:26,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-30 10:42:29,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-30 10:42:29,275 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:42:29,287 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:42:35,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:42:36,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:42:36,867 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 10:42:38,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:42:39,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:42:39,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:42:42,093 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 10:42:42,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:42:44,326 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:42:48,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:42:48,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-30 10:42:56,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 10:42:56,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:42:57,508 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.90 vs. limit=22.5 2023-09-30 10:42:58,259 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=686466.6666666666, ans=0.125 2023-09-30 10:43:01,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:43:01,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:43:05,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:43:08,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:43:08,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:43:08,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 10:43:08,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:43:10,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:43:10,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:43:11,745 INFO [train.py:1039] (2/4) Epoch 20, batch 2050, loss[loss=0.1682, simple_loss=0.2484, pruned_loss=0.04405, over 24660.00 frames. ], tot_loss[loss=0.177, simple_loss=0.253, pruned_loss=0.05054, over 4703495.71 frames. ], batch size: 65, lr: 5.16e-03, grad_scale: 32.0 2023-09-30 10:43:12,276 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=686533.3333333334, ans=0.1 2023-09-30 10:43:13,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:43:14,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:43:20,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:43:21,386 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:43:23,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:43:24,485 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:43:25,908 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:43:27,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-30 10:43:27,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:43:31,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:43:31,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-30 10:43:39,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:43:40,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:43:43,665 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-30 10:43:45,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:43:47,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-30 10:43:48,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:43:51,160 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.01 vs. limit=22.5 2023-09-30 10:43:51,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:43:55,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:43:57,564 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-30 10:43:57,652 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:43:59,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:44:00,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:44:00,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:44:05,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:44:06,603 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:44:10,142 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-30 10:44:10,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:44:13,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 10:44:18,280 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:44:19,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-30 10:44:25,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:44:26,950 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:44:29,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:44:32,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-30 10:44:34,091 INFO [train.py:1039] (2/4) Epoch 20, batch 2100, loss[loss=0.1704, simple_loss=0.2585, pruned_loss=0.04119, over 24695.00 frames. ], tot_loss[loss=0.1768, simple_loss=0.2526, pruned_loss=0.05054, over 4709656.21 frames. ], batch size: 68, lr: 5.16e-03, grad_scale: 16.0 2023-09-30 10:44:34,594 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:44:35,792 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-30 10:44:35,793 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:44:35,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:44:37,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:44:37,535 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:44:37,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-30 10:44:37,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-30 10:44:39,178 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 10:44:43,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:44:44,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:44:47,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:44:47,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:44:47,624 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-30 10:44:47,835 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=686866.6666666666, ans=10.0 2023-09-30 10:44:49,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 10:44:49,171 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-30 10:44:49,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-30 10:44:52,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:44:52,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:44:52,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-30 10:44:52,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 10:44:58,445 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.615e+02 2.064e+02 2.437e+02 3.000e+02 4.850e+02, threshold=4.873e+02, percent-clipped=5.0 2023-09-30 10:44:58,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-30 10:44:58,697 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:45:02,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:45:03,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:45:07,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:45:07,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-30 10:45:09,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:45:09,232 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 10:45:12,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-30 10:45:13,700 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:45:13,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-30 10:45:13,777 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-30 10:45:13,838 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-30 10:45:16,794 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.90 vs. limit=12.0 2023-09-30 10:45:17,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:45:19,096 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:45:22,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 10:45:25,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 10:45:25,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:45:26,863 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:45:26,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-30 10:45:26,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:45:26,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:45:28,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:45:28,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-30 10:45:30,044 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-30 10:45:30,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-30 10:45:33,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:45:37,881 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:45:39,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-30 10:45:44,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:45:47,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:45:47,881 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=687133.3333333334, ans=0.125 2023-09-30 10:45:49,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:45:49,026 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:45:49,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-30 10:45:50,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 10:45:52,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:45:52,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-30 10:45:52,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:45:52,897 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:45:54,482 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=687133.3333333334, ans=0.125 2023-09-30 10:45:55,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-30 10:45:57,363 INFO [train.py:1039] (2/4) Epoch 20, batch 2150, loss[loss=0.1752, simple_loss=0.2582, pruned_loss=0.04607, over 24325.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2512, pruned_loss=0.04976, over 4715772.79 frames. ], batch size: 77, lr: 5.16e-03, grad_scale: 16.0 2023-09-30 10:45:57,463 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-30 10:45:57,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:45:59,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:45:59,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:45:59,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:45:59,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:46:05,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 10:46:06,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:46:07,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:46:09,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:46:09,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:46:09,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:46:13,877 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:46:14,026 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.max_positive, batch_count=687266.6666666666, ans=0.95 2023-09-30 10:46:15,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:46:15,307 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:46:21,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:46:21,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-30 10:46:26,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:46:27,355 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.80 vs. limit=15.0 2023-09-30 10:46:28,016 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:46:28,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:46:28,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:46:29,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:46:29,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-30 10:46:31,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:46:31,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:46:31,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:46:32,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-30 10:46:35,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:46:35,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:46:35,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:46:37,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 10:46:38,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:46:42,537 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:46:42,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-30 10:46:45,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:46:45,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-30 10:46:47,723 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:46:49,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:46:51,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:46:51,642 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=687400.0, ans=0.2 2023-09-30 10:46:52,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:46:54,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 10:46:54,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:46:56,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:46:56,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-30 10:46:58,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-30 10:46:58,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:47:00,935 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-30 10:47:01,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:47:01,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:47:02,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-30 10:47:02,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:47:02,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-30 10:47:02,562 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-30 10:47:02,562 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-30 10:47:02,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-30 10:47:04,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:47:04,284 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:47:04,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:47:05,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:47:07,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 10:47:08,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:47:08,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:47:09,698 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.51 vs. limit=22.5 2023-09-30 10:47:13,838 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.31 vs. limit=15.0 2023-09-30 10:47:18,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:47:18,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-30 10:47:20,062 INFO [train.py:1039] (2/4) Epoch 20, batch 2200, loss[loss=0.1729, simple_loss=0.2517, pruned_loss=0.04702, over 23328.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.2509, pruned_loss=0.04946, over 4715437.97 frames. ], batch size: 119, lr: 5.16e-03, grad_scale: 16.0 2023-09-30 10:47:24,536 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:47:28,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:47:30,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:47:30,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:47:32,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-30 10:47:35,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:47:35,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:47:35,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-30 10:47:40,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-30 10:47:42,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 10:47:45,301 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.463e+02 1.904e+02 2.106e+02 2.500e+02 4.276e+02, threshold=4.212e+02, percent-clipped=0.0 2023-09-30 10:47:45,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-30 10:47:45,798 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=687600.0, ans=0.1 2023-09-30 10:47:48,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:47:50,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:47:50,550 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:47:53,661 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:47:53,706 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-30 10:47:59,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-30 10:47:59,771 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=687666.6666666666, ans=0.1 2023-09-30 10:48:01,028 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:48:01,142 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-30 10:48:04,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-30 10:48:06,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:48:09,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:48:11,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:48:12,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-30 10:48:14,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:48:15,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-30 10:48:18,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:48:18,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-30 10:48:18,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:48:21,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:48:23,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:48:23,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:48:23,183 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:48:25,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-30 10:48:25,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:48:28,225 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 10:48:31,850 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 10:48:31,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:48:35,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-30 10:48:35,652 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-30 10:48:36,477 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.54 vs. limit=15.0 2023-09-30 10:48:38,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:48:38,882 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-30 10:48:40,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-30 10:48:41,831 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-30 10:48:42,479 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.92 vs. limit=12.0 2023-09-30 10:48:43,246 INFO [train.py:1039] (2/4) Epoch 20, batch 2250, loss[loss=0.1607, simple_loss=0.2418, pruned_loss=0.03983, over 24322.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.2522, pruned_loss=0.04984, over 4716047.38 frames. ], batch size: 61, lr: 5.16e-03, grad_scale: 16.0 2023-09-30 10:48:44,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:48:44,093 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-30 10:48:44,895 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.83 vs. limit=15.0 2023-09-30 10:48:45,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:48:47,159 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-30 10:48:48,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:48:50,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:48:56,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:48:57,944 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-30 10:49:01,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:49:01,769 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=687933.3333333334, ans=0.125 2023-09-30 10:49:03,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:49:03,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:49:07,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-30 10:49:07,202 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:49:07,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:49:07,511 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=687933.3333333334, ans=0.125 2023-09-30 10:49:07,572 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=687933.3333333334, ans=0.0 2023-09-30 10:49:08,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-30 10:49:09,064 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:49:10,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:49:11,980 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:49:14,512 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=687933.3333333334, ans=0.0 2023-09-30 10:49:17,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:49:18,069 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.62 vs. limit=15.0 2023-09-30 10:49:18,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 10:49:18,960 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-30 10:49:20,674 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=688000.0, ans=0.125 2023-09-30 10:49:21,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-30 10:49:21,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:49:23,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:49:25,423 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=688000.0, ans=0.0 2023-09-30 10:49:28,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:49:28,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:49:31,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:49:31,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:49:35,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:49:36,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:49:37,331 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=688066.6666666666, ans=0.125 2023-09-30 10:49:38,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:49:42,241 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-30 10:49:46,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 10:49:47,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-30 10:49:47,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:49:52,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 10:49:55,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-30 10:49:55,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-30 10:49:55,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:49:56,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:49:59,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-30 10:50:01,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:50:03,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:50:06,375 INFO [train.py:1039] (2/4) Epoch 20, batch 2300, loss[loss=0.1631, simple_loss=0.2383, pruned_loss=0.04393, over 23194.00 frames. ], tot_loss[loss=0.1768, simple_loss=0.2531, pruned_loss=0.05024, over 4715519.64 frames. ], batch size: 105, lr: 5.16e-03, grad_scale: 16.0 2023-09-30 10:50:09,249 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.93 vs. limit=15.0 2023-09-30 10:50:10,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:50:10,101 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:50:11,720 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-30 10:50:14,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:50:21,469 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:50:21,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-30 10:50:23,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:50:23,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:50:23,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-30 10:50:25,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:50:26,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:50:28,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:50:30,983 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 1.830e+02 1.981e+02 2.237e+02 3.602e+02, threshold=3.962e+02, percent-clipped=0.0 2023-09-30 10:50:32,711 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:50:34,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-30 10:50:37,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:50:42,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:50:43,012 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:50:46,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:50:47,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:50:52,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:50:54,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 10:50:54,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:50:54,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-30 10:50:57,843 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 10:50:57,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:51:00,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:51:00,134 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:51:01,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:51:04,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 10:51:04,587 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-30 10:51:04,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-30 10:51:04,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:51:04,715 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:51:06,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-30 10:51:14,313 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:51:16,865 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=688466.6666666666, ans=0.125 2023-09-30 10:51:18,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:51:18,624 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=688466.6666666666, ans=0.1 2023-09-30 10:51:22,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:51:22,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:51:22,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-30 10:51:25,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 10:51:25,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:51:27,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 10:51:28,886 INFO [train.py:1039] (2/4) Epoch 20, batch 2350, loss[loss=0.1597, simple_loss=0.2328, pruned_loss=0.04327, over 24442.00 frames. ], tot_loss[loss=0.1771, simple_loss=0.2533, pruned_loss=0.05047, over 4720101.13 frames. ], batch size: 58, lr: 5.15e-03, grad_scale: 16.0 2023-09-30 10:51:28,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-30 10:51:35,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:51:36,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-30 10:51:39,640 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=688533.3333333334, ans=0.0 2023-09-30 10:51:43,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-30 10:51:46,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:51:49,925 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:51:49,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:51:49,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:51:50,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:51:53,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-30 10:51:56,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:51:57,147 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=12.48 vs. limit=15.0 2023-09-30 10:52:01,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-30 10:52:02,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:52:04,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:52:04,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:52:08,065 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:52:11,536 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-30 10:52:11,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:52:13,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:52:13,099 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:52:14,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:52:17,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:52:19,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-30 10:52:20,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:52:24,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:52:24,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:52:26,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-30 10:52:26,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:52:26,901 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=688733.3333333334, ans=0.0 2023-09-30 10:52:29,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-30 10:52:29,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-30 10:52:34,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-30 10:52:38,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-30 10:52:38,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:52:38,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-30 10:52:38,978 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-30 10:52:41,013 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-30 10:52:42,746 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=688800.0, ans=0.0 2023-09-30 10:52:44,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-30 10:52:46,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:52:50,549 INFO [train.py:1039] (2/4) Epoch 20, batch 2400, loss[loss=0.1758, simple_loss=0.2353, pruned_loss=0.05817, over 23625.00 frames. ], tot_loss[loss=0.177, simple_loss=0.2529, pruned_loss=0.0506, over 4715427.20 frames. ], batch size: 256, lr: 5.15e-03, grad_scale: 32.0 2023-09-30 10:52:52,127 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:52:55,332 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:52:57,037 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:52:59,009 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-30 10:52:59,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-30 10:53:08,719 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 10:53:08,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:53:09,153 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=688933.3333333334, ans=0.0 2023-09-30 10:53:10,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-30 10:53:11,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:53:11,903 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:53:13,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-30 10:53:14,618 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.889e+02 2.089e+02 2.400e+02 4.035e+02, threshold=4.178e+02, percent-clipped=1.0 2023-09-30 10:53:18,633 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:53:22,497 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-30 10:53:27,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-30 10:53:31,803 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-30 10:53:35,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:53:35,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:53:40,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:53:42,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-30 10:53:42,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:53:50,212 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:53:52,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:53:56,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:53:56,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:53:56,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-30 10:53:57,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:53:57,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:53:57,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:53:57,904 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 10:54:01,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:54:02,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:54:02,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-30 10:54:04,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-30 10:54:07,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:54:07,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:54:09,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-30 10:54:09,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-30 10:54:09,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-30 10:54:09,136 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-30 10:54:10,654 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-30 10:54:10,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:54:12,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:54:12,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:54:13,143 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=29.77 vs. limit=22.5 2023-09-30 10:54:14,379 INFO [train.py:1039] (2/4) Epoch 20, batch 2450, loss[loss=0.1728, simple_loss=0.2541, pruned_loss=0.04575, over 16121.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2507, pruned_loss=0.05008, over 4708139.33 frames. ], batch size: 34, lr: 5.15e-03, grad_scale: 32.0 2023-09-30 10:54:14,580 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-30 10:54:15,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:54:15,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-30 10:54:20,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-30 10:54:20,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:54:24,364 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:54:24,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:54:26,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-30 10:54:28,762 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.11 vs. limit=15.0 2023-09-30 10:54:31,227 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=689266.6666666666, ans=0.09899494936611666 2023-09-30 10:54:32,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:54:32,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:54:35,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:54:35,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:54:35,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:54:35,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-30 10:54:41,043 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=689266.6666666666, ans=0.125 2023-09-30 10:54:42,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:54:45,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 10:54:45,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:54:50,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-30 10:54:50,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:54:52,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:54:52,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:54:53,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-30 10:54:55,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:55:02,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:55:05,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:55:05,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:55:06,845 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:55:06,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:55:07,185 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=689400.0, ans=0.0 2023-09-30 10:55:08,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:55:08,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-30 10:55:10,263 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=689400.0, ans=0.2 2023-09-30 10:55:13,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:55:13,642 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:55:16,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:55:16,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:55:20,074 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=689466.6666666666, ans=0.1 2023-09-30 10:55:22,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:55:22,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-30 10:55:25,069 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:55:25,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:55:25,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-30 10:55:26,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:55:28,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:55:30,940 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.53 vs. limit=15.0 2023-09-30 10:55:31,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:55:33,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:55:33,210 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:55:34,856 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=689533.3333333334, ans=0.125 2023-09-30 10:55:36,684 INFO [train.py:1039] (2/4) Epoch 20, batch 2500, loss[loss=0.1679, simple_loss=0.2362, pruned_loss=0.04982, over 23805.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.2504, pruned_loss=0.04962, over 4717920.08 frames. ], batch size: 179, lr: 5.15e-03, grad_scale: 32.0 2023-09-30 10:55:36,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-30 10:55:38,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:55:40,845 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=689533.3333333334, ans=0.125 2023-09-30 10:55:44,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:55:46,376 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=689533.3333333334, ans=0.125 2023-09-30 10:55:52,368 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=689600.0, ans=0.04949747468305833 2023-09-30 10:55:55,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 10:55:55,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:55:56,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:55:56,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-30 10:56:01,700 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.945e+02 2.213e+02 2.493e+02 3.500e+02, threshold=4.426e+02, percent-clipped=0.0 2023-09-30 10:56:03,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 10:56:03,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:56:05,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-30 10:56:05,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 10:56:05,848 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-30 10:56:07,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:56:07,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:56:08,628 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-30 10:56:08,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:56:10,266 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-30 10:56:10,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:56:15,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:56:16,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:56:20,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 10:56:20,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-30 10:56:22,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:56:22,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:56:26,910 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:56:31,982 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:56:35,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:56:40,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-30 10:56:40,641 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=689733.3333333334, ans=0.125 2023-09-30 10:56:43,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-30 10:56:43,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:56:43,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-30 10:56:46,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:56:46,920 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 10:56:48,436 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-30 10:56:48,437 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-30 10:56:48,445 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-30 10:56:53,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:56:55,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-30 10:56:55,185 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-30 10:56:56,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:56:56,743 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-30 10:56:56,968 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=689800.0, ans=0.1 2023-09-30 10:56:58,560 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=689866.6666666666, ans=0.0 2023-09-30 10:56:59,586 INFO [train.py:1039] (2/4) Epoch 20, batch 2550, loss[loss=0.1832, simple_loss=0.2567, pruned_loss=0.05489, over 23490.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2505, pruned_loss=0.05001, over 4717002.74 frames. ], batch size: 93, lr: 5.15e-03, grad_scale: 16.0 2023-09-30 10:56:59,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-30 10:57:03,409 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.88 vs. limit=22.5 2023-09-30 10:57:04,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:57:05,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:57:07,561 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:57:10,497 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:57:12,017 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-30 10:57:12,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-30 10:57:15,366 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-30 10:57:16,213 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-30 10:57:19,084 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:57:22,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:57:22,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 10:57:23,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 10:57:24,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:57:24,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:57:27,160 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:57:27,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-30 10:57:27,331 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=689933.3333333334, ans=0.0 2023-09-30 10:57:28,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-30 10:57:28,595 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:57:28,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-30 10:57:43,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:57:43,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=690000.0, ans=0.1 2023-09-30 10:57:46,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:57:48,383 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:57:48,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:57:49,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 10:57:55,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:57:58,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 10:57:58,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:57:58,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:58:00,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-30 10:58:00,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-30 10:58:05,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:58:05,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:58:10,389 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=690133.3333333334, ans=0.125 2023-09-30 10:58:11,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:58:11,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-30 10:58:11,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:58:11,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:58:13,198 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:58:14,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 10:58:15,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:58:17,424 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.05 vs. limit=10.0 2023-09-30 10:58:20,524 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=690200.0, ans=0.1 2023-09-30 10:58:21,613 INFO [train.py:1039] (2/4) Epoch 20, batch 2600, loss[loss=0.1814, simple_loss=0.2496, pruned_loss=0.05665, over 23746.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2512, pruned_loss=0.05006, over 4725763.51 frames. ], batch size: 179, lr: 5.15e-03, grad_scale: 8.0 2023-09-30 10:58:23,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:58:26,400 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:58:28,779 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-30 10:58:30,703 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=690200.0, ans=0.1 2023-09-30 10:58:31,776 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-30 10:58:31,801 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 10:58:31,862 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-30 10:58:33,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-30 10:58:33,871 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-30 10:58:38,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:58:38,246 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-30 10:58:40,497 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-30 10:58:42,028 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-30 10:58:43,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:58:45,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-30 10:58:46,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-30 10:58:46,792 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-30 10:58:48,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-30 10:58:49,610 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.499e+02 1.870e+02 2.130e+02 2.382e+02 3.027e+02, threshold=4.260e+02, percent-clipped=0.0 2023-09-30 10:58:49,995 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=690266.6666666666, ans=0.2 2023-09-30 10:58:51,241 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-30 10:58:51,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-30 10:58:53,926 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=690333.3333333334, ans=0.1 2023-09-30 10:58:54,671 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.07 vs. limit=22.5 2023-09-30 10:58:58,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:58:59,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:58:59,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:58:59,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-30 10:59:03,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:59:05,218 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=690333.3333333334, ans=0.0 2023-09-30 10:59:09,243 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-30 10:59:09,981 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.46 vs. limit=6.0 2023-09-30 10:59:13,144 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=690400.0, ans=0.125 2023-09-30 10:59:13,256 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=690400.0, ans=0.125 2023-09-30 10:59:15,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:59:15,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:59:16,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-30 10:59:16,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:59:16,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:59:18,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-30 10:59:21,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-30 10:59:22,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:59:24,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:59:27,913 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-30 10:59:29,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:59:29,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:59:34,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:59:34,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:59:34,871 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-30 10:59:36,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:59:39,331 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:59:39,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:59:43,976 INFO [train.py:1039] (2/4) Epoch 20, batch 2650, loss[loss=0.1869, simple_loss=0.2672, pruned_loss=0.05331, over 24456.00 frames. ], tot_loss[loss=0.1765, simple_loss=0.2521, pruned_loss=0.05049, over 4722426.27 frames. ], batch size: 66, lr: 5.15e-03, grad_scale: 8.0 2023-09-30 10:59:44,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-30 10:59:45,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:59:49,474 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 10:59:54,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-30 10:59:54,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:59:54,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 10:59:57,101 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-30 10:59:57,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:59:59,524 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.09 vs. limit=15.0 2023-09-30 11:00:00,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:00:01,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 11:00:02,311 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=690600.0, ans=0.125 2023-09-30 11:00:04,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:00:06,256 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:00:06,537 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=690600.0, ans=0.0 2023-09-30 11:00:07,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-30 11:00:07,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:00:07,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:00:10,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-30 11:00:11,755 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-30 11:00:16,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:00:19,490 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-30 11:00:19,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:00:21,024 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-30 11:00:22,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:00:22,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:00:23,001 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:00:25,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:00:28,530 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=690666.6666666666, ans=0.07 2023-09-30 11:00:29,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-30 11:00:29,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-30 11:00:33,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-30 11:00:36,682 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-30 11:00:38,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:00:38,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:00:38,228 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-30 11:00:39,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:00:39,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:00:41,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:00:43,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:00:44,958 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:00:46,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-30 11:00:47,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:00:49,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:00:49,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:00:49,587 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=690800.0, ans=0.0 2023-09-30 11:00:52,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:00:52,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:00:52,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-30 11:00:55,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:00:57,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:00:57,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:00:57,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-30 11:01:01,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:01:03,126 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:01:05,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:01:06,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:01:06,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-30 11:01:08,187 INFO [train.py:1039] (2/4) Epoch 20, batch 2700, loss[loss=0.1792, simple_loss=0.2517, pruned_loss=0.05333, over 24519.00 frames. ], tot_loss[loss=0.1786, simple_loss=0.2542, pruned_loss=0.05145, over 4698962.48 frames. ], batch size: 60, lr: 5.15e-03, grad_scale: 8.0 2023-09-30 11:01:08,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:01:08,490 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=690866.6666666666, ans=0.125 2023-09-30 11:01:11,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:01:11,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-30 11:01:12,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:01:14,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 11:01:16,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:01:16,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:01:16,826 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:01:17,112 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=690866.6666666666, ans=0.125 2023-09-30 11:01:19,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:01:19,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:01:19,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:01:19,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-30 11:01:19,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-30 11:01:21,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:01:22,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-30 11:01:24,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 11:01:24,252 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:01:28,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:01:30,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-30 11:01:30,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:01:35,955 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.889e+02 2.106e+02 2.437e+02 3.198e+02, threshold=4.211e+02, percent-clipped=0.0 2023-09-30 11:01:36,336 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.min_positive, batch_count=690933.3333333334, ans=0.05 2023-09-30 11:01:37,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:01:37,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:01:45,947 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:01:45,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:01:45,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:01:46,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:01:47,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:01:51,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:01:51,407 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-30 11:01:51,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:01:54,934 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer_ff3.min_abs, batch_count=691000.0, ans=0.2 2023-09-30 11:01:56,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:01:56,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:02:03,121 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=8.74 vs. limit=12.0 2023-09-30 11:02:04,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:02:04,484 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:02:08,375 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 11:02:08,378 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:02:10,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:02:12,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:02:12,433 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=691066.6666666666, ans=0.2 2023-09-30 11:02:13,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:02:15,067 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:02:16,559 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:02:16,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:02:19,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:02:21,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:02:21,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:02:24,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-30 11:02:26,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:02:28,036 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:02:28,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-30 11:02:29,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-30 11:02:29,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:02:30,961 INFO [train.py:1039] (2/4) Epoch 20, batch 2750, loss[loss=0.1838, simple_loss=0.2429, pruned_loss=0.06237, over 23878.00 frames. ], tot_loss[loss=0.178, simple_loss=0.2536, pruned_loss=0.05122, over 4703624.54 frames. ], batch size: 212, lr: 5.14e-03, grad_scale: 8.0 2023-09-30 11:02:34,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:02:34,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:02:37,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:02:39,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-30 11:02:39,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:02:42,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:02:42,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 11:02:44,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:02:44,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:02:44,154 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-30 11:02:44,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:02:44,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:02:50,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-30 11:02:53,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:02:54,048 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=691266.6666666666, ans=0.0 2023-09-30 11:02:55,189 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:02:55,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:02:55,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-30 11:02:57,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:02:58,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:03:00,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:03:00,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:03:03,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 11:03:03,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 11:03:04,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:03:05,077 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=691333.3333333334, ans=0.125 2023-09-30 11:03:05,757 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.26 vs. limit=15.0 2023-09-30 11:03:06,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:03:06,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 11:03:09,844 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.min_positive, batch_count=691333.3333333334, ans=0.05 2023-09-30 11:03:13,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:03:14,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 11:03:14,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:03:16,650 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=691333.3333333334, ans=0.125 2023-09-30 11:03:20,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:03:20,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-30 11:03:20,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:03:22,718 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:03:29,182 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:03:29,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:03:29,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-30 11:03:34,931 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.71 vs. limit=12.0 2023-09-30 11:03:35,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:03:37,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-30 11:03:42,248 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-30 11:03:45,671 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:03:45,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-30 11:03:47,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:03:48,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:03:48,889 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-30 11:03:49,198 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=691466.6666666666, ans=0.125 2023-09-30 11:03:50,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:03:51,181 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=691533.3333333334, ans=0.09899494936611666 2023-09-30 11:03:51,220 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=691533.3333333334, ans=0.07 2023-09-30 11:03:52,290 INFO [train.py:1039] (2/4) Epoch 20, batch 2800, loss[loss=0.1724, simple_loss=0.2508, pruned_loss=0.04702, over 24325.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.2522, pruned_loss=0.05059, over 4720784.96 frames. ], batch size: 61, lr: 5.14e-03, grad_scale: 16.0 2023-09-30 11:03:52,722 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=691533.3333333334, ans=0.1 2023-09-30 11:03:53,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-30 11:03:53,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:03:53,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:03:55,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-30 11:03:55,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:03:57,681 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:03:59,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:03:59,330 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-30 11:03:59,331 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-30 11:04:01,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:04:04,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 11:04:04,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:04:07,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:04:10,446 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-30 11:04:10,803 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=691600.0, ans=0.2 2023-09-30 11:04:12,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-30 11:04:13,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-30 11:04:13,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:04:15,210 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:04:15,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:04:20,057 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 1.934e+02 2.198e+02 2.522e+02 3.773e+02, threshold=4.395e+02, percent-clipped=0.0 2023-09-30 11:04:20,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:04:21,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:04:21,682 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-30 11:04:21,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:04:30,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:04:32,434 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:04:32,811 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=691666.6666666666, ans=0.125 2023-09-30 11:04:34,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:04:34,187 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=691666.6666666666, ans=0.2 2023-09-30 11:04:35,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:04:36,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:04:43,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:04:43,837 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-30 11:04:45,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:04:45,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:04:45,626 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:04:49,547 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:04:50,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:04:55,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:04:56,615 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.78 vs. limit=6.0 2023-09-30 11:04:57,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:04:57,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:04:57,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 11:04:58,628 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 11:05:00,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:05:00,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:05:00,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-30 11:05:00,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:05:02,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:05:02,371 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:05:02,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-30 11:05:04,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:05:04,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:05:06,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:05:07,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-30 11:05:09,643 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=691800.0, ans=0.125 2023-09-30 11:05:10,893 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=691800.0, ans=0.1 2023-09-30 11:05:15,446 INFO [train.py:1039] (2/4) Epoch 20, batch 2850, loss[loss=0.1891, simple_loss=0.275, pruned_loss=0.05159, over 24568.00 frames. ], tot_loss[loss=0.1761, simple_loss=0.2513, pruned_loss=0.05048, over 4709381.19 frames. ], batch size: 71, lr: 5.14e-03, grad_scale: 16.0 2023-09-30 11:05:15,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:05:15,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 11:05:16,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:05:17,276 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=691866.6666666666, ans=0.125 2023-09-30 11:05:20,028 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:05:23,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:05:24,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:05:25,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:05:28,393 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:05:29,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:05:31,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-30 11:05:31,595 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-30 11:05:38,512 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-30 11:05:38,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:05:40,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-30 11:05:40,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:05:43,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-30 11:05:45,113 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-30 11:05:46,711 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:05:53,528 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=692000.0, ans=0.1 2023-09-30 11:05:55,128 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer_ff2.min_abs, batch_count=692000.0, ans=0.1 2023-09-30 11:06:00,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:06:01,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:06:01,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:06:01,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 11:06:03,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 11:06:03,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:06:06,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:06:06,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-30 11:06:09,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-30 11:06:09,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:06:09,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:06:11,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:06:12,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:06:12,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:06:16,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:06:18,445 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:06:19,987 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:06:21,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:06:21,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:06:24,067 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=692133.3333333334, ans=0.0 2023-09-30 11:06:25,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:06:25,550 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=692133.3333333334, ans=0.125 2023-09-30 11:06:28,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:06:29,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-30 11:06:30,017 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-30 11:06:33,480 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 11:06:34,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:06:34,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-30 11:06:35,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:06:35,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:06:36,500 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:06:36,532 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:06:36,533 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-30 11:06:36,622 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-30 11:06:36,627 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:06:37,966 INFO [train.py:1039] (2/4) Epoch 20, batch 2900, loss[loss=0.1853, simple_loss=0.2558, pruned_loss=0.05735, over 23700.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.2515, pruned_loss=0.05086, over 4707276.32 frames. ], batch size: 232, lr: 5.14e-03, grad_scale: 16.0 2023-09-30 11:06:38,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:06:44,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-30 11:06:44,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:06:44,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:06:45,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-30 11:06:49,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:06:50,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-30 11:06:50,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-30 11:06:52,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:06:52,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-30 11:06:55,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:06:55,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:06:58,380 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:06:59,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:07:02,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-30 11:07:04,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-30 11:07:05,483 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.493e+02 1.792e+02 1.963e+02 2.139e+02 2.917e+02, threshold=3.926e+02, percent-clipped=0.0 2023-09-30 11:07:05,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-30 11:07:07,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:07:09,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-30 11:07:11,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-30 11:07:14,341 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:07:14,346 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-30 11:07:14,386 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:07:16,049 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:07:16,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-30 11:07:19,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:07:20,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:07:24,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:07:29,625 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:07:31,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-30 11:07:33,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-30 11:07:33,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:07:33,541 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=692400.0, ans=0.125 2023-09-30 11:07:36,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:07:39,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-30 11:07:40,964 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:07:41,294 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=692400.0, ans=0.1 2023-09-30 11:07:46,305 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:07:55,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:07:55,401 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-30 11:07:57,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-30 11:08:00,654 INFO [train.py:1039] (2/4) Epoch 20, batch 2950, loss[loss=0.188, simple_loss=0.2538, pruned_loss=0.06105, over 23384.00 frames. ], tot_loss[loss=0.1773, simple_loss=0.2526, pruned_loss=0.05103, over 4720095.37 frames. ], batch size: 285, lr: 5.14e-03, grad_scale: 16.0 2023-09-30 11:08:00,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:08:00,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-30 11:08:02,293 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:08:02,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-30 11:08:03,185 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.46 vs. limit=22.5 2023-09-30 11:08:03,316 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.71 vs. limit=15.0 2023-09-30 11:08:08,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:08:09,755 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-30 11:08:11,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:08:11,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:08:14,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:08:14,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:08:15,663 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-30 11:08:15,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-30 11:08:17,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 11:08:17,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:08:20,340 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.02 vs. limit=22.5 2023-09-30 11:08:21,486 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=692600.0, ans=0.125 2023-09-30 11:08:22,633 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:08:24,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:08:25,921 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=692600.0, ans=0.0 2023-09-30 11:08:27,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:08:28,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:08:33,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:08:33,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:08:35,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:08:35,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:08:35,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:08:38,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-30 11:08:42,576 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=692666.6666666666, ans=0.0 2023-09-30 11:08:43,880 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-30 11:08:43,934 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-30 11:08:45,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 11:08:46,072 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.03 vs. limit=15.0 2023-09-30 11:08:46,933 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-30 11:08:48,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-30 11:08:48,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:08:50,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:08:50,028 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-30 11:08:50,035 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-30 11:08:53,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-30 11:08:55,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:08:55,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:08:57,173 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=692733.3333333334, ans=0.1 2023-09-30 11:08:58,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:08:59,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:09:01,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:09:01,238 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-30 11:09:01,300 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:09:01,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-30 11:09:08,792 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:09:10,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:09:10,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-30 11:09:10,992 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:09:13,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-30 11:09:16,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:09:16,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:09:16,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 11:09:18,580 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:09:19,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 11:09:21,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:09:22,916 INFO [train.py:1039] (2/4) Epoch 20, batch 3000, loss[loss=0.1858, simple_loss=0.2619, pruned_loss=0.05487, over 23998.00 frames. ], tot_loss[loss=0.1782, simple_loss=0.2536, pruned_loss=0.05142, over 4715050.65 frames. ], batch size: 80, lr: 5.14e-03, grad_scale: 16.0 2023-09-30 11:09:22,917 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-30 11:09:35,861 INFO [zipformer.py:1853] (2/4) name=encoder.encoders.5.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([4.3606, 3.7829, 5.2352, 4.9778], device='cuda:2') 2023-09-30 11:09:37,402 INFO [train.py:1071] (2/4) Epoch 20, validation: loss=0.3156, simple_loss=0.2725, pruned_loss=0.1794, over 1125622.00 frames. 2023-09-30 11:09:37,404 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21076MB 2023-09-30 11:09:37,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:09:37,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-30 11:09:37,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:09:39,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:09:40,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:09:40,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:09:40,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-30 11:09:41,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:09:43,321 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:09:45,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-30 11:09:48,526 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-30 11:09:49,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-30 11:09:51,502 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:09:52,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:09:54,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-30 11:09:54,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:09:55,140 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.22 vs. limit=15.0 2023-09-30 11:10:00,491 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 11:10:05,478 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.484e+02 1.867e+02 2.114e+02 2.609e+02 3.839e+02, threshold=4.228e+02, percent-clipped=0.0 2023-09-30 11:10:10,391 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:10:13,622 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=693000.0, ans=0.125 2023-09-30 11:10:15,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-30 11:10:18,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:10:21,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:10:22,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:10:23,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:10:25,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:10:26,003 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-30 11:10:27,603 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-30 11:10:29,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:10:30,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 11:10:32,253 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 11:10:32,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:10:33,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:10:33,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:10:38,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 11:10:39,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:10:39,053 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:10:40,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:10:42,396 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-30 11:10:42,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:10:42,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:10:42,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:10:44,305 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=693133.3333333334, ans=0.0 2023-09-30 11:10:47,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:10:47,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:10:49,269 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-30 11:10:49,327 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-30 11:10:49,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:10:51,361 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-30 11:10:51,433 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:10:54,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-30 11:10:57,374 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:10:58,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 11:10:58,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-30 11:11:00,423 INFO [train.py:1039] (2/4) Epoch 20, batch 3050, loss[loss=0.1834, simple_loss=0.2664, pruned_loss=0.05021, over 24484.00 frames. ], tot_loss[loss=0.1791, simple_loss=0.2544, pruned_loss=0.05188, over 4709382.59 frames. ], batch size: 69, lr: 5.14e-03, grad_scale: 16.0 2023-09-30 11:11:00,499 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-30 11:11:00,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 11:11:01,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:11:03,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:11:03,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-30 11:11:03,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:11:03,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:11:06,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-30 11:11:08,877 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:11:10,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:11:10,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 11:11:15,013 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:11:18,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-30 11:11:19,831 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=693266.6666666666, ans=0.0 2023-09-30 11:11:23,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-30 11:11:23,379 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-30 11:11:25,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:11:28,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:11:30,693 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=693266.6666666666, ans=0.1 2023-09-30 11:11:31,857 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:11:31,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:11:31,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:11:38,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:11:38,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:11:40,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:11:40,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:11:40,180 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:11:42,270 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:11:45,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:11:45,382 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=693333.3333333334, ans=0.125 2023-09-30 11:11:46,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:11:48,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-30 11:11:48,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:11:48,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:11:52,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:11:53,233 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=693400.0, ans=0.5 2023-09-30 11:11:54,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:11:54,665 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:11:56,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:12:00,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:12:00,732 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:12:09,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:12:10,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:12:10,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:12:12,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:12:12,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 11:12:12,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:12:13,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-30 11:12:15,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:12:15,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:12:17,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-30 11:12:18,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:12:23,722 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:12:25,031 INFO [train.py:1039] (2/4) Epoch 20, batch 3100, loss[loss=0.1737, simple_loss=0.2605, pruned_loss=0.04344, over 24553.00 frames. ], tot_loss[loss=0.1795, simple_loss=0.2545, pruned_loss=0.0522, over 4701702.41 frames. ], batch size: 71, lr: 5.14e-03, grad_scale: 8.0 2023-09-30 11:12:26,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:12:28,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 11:12:30,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-30 11:12:31,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-30 11:12:33,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-30 11:12:35,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:12:39,070 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:12:39,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:12:41,559 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=693600.0, ans=0.125 2023-09-30 11:12:42,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-30 11:12:47,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:12:52,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-30 11:12:53,124 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=693600.0, ans=0.125 2023-09-30 11:12:55,884 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.886e+02 2.147e+02 2.525e+02 3.564e+02, threshold=4.295e+02, percent-clipped=0.0 2023-09-30 11:12:57,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 11:12:57,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:12:57,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:12:59,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:12:59,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-30 11:13:00,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:13:02,387 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-30 11:13:02,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:13:03,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:13:05,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-30 11:13:06,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:13:07,153 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=693666.6666666666, ans=0.0 2023-09-30 11:13:12,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:13:14,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-30 11:13:14,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-30 11:13:15,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:13:17,929 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=693733.3333333334, ans=0.1 2023-09-30 11:13:19,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:13:20,699 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:13:20,728 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:13:20,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:13:22,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-30 11:13:22,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:13:22,684 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=693733.3333333334, ans=0.125 2023-09-30 11:13:24,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:13:24,084 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:13:24,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:13:24,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 11:13:26,663 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=693733.3333333334, ans=0.025 2023-09-30 11:13:29,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:13:29,664 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=693733.3333333334, ans=0.09899494936611666 2023-09-30 11:13:30,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-30 11:13:32,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:13:32,755 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=693800.0, ans=10.0 2023-09-30 11:13:33,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-30 11:13:34,137 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=693800.0, ans=0.1 2023-09-30 11:13:35,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:13:35,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:13:35,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-30 11:13:38,768 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:13:47,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-30 11:13:49,312 INFO [train.py:1039] (2/4) Epoch 20, batch 3150, loss[loss=0.1743, simple_loss=0.2463, pruned_loss=0.05109, over 23308.00 frames. ], tot_loss[loss=0.1782, simple_loss=0.2538, pruned_loss=0.0513, over 4704471.92 frames. ], batch size: 93, lr: 5.13e-03, grad_scale: 8.0 2023-09-30 11:13:51,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:13:51,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:13:52,836 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:13:52,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:13:52,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-30 11:13:54,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:13:55,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-30 11:13:57,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-30 11:13:59,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:14:01,613 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-30 11:14:04,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-30 11:14:06,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:14:06,288 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-30 11:14:07,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-30 11:14:07,991 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=693933.3333333334, ans=0.125 2023-09-30 11:14:09,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-30 11:14:09,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-30 11:14:09,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-30 11:14:09,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:14:09,464 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:14:09,736 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=693933.3333333334, ans=0.125 2023-09-30 11:14:10,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:14:13,888 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-30 11:14:15,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:14:15,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:14:17,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:14:20,825 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-30 11:14:25,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-30 11:14:25,436 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:14:29,142 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-30 11:14:29,945 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten.whitening_limit, batch_count=694000.0, ans=15.0 2023-09-30 11:14:30,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:14:30,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-30 11:14:33,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-30 11:14:34,156 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=694000.0, ans=0.1 2023-09-30 11:14:35,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:14:35,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 11:14:35,978 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 11:14:36,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:14:36,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 11:14:37,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-30 11:14:37,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-30 11:14:39,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-30 11:14:40,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 11:14:40,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:14:42,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:14:42,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:14:42,583 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=694066.6666666666, ans=0.125 2023-09-30 11:14:43,899 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-30 11:14:43,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:14:45,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-30 11:14:47,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:14:47,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-30 11:14:47,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-30 11:14:50,891 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:14:50,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:14:51,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-30 11:14:52,547 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 11:14:52,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:14:56,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:14:57,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:14:57,852 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:15:03,114 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=694133.3333333334, ans=0.125 2023-09-30 11:15:05,826 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 11:15:05,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:15:08,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-30 11:15:12,656 INFO [train.py:1039] (2/4) Epoch 20, batch 3200, loss[loss=0.1784, simple_loss=0.244, pruned_loss=0.05641, over 23286.00 frames. ], tot_loss[loss=0.1772, simple_loss=0.2524, pruned_loss=0.05093, over 4694906.28 frames. ], batch size: 119, lr: 5.13e-03, grad_scale: 16.0 2023-09-30 11:15:12,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:15:12,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-30 11:15:17,954 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.72 vs. limit=15.0 2023-09-30 11:15:18,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:15:20,301 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:15:20,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-30 11:15:22,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:15:25,782 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-30 11:15:30,818 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:15:31,252 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=694266.6666666666, ans=0.09899494936611666 2023-09-30 11:15:37,373 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=694266.6666666666, ans=0.125 2023-09-30 11:15:39,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:15:39,664 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=694266.6666666666, ans=0.125 2023-09-30 11:15:41,025 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=694266.6666666666, ans=0.1 2023-09-30 11:15:42,125 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 1.858e+02 2.103e+02 2.505e+02 4.292e+02, threshold=4.206e+02, percent-clipped=0.0 2023-09-30 11:15:50,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-30 11:15:51,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:15:54,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-30 11:15:54,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 11:15:54,917 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=694333.3333333334, ans=0.0 2023-09-30 11:15:55,025 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=694333.3333333334, ans=0.125 2023-09-30 11:15:58,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:15:58,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 11:16:00,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:16:03,929 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-30 11:16:05,628 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=694400.0, ans=0.125 2023-09-30 11:16:06,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-30 11:16:08,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-30 11:16:13,393 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-30 11:16:15,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:16:22,933 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:16:22,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 11:16:23,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:16:23,094 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-30 11:16:23,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 11:16:26,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:16:27,837 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-30 11:16:29,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-30 11:16:30,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-30 11:16:30,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-30 11:16:33,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:16:34,498 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten.whitening_limit, batch_count=694533.3333333334, ans=15.0 2023-09-30 11:16:35,052 INFO [train.py:1039] (2/4) Epoch 20, batch 3250, loss[loss=0.1682, simple_loss=0.2562, pruned_loss=0.0401, over 24493.00 frames. ], tot_loss[loss=0.1768, simple_loss=0.2519, pruned_loss=0.05081, over 4685061.71 frames. ], batch size: 66, lr: 5.13e-03, grad_scale: 16.0 2023-09-30 11:16:36,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-30 11:16:36,612 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-30 11:16:36,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:16:36,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:16:39,541 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-30 11:16:44,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:16:45,990 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=694533.3333333334, ans=0.0 2023-09-30 11:16:49,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:16:56,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:16:56,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-30 11:16:57,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:16:57,876 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:16:57,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:16:59,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:16:59,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 11:17:02,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:17:02,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:17:02,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:17:04,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:17:04,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:17:04,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:17:10,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:17:11,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:17:14,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:17:14,879 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:17:15,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:17:15,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:17:15,124 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:17:19,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-30 11:17:21,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:17:21,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:17:22,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:17:24,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-30 11:17:31,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:17:39,381 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:17:39,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:17:39,421 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-30 11:17:39,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:17:39,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 11:17:41,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:17:43,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-30 11:17:44,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-30 11:17:45,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:17:47,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:17:48,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:17:48,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-30 11:17:48,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:17:53,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:17:54,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:17:56,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-30 11:17:56,095 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:17:57,608 INFO [train.py:1039] (2/4) Epoch 20, batch 3300, loss[loss=0.1711, simple_loss=0.251, pruned_loss=0.04566, over 24318.00 frames. ], tot_loss[loss=0.1772, simple_loss=0.253, pruned_loss=0.05067, over 4707061.16 frames. ], batch size: 61, lr: 5.13e-03, grad_scale: 8.0 2023-09-30 11:17:57,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:17:57,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-30 11:18:01,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:18:01,433 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-30 11:18:04,375 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-30 11:18:04,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-30 11:18:06,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:18:11,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:18:12,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:18:12,593 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:18:12,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 11:18:14,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 11:18:16,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:18:17,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:18:21,810 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=694933.3333333334, ans=0.1 2023-09-30 11:18:24,404 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-30 11:18:24,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:18:24,524 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:18:26,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:18:27,579 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-30 11:18:27,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:18:29,016 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.513e+02 1.888e+02 2.046e+02 2.323e+02 3.227e+02, threshold=4.091e+02, percent-clipped=0.0 2023-09-30 11:18:29,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 11:18:30,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:18:30,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:18:30,732 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-30 11:18:37,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:18:37,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-30 11:18:39,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:18:39,013 WARNING [train.py:1197] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-30 11:18:40,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-30 11:18:41,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:18:42,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:18:44,028 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-30 11:18:45,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-30 11:18:45,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:18:47,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-30 11:18:50,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-30 11:18:52,721 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=695066.6666666666, ans=0.5 2023-09-30 11:18:53,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-30 11:18:53,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:18:57,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:18:57,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:18:57,606 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:18:57,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-30 11:19:00,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:19:00,743 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:19:01,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:19:01,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:19:02,204 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=695133.3333333334, ans=0.1 2023-09-30 11:19:03,612 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-30 11:19:03,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-30 11:19:07,018 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=695133.3333333334, ans=0.125 2023-09-30 11:19:08,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-30 11:19:08,109 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:19:08,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:19:09,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:19:09,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:19:11,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 11:19:11,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:19:12,002 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-30 11:19:12,286 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=695133.3333333334, ans=0.0 2023-09-30 11:19:14,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:19:15,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 11:19:18,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-30 11:19:19,056 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=695200.0, ans=0.0 2023-09-30 11:19:20,112 INFO [train.py:1039] (2/4) Epoch 20, batch 3350, loss[loss=0.159, simple_loss=0.2383, pruned_loss=0.03983, over 24484.00 frames. ], tot_loss[loss=0.1776, simple_loss=0.2538, pruned_loss=0.05066, over 4719543.58 frames. ], batch size: 63, lr: 5.13e-03, grad_scale: 8.0 2023-09-30 11:19:20,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:19:21,778 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:19:23,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:19:23,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-30 11:19:25,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:19:27,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:19:27,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:19:30,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:19:31,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:19:33,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:19:34,369 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=695200.0, ans=0.125 2023-09-30 11:19:35,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:19:37,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:19:38,009 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.22 vs. limit=15.0 2023-09-30 11:19:38,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:19:40,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:19:40,610 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=695266.6666666666, ans=0.125 2023-09-30 11:19:41,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-30 11:19:43,373 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-30 11:19:44,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:19:48,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-30 11:19:48,479 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-30 11:19:48,639 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 11:19:48,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:19:50,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:19:50,958 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:19:52,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-30 11:19:52,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:19:52,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:19:54,353 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=695333.3333333334, ans=0.125 2023-09-30 11:19:55,412 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:19:57,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:19:57,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:19:59,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:20:02,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:20:05,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:20:06,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:20:09,304 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=695400.0, ans=0.125 2023-09-30 11:20:10,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:20:10,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:20:12,108 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:20:12,122 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:20:15,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:20:16,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-30 11:20:16,702 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 11:20:16,745 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-30 11:20:16,803 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:20:18,284 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-30 11:20:18,587 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=695400.0, ans=0.2 2023-09-30 11:20:20,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:20:21,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:20:30,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:20:32,044 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-30 11:20:32,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 11:20:33,619 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-30 11:20:33,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:20:37,013 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=695466.6666666666, ans=0.125 2023-09-30 11:20:39,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:20:43,110 INFO [train.py:1039] (2/4) Epoch 20, batch 3400, loss[loss=0.1962, simple_loss=0.2611, pruned_loss=0.06571, over 22736.00 frames. ], tot_loss[loss=0.1788, simple_loss=0.2549, pruned_loss=0.05133, over 4716053.72 frames. ], batch size: 322, lr: 5.13e-03, grad_scale: 8.0 2023-09-30 11:20:43,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-30 11:20:43,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 11:20:43,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:20:44,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:20:44,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-30 11:20:46,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:20:46,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-30 11:20:47,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:20:47,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:20:49,250 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-30 11:20:49,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:20:50,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-30 11:20:55,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-30 11:20:55,846 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-30 11:20:55,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:21:00,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:21:00,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 11:21:02,472 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:21:04,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-30 11:21:07,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:21:10,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-30 11:21:14,107 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.519e+02 1.831e+02 2.000e+02 2.171e+02 2.770e+02, threshold=4.000e+02, percent-clipped=0.0 2023-09-30 11:21:15,827 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:21:18,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:21:18,901 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:21:20,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-30 11:21:26,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:21:30,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-30 11:21:37,050 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:21:37,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:21:37,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-30 11:21:37,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:21:38,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:21:40,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:21:40,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:21:41,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:21:48,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 11:21:48,393 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:21:53,247 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:21:54,844 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-30 11:21:56,657 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=695800.0, ans=0.0 2023-09-30 11:21:59,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 11:22:04,515 INFO [train.py:1039] (2/4) Epoch 20, batch 3450, loss[loss=0.1741, simple_loss=0.2538, pruned_loss=0.04722, over 24122.00 frames. ], tot_loss[loss=0.179, simple_loss=0.2552, pruned_loss=0.05145, over 4718958.71 frames. ], batch size: 80, lr: 5.13e-03, grad_scale: 8.0 2023-09-30 11:22:04,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-30 11:22:08,032 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=695866.6666666666, ans=0.125 2023-09-30 11:22:08,084 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=695866.6666666666, ans=0.0 2023-09-30 11:22:11,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-30 11:22:11,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:22:13,459 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:22:13,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-30 11:22:13,891 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=695866.6666666666, ans=0.0 2023-09-30 11:22:15,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:22:16,974 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=695866.6666666666, ans=0.125 2023-09-30 11:22:18,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:22:24,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:22:25,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:22:26,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:22:26,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:22:28,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:22:33,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-30 11:22:35,351 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=695933.3333333334, ans=0.1 2023-09-30 11:22:39,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-30 11:22:41,347 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 11:22:41,421 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:22:43,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:22:44,034 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.08 vs. limit=22.5 2023-09-30 11:22:50,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-30 11:22:51,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 11:22:55,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:22:55,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:22:57,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-30 11:22:58,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:22:58,912 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=696066.6666666666, ans=0.1 2023-09-30 11:23:00,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-30 11:23:00,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:23:03,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:23:03,344 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=696066.6666666666, ans=0.125 2023-09-30 11:23:06,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:23:07,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-30 11:23:11,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:23:13,159 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=696133.3333333334, ans=0.125 2023-09-30 11:23:15,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:23:19,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:23:20,893 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:23:25,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:23:25,994 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:23:26,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:23:26,137 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:23:27,419 INFO [train.py:1039] (2/4) Epoch 20, batch 3500, loss[loss=0.1477, simple_loss=0.2255, pruned_loss=0.03497, over 24316.00 frames. ], tot_loss[loss=0.1779, simple_loss=0.254, pruned_loss=0.05086, over 4730209.15 frames. ], batch size: 56, lr: 5.13e-03, grad_scale: 8.0 2023-09-30 11:23:31,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:23:34,350 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:23:34,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-30 11:23:37,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 11:23:40,958 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-30 11:23:44,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:23:44,082 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-30 11:23:48,696 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:23:48,851 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:23:52,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:23:52,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:23:52,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-30 11:23:53,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:23:53,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:23:55,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-30 11:23:58,756 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.890e+02 2.132e+02 2.452e+02 4.334e+02, threshold=4.264e+02, percent-clipped=1.0 2023-09-30 11:23:58,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:23:58,978 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-30 11:24:00,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:24:04,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:24:05,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-30 11:24:05,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:24:07,534 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:24:07,669 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=696333.3333333334, ans=0.125 2023-09-30 11:24:10,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:24:10,598 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:24:11,052 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=696333.3333333334, ans=0.2 2023-09-30 11:24:12,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:24:12,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:24:14,209 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-30 11:24:14,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-30 11:24:15,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-30 11:24:15,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:24:17,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:24:17,907 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:24:18,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:24:18,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 11:24:23,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 11:24:24,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:24:29,186 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.69 vs. limit=6.0 2023-09-30 11:24:30,730 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:24:30,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-30 11:24:32,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-30 11:24:32,111 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:24:35,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:24:35,348 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:24:37,585 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:24:39,176 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=696466.6666666666, ans=0.125 2023-09-30 11:24:41,797 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-30 11:24:43,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:24:43,518 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=696466.6666666666, ans=0.1 2023-09-30 11:24:45,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:24:45,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-30 11:24:46,942 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-30 11:24:48,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:24:49,901 INFO [train.py:1039] (2/4) Epoch 20, batch 3550, loss[loss=0.1795, simple_loss=0.2627, pruned_loss=0.04811, over 24306.00 frames. ], tot_loss[loss=0.1761, simple_loss=0.2523, pruned_loss=0.04997, over 4732670.24 frames. ], batch size: 74, lr: 5.13e-03, grad_scale: 8.0 2023-09-30 11:24:50,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:24:51,596 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:24:51,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:24:56,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:25:00,079 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=696533.3333333334, ans=0.07 2023-09-30 11:25:05,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:25:07,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 11:25:10,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:25:12,400 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:25:14,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:25:14,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:25:14,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 11:25:17,654 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:25:19,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:25:19,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:25:19,584 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-30 11:25:21,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 11:25:27,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:25:27,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:25:28,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:25:28,968 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:25:29,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:25:29,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-30 11:25:29,104 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:25:29,215 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:25:32,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:25:32,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 11:25:36,019 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=696666.6666666666, ans=0.0 2023-09-30 11:25:40,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:25:40,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:25:42,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:25:44,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-30 11:25:44,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:25:45,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-30 11:25:47,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:25:48,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:25:48,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:25:54,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-30 11:25:56,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:25:56,651 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=696800.0, ans=0.1 2023-09-30 11:25:59,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:26:01,029 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-30 11:26:02,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:26:05,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:26:07,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-30 11:26:12,328 INFO [train.py:1039] (2/4) Epoch 20, batch 3600, loss[loss=0.1754, simple_loss=0.2291, pruned_loss=0.06083, over 19425.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2512, pruned_loss=0.04979, over 4722985.84 frames. ], batch size: 388, lr: 5.12e-03, grad_scale: 16.0 2023-09-30 11:26:12,891 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=696866.6666666666, ans=0.1 2023-09-30 11:26:14,019 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-30 11:26:14,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:26:16,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:26:17,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:26:17,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:26:19,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:26:21,201 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=696866.6666666666, ans=0.2 2023-09-30 11:26:22,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:26:24,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:26:26,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:26:26,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:26:26,466 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=696866.6666666666, ans=0.2 2023-09-30 11:26:27,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:26:27,726 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-30 11:26:30,849 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 11:26:32,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:26:34,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:26:37,337 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=696933.3333333334, ans=0.0 2023-09-30 11:26:38,509 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:26:38,854 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:26:40,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:26:40,550 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:26:41,950 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-30 11:26:42,054 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:26:43,317 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.766e+02 1.915e+02 2.241e+02 3.227e+02, threshold=3.831e+02, percent-clipped=0.0 2023-09-30 11:26:43,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:26:44,475 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.10 vs. limit=15.0 2023-09-30 11:26:45,167 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-30 11:26:45,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:26:47,495 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:26:48,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:26:50,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-30 11:26:55,857 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:26:58,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:27:00,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 11:27:00,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-30 11:27:06,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:27:12,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:27:16,497 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:27:23,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-30 11:27:23,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:27:23,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-30 11:27:25,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-30 11:27:26,554 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-30 11:27:28,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:27:28,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:27:31,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-30 11:27:31,157 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:27:31,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:27:31,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:27:31,378 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=697133.3333333334, ans=0.125 2023-09-30 11:27:32,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-30 11:27:33,071 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=697200.0, ans=0.0 2023-09-30 11:27:34,880 INFO [train.py:1039] (2/4) Epoch 20, batch 3650, loss[loss=0.1901, simple_loss=0.2696, pruned_loss=0.0553, over 23925.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2521, pruned_loss=0.04957, over 4730654.61 frames. ], batch size: 86, lr: 5.12e-03, grad_scale: 16.0 2023-09-30 11:27:34,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-30 11:27:37,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:27:38,795 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-30 11:27:39,165 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=697200.0, ans=0.125 2023-09-30 11:27:44,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-30 11:27:47,913 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:27:51,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-30 11:27:53,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-30 11:27:58,351 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:27:58,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:27:58,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 11:28:01,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-30 11:28:02,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:28:02,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-30 11:28:04,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-30 11:28:04,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:28:05,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-30 11:28:07,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 11:28:07,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:28:07,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:28:09,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:28:12,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-30 11:28:13,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-30 11:28:15,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:28:18,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-30 11:28:21,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:28:21,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:28:26,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:28:29,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:28:29,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:28:29,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:28:31,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:28:33,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:28:35,141 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:28:36,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:28:36,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:28:38,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 11:28:39,771 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:28:39,870 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:28:46,619 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-30 11:28:50,518 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:28:50,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:28:51,998 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-30 11:28:52,581 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.14 vs. limit=15.0 2023-09-30 11:28:53,386 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:28:54,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:28:56,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:28:58,364 INFO [train.py:1039] (2/4) Epoch 20, batch 3700, loss[loss=0.1917, simple_loss=0.2525, pruned_loss=0.06547, over 23822.00 frames. ], tot_loss[loss=0.1765, simple_loss=0.2528, pruned_loss=0.05012, over 4721420.98 frames. ], batch size: 212, lr: 5.12e-03, grad_scale: 8.0 2023-09-30 11:28:58,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-30 11:28:58,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:28:59,290 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.90 vs. limit=22.5 2023-09-30 11:29:00,134 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 11:29:01,224 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.75 vs. limit=15.0 2023-09-30 11:29:03,201 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:29:03,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:29:06,846 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:29:06,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-30 11:29:06,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:29:08,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 11:29:08,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 11:29:11,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 11:29:16,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:29:17,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:29:19,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:29:19,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:29:20,574 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 11:29:22,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:29:25,059 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-30 11:29:31,981 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.879e+02 2.205e+02 2.626e+02 3.954e+02, threshold=4.410e+02, percent-clipped=1.0 2023-09-30 11:29:32,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:29:32,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 11:29:33,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 11:29:33,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-30 11:29:33,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:29:35,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:29:37,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-30 11:29:38,848 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:29:40,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:29:44,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:29:44,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 11:29:45,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 11:29:48,120 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.62 vs. limit=15.0 2023-09-30 11:29:48,884 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:29:48,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-30 11:29:50,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:29:50,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-30 11:29:52,152 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=697733.3333333334, ans=0.09899494936611666 2023-09-30 11:29:54,475 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=697733.3333333334, ans=0.125 2023-09-30 11:29:55,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:29:56,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:29:58,845 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=697733.3333333334, ans=0.125 2023-09-30 11:30:00,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:30:00,223 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=697733.3333333334, ans=0.125 2023-09-30 11:30:01,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-30 11:30:03,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:30:03,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-30 11:30:03,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:30:03,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:30:08,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:30:09,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-30 11:30:10,012 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=697800.0, ans=0.0 2023-09-30 11:30:11,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-30 11:30:11,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:30:11,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:30:13,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:30:14,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:30:18,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:30:21,113 INFO [train.py:1039] (2/4) Epoch 20, batch 3750, loss[loss=0.1634, simple_loss=0.2459, pruned_loss=0.04048, over 24494.00 frames. ], tot_loss[loss=0.1785, simple_loss=0.2545, pruned_loss=0.0512, over 4710933.03 frames. ], batch size: 63, lr: 5.12e-03, grad_scale: 8.0 2023-09-30 11:30:21,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:30:21,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:30:24,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-30 11:30:25,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 11:30:27,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-30 11:30:29,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-30 11:30:29,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:30:31,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:30:33,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:30:34,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:30:40,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:30:43,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:30:43,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 11:30:45,287 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=697933.3333333334, ans=0.125 2023-09-30 11:30:46,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:30:49,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:30:51,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-30 11:30:51,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:30:52,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:30:54,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:30:57,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-30 11:31:02,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-30 11:31:02,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:31:04,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:31:04,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:31:11,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:31:13,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-30 11:31:16,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-30 11:31:19,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:31:21,975 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=698066.6666666666, ans=0.125 2023-09-30 11:31:23,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:31:23,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:31:26,385 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 11:31:28,233 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=698133.3333333334, ans=0.125 2023-09-30 11:31:28,273 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=698133.3333333334, ans=0.07 2023-09-30 11:31:30,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 11:31:33,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-30 11:31:34,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 11:31:36,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:31:38,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-30 11:31:45,022 INFO [train.py:1039] (2/4) Epoch 20, batch 3800, loss[loss=0.1774, simple_loss=0.2317, pruned_loss=0.06152, over 19350.00 frames. ], tot_loss[loss=0.1778, simple_loss=0.2535, pruned_loss=0.05105, over 4704816.84 frames. ], batch size: 388, lr: 5.12e-03, grad_scale: 8.0 2023-09-30 11:31:47,153 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:31:50,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:31:52,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 11:31:53,502 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-30 11:31:56,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:31:57,856 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:31:59,391 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-30 11:31:59,751 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=698266.6666666666, ans=0.125 2023-09-30 11:32:00,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 11:32:00,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:32:01,063 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:32:03,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:32:03,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:32:04,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:32:06,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-30 11:32:06,655 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=698266.6666666666, ans=0.125 2023-09-30 11:32:08,502 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.82 vs. limit=10.0 2023-09-30 11:32:09,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-30 11:32:09,656 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:32:13,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:32:15,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:32:15,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 11:32:18,371 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.841e+02 1.992e+02 2.236e+02 3.615e+02, threshold=3.984e+02, percent-clipped=0.0 2023-09-30 11:32:18,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-30 11:32:18,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:32:20,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:32:20,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:32:27,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 11:32:27,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-30 11:32:27,392 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=698333.3333333334, ans=0.0 2023-09-30 11:32:28,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:32:36,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:32:41,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:32:41,986 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=698400.0, ans=0.125 2023-09-30 11:32:46,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-30 11:32:48,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-30 11:32:48,992 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:32:51,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:32:53,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:32:55,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-30 11:32:58,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-30 11:32:58,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-30 11:32:58,893 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=698466.6666666666, ans=0.0 2023-09-30 11:33:00,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:33:00,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:33:06,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:33:07,655 INFO [train.py:1039] (2/4) Epoch 20, batch 3850, loss[loss=0.1703, simple_loss=0.2536, pruned_loss=0.0435, over 24645.00 frames. ], tot_loss[loss=0.1772, simple_loss=0.2525, pruned_loss=0.05097, over 4706408.28 frames. ], batch size: 65, lr: 5.12e-03, grad_scale: 8.0 2023-09-30 11:33:07,785 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 11:33:12,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:33:14,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-30 11:33:16,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 11:33:17,554 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:33:21,243 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 11:33:23,562 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:33:26,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-30 11:33:28,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-30 11:33:34,777 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:33:36,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:33:39,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:33:39,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:33:41,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:33:41,598 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:33:43,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:33:43,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:33:43,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:33:44,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:33:46,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:33:46,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:33:48,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-30 11:33:48,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-30 11:33:49,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:33:49,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:33:54,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:33:54,633 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=698666.6666666666, ans=0.125 2023-09-30 11:33:55,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:33:55,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-30 11:33:58,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-30 11:34:00,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:34:02,475 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-30 11:34:04,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-30 11:34:08,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:34:10,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:34:13,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:34:13,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-30 11:34:13,633 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=698800.0, ans=0.125 2023-09-30 11:34:16,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-30 11:34:18,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:34:18,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:34:21,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 11:34:21,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:34:23,442 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:34:23,570 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:34:23,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:34:23,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-30 11:34:23,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:34:27,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-30 11:34:27,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:34:27,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:34:28,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:34:29,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:34:30,662 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:34:32,492 INFO [train.py:1039] (2/4) Epoch 20, batch 3900, loss[loss=0.1547, simple_loss=0.203, pruned_loss=0.05322, over 19027.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2508, pruned_loss=0.05034, over 4703131.48 frames. ], batch size: 388, lr: 5.12e-03, grad_scale: 8.0 2023-09-30 11:34:32,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:34:32,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:34:32,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:34:32,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-30 11:34:32,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:34:37,118 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:34:38,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 11:34:38,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:34:40,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:34:43,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 11:34:43,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:34:46,185 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-30 11:34:46,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-30 11:34:47,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:34:49,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-30 11:34:49,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:34:51,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-30 11:34:52,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-30 11:34:58,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:34:59,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:34:59,809 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 11:35:01,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:35:04,762 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.560e+02 1.817e+02 2.055e+02 2.286e+02 3.490e+02, threshold=4.109e+02, percent-clipped=0.0 2023-09-30 11:35:06,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:35:08,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:35:10,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:35:10,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:35:11,727 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:35:17,163 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.86 vs. limit=15.0 2023-09-30 11:35:17,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:35:17,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:35:24,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 11:35:24,765 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=699066.6666666666, ans=0.1 2023-09-30 11:35:25,845 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:35:37,904 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:35:41,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:35:43,372 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-30 11:35:43,420 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-30 11:35:43,466 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:35:45,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-30 11:35:46,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:35:47,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-30 11:35:53,934 INFO [train.py:1039] (2/4) Epoch 20, batch 3950, loss[loss=0.1657, simple_loss=0.2464, pruned_loss=0.04251, over 24489.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2513, pruned_loss=0.05003, over 4712424.16 frames. ], batch size: 63, lr: 5.12e-03, grad_scale: 8.0 2023-09-30 11:35:57,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:35:57,168 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-30 11:35:58,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:36:02,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:36:04,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:36:13,263 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-30 11:36:14,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 11:36:14,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-30 11:36:14,864 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-30 11:36:16,251 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:36:18,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:36:18,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-30 11:36:18,563 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:36:22,353 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-30 11:36:25,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:36:25,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 11:36:25,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 11:36:25,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:36:26,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:36:27,780 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.37 vs. limit=22.5 2023-09-30 11:36:35,061 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=699333.3333333334, ans=0.1 2023-09-30 11:36:36,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:36:36,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:36:41,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-30 11:36:48,193 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-30 11:36:48,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-30 11:36:48,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:36:48,625 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=699400.0, ans=0.2 2023-09-30 11:36:50,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:36:51,063 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.06 vs. limit=10.0 2023-09-30 11:36:55,166 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=699400.0, ans=0.125 2023-09-30 11:36:58,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:36:58,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-30 11:36:59,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:36:59,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:37:00,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-30 11:37:01,300 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=699466.6666666666, ans=0.0 2023-09-30 11:37:04,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:37:05,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:37:06,078 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=699466.6666666666, ans=0.07 2023-09-30 11:37:08,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-30 11:37:15,281 INFO [train.py:1039] (2/4) Epoch 20, batch 4000, loss[loss=0.1934, simple_loss=0.2608, pruned_loss=0.06306, over 22816.00 frames. ], tot_loss[loss=0.1768, simple_loss=0.2527, pruned_loss=0.05044, over 4721340.74 frames. ], batch size: 322, lr: 5.11e-03, grad_scale: 16.0 2023-09-30 11:37:20,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:37:27,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:37:29,503 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=699533.3333333334, ans=0.125 2023-09-30 11:37:33,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:37:33,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:37:35,395 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:37:35,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-30 11:37:35,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-30 11:37:35,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-30 11:37:35,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:37:35,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-30 11:37:37,628 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.05 vs. limit=12.0 2023-09-30 11:37:38,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:37:41,146 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.45 vs. limit=15.0 2023-09-30 11:37:43,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:37:43,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:37:43,234 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:37:43,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:37:43,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-30 11:37:44,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-30 11:37:45,014 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-30 11:37:46,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:37:46,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:37:48,536 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.791e+02 1.984e+02 2.297e+02 3.398e+02, threshold=3.968e+02, percent-clipped=0.0 2023-09-30 11:37:50,244 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-30 11:37:50,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 11:37:50,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:37:59,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-30 11:37:59,303 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:38:00,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:38:02,424 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-30 11:38:03,999 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:38:04,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-30 11:38:04,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:38:05,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:38:05,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:38:07,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:38:08,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-30 11:38:08,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:38:10,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-30 11:38:11,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:38:13,125 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-30 11:38:19,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 11:38:22,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 11:38:25,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:38:26,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:38:27,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:38:27,619 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=699800.0, ans=0.125 2023-09-30 11:38:27,749 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=699800.0, ans=0.1 2023-09-30 11:38:28,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:38:33,081 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=699800.0, ans=0.125 2023-09-30 11:38:34,983 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:38:36,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-30 11:38:36,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-30 11:38:37,994 INFO [train.py:1039] (2/4) Epoch 20, batch 4050, loss[loss=0.1765, simple_loss=0.2622, pruned_loss=0.04539, over 24548.00 frames. ], tot_loss[loss=0.1772, simple_loss=0.2534, pruned_loss=0.0505, over 4724470.20 frames. ], batch size: 71, lr: 5.11e-03, grad_scale: 16.0 2023-09-30 11:38:39,688 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 11:38:39,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:38:41,147 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:38:41,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:38:42,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:38:47,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:38:50,542 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:38:52,068 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 11:38:55,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:38:55,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:38:59,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:39:02,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:39:03,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 11:39:07,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-30 11:39:07,388 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-30 11:39:09,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:39:11,346 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=700000.0, ans=0.0 2023-09-30 11:39:12,912 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=700000.0, ans=0.2 2023-09-30 11:39:14,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-30 11:39:15,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:39:20,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:39:21,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:39:23,399 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:39:23,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:39:26,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:39:28,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-30 11:39:28,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 11:39:31,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:39:33,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-30 11:39:39,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:39:46,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-30 11:39:47,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:39:47,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 11:39:49,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-30 11:39:49,117 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-30 11:39:49,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:39:52,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:39:53,743 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:39:53,781 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:40:00,018 INFO [train.py:1039] (2/4) Epoch 20, batch 4100, loss[loss=0.1661, simple_loss=0.2449, pruned_loss=0.04363, over 24304.00 frames. ], tot_loss[loss=0.1781, simple_loss=0.2544, pruned_loss=0.05093, over 4730840.11 frames. ], batch size: 61, lr: 5.11e-03, grad_scale: 8.0 2023-09-30 11:40:01,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-30 11:40:03,367 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-30 11:40:05,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-30 11:40:07,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-30 11:40:07,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:40:09,219 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:40:09,285 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:40:09,307 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:40:10,790 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-30 11:40:14,395 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:40:14,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:40:14,539 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:40:16,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:40:21,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 11:40:22,534 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:40:22,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:40:22,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-30 11:40:22,694 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=700266.6666666666, ans=0.1 2023-09-30 11:40:24,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:40:24,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:40:25,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:40:25,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:40:25,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-30 11:40:28,812 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:40:30,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-30 11:40:31,807 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:40:34,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:40:34,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-30 11:40:35,113 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=700333.3333333334, ans=0.125 2023-09-30 11:40:36,118 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.811e+02 2.046e+02 2.303e+02 3.809e+02, threshold=4.092e+02, percent-clipped=0.0 2023-09-30 11:40:36,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:40:37,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:40:37,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-30 11:40:41,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-30 11:40:43,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-30 11:40:43,613 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:40:46,655 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-30 11:40:48,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:40:48,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:40:51,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:40:53,787 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.69 vs. limit=15.0 2023-09-30 11:40:58,348 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:41:02,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:41:04,375 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:41:10,866 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=700466.6666666666, ans=0.125 2023-09-30 11:41:13,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:41:13,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:41:18,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:41:18,673 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=700466.6666666666, ans=0.125 2023-09-30 11:41:20,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:41:22,560 INFO [train.py:1039] (2/4) Epoch 20, batch 4150, loss[loss=0.1736, simple_loss=0.2353, pruned_loss=0.05599, over 23589.00 frames. ], tot_loss[loss=0.1775, simple_loss=0.2537, pruned_loss=0.05066, over 4726747.36 frames. ], batch size: 135, lr: 5.11e-03, grad_scale: 4.0 2023-09-30 11:41:26,272 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:41:27,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 11:41:27,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:41:27,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:41:30,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-30 11:41:30,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:41:32,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-30 11:41:32,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-30 11:41:32,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-30 11:41:34,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:41:40,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:41:40,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:41:44,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:41:45,485 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:41:46,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-30 11:41:48,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 11:41:48,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:41:48,866 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=700600.0, ans=0.2 2023-09-30 11:41:50,109 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-30 11:41:51,903 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=700600.0, ans=0.0 2023-09-30 11:41:55,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:41:59,921 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:42:00,179 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=700666.6666666666, ans=0.2 2023-09-30 11:42:01,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-30 11:42:02,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-30 11:42:02,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:42:04,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-30 11:42:04,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:42:04,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:42:07,825 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=700666.6666666666, ans=0.0 2023-09-30 11:42:09,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:42:11,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:42:14,560 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=700733.3333333334, ans=0.2 2023-09-30 11:42:15,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-30 11:42:18,675 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-30 11:42:20,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:42:20,480 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-30 11:42:20,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:42:23,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-30 11:42:25,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:42:26,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:42:26,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:42:27,139 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=700800.0, ans=0.0 2023-09-30 11:42:28,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-30 11:42:28,242 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:42:29,547 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-30 11:42:31,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 11:42:34,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-30 11:42:34,917 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:42:34,925 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 11:42:34,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 11:42:36,488 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-30 11:42:36,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:42:36,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 11:42:36,675 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:42:38,353 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=700800.0, ans=0.0 2023-09-30 11:42:39,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:42:39,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-30 11:42:41,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-30 11:42:44,263 INFO [train.py:1039] (2/4) Epoch 20, batch 4200, loss[loss=0.1797, simple_loss=0.2432, pruned_loss=0.05814, over 23858.00 frames. ], tot_loss[loss=0.1771, simple_loss=0.2526, pruned_loss=0.05076, over 4709542.78 frames. ], batch size: 195, lr: 5.11e-03, grad_scale: 8.0 2023-09-30 11:42:46,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:42:46,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-30 11:42:49,977 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 11:42:51,670 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:42:54,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 11:42:54,578 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:42:54,580 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:42:56,663 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=700866.6666666666, ans=0.125 2023-09-30 11:42:57,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-30 11:42:59,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-30 11:43:00,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:43:03,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:43:05,536 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=700933.3333333334, ans=0.0 2023-09-30 11:43:06,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:43:09,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-30 11:43:11,169 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:43:12,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:43:12,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-30 11:43:12,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:43:12,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:43:14,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:43:14,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 11:43:16,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 11:43:16,478 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=701000.0, ans=0.125 2023-09-30 11:43:17,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-30 11:43:19,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:43:20,933 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.891e+02 2.083e+02 2.448e+02 3.727e+02, threshold=4.165e+02, percent-clipped=0.0 2023-09-30 11:43:25,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-30 11:43:27,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:43:28,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:43:30,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:43:32,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:43:32,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-30 11:43:32,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:43:33,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:43:40,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-30 11:43:40,359 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:43:43,742 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=701066.6666666666, ans=0.04949747468305833 2023-09-30 11:43:43,856 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=701066.6666666666, ans=0.2 2023-09-30 11:43:46,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:43:49,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-30 11:43:51,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:43:56,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 11:43:56,742 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=701133.3333333334, ans=0.0 2023-09-30 11:43:58,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:44:00,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-30 11:44:05,345 INFO [train.py:1039] (2/4) Epoch 20, batch 4250, loss[loss=0.1567, simple_loss=0.2271, pruned_loss=0.04312, over 23781.00 frames. ], tot_loss[loss=0.176, simple_loss=0.2512, pruned_loss=0.05035, over 4707001.13 frames. ], batch size: 212, lr: 5.11e-03, grad_scale: 8.0 2023-09-30 11:44:06,936 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:44:07,525 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=701200.0, ans=0.1 2023-09-30 11:44:11,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:44:11,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:44:13,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:44:17,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-30 11:44:19,110 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-30 11:44:19,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:44:23,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:44:27,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:44:27,627 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=701266.6666666666, ans=0.125 2023-09-30 11:44:29,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:44:30,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:44:32,189 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:44:32,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:44:35,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:44:36,821 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:44:36,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:44:40,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:44:41,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:44:43,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-30 11:44:48,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-30 11:44:48,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:44:48,725 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=701333.3333333334, ans=0.125 2023-09-30 11:44:50,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:44:50,097 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:44:51,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-30 11:44:51,590 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:44:51,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:44:54,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-30 11:44:54,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:44:58,091 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=701400.0, ans=0.0 2023-09-30 11:44:59,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:45:01,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:45:03,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-30 11:45:03,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:45:03,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-30 11:45:04,713 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:45:06,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-30 11:45:07,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:45:07,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:45:09,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-30 11:45:11,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 11:45:12,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-30 11:45:18,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:45:21,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:45:23,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:45:23,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:45:25,113 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:45:26,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:45:27,986 INFO [train.py:1039] (2/4) Epoch 20, batch 4300, loss[loss=0.1625, simple_loss=0.2456, pruned_loss=0.03972, over 24678.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2505, pruned_loss=0.04997, over 4698981.88 frames. ], batch size: 65, lr: 5.11e-03, grad_scale: 8.0 2023-09-30 11:45:28,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:45:28,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-30 11:45:29,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:45:34,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:45:34,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:45:34,606 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:45:39,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:45:44,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:45:44,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-30 11:45:44,493 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:45:48,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:45:48,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:45:48,592 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-30 11:45:51,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 11:45:54,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 11:45:57,744 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-30 11:45:57,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:45:57,810 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-30 11:46:00,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 11:46:03,841 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.479e+02 1.784e+02 1.943e+02 2.161e+02 2.799e+02, threshold=3.885e+02, percent-clipped=0.0 2023-09-30 11:46:03,959 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:46:05,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:46:05,738 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:46:07,202 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:46:08,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:46:10,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:46:10,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-30 11:46:12,387 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-30 11:46:15,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:46:15,663 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=701733.3333333334, ans=0.0 2023-09-30 11:46:18,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:46:19,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 11:46:20,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:46:20,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:46:20,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-30 11:46:20,473 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-30 11:46:20,580 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-30 11:46:22,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:46:22,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-30 11:46:24,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-30 11:46:27,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:46:29,438 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-30 11:46:30,859 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:46:32,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:46:32,387 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:46:35,472 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-30 11:46:35,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 11:46:35,587 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:46:37,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:46:37,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:46:37,267 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:46:37,516 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=701800.0, ans=0.125 2023-09-30 11:46:40,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:46:43,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:46:44,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:46:44,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:46:49,820 INFO [train.py:1039] (2/4) Epoch 20, batch 4350, loss[loss=0.1531, simple_loss=0.2387, pruned_loss=0.0338, over 24334.00 frames. ], tot_loss[loss=0.1762, simple_loss=0.2514, pruned_loss=0.05047, over 4698806.38 frames. ], batch size: 61, lr: 5.11e-03, grad_scale: 8.0 2023-09-30 11:46:51,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-30 11:46:51,473 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-30 11:46:57,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:47:00,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:47:04,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:47:04,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:47:06,149 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=701933.3333333334, ans=0.125 2023-09-30 11:47:08,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 11:47:13,268 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:47:14,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:47:15,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:47:19,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:47:21,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:47:22,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-30 11:47:30,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-30 11:47:30,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:47:32,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:47:36,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:47:39,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-30 11:47:42,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:47:44,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 11:47:47,371 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-30 11:47:48,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:47:48,919 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:47:50,388 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-30 11:47:51,822 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-30 11:47:51,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:47:51,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:47:53,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:47:54,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:47:56,511 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:47:56,583 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:47:58,906 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-30 11:47:58,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:47:58,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:47:58,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:48:00,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-30 11:48:01,947 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-30 11:48:01,954 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-30 11:48:01,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-30 11:48:07,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:48:09,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 11:48:09,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:48:09,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:48:10,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-30 11:48:13,803 INFO [train.py:1039] (2/4) Epoch 20, batch 4400, loss[loss=0.1501, simple_loss=0.2249, pruned_loss=0.03764, over 24273.00 frames. ], tot_loss[loss=0.177, simple_loss=0.2522, pruned_loss=0.05088, over 4716034.06 frames. ], batch size: 56, lr: 5.10e-03, grad_scale: 16.0 2023-09-30 11:48:13,962 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-30 11:48:13,972 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:48:18,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:48:18,590 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:48:20,238 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:48:23,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-30 11:48:23,249 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-30 11:48:23,323 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-30 11:48:23,358 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-30 11:48:24,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 11:48:24,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:48:27,868 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-30 11:48:29,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:48:31,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:48:31,045 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-30 11:48:32,973 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=702266.6666666666, ans=0.1 2023-09-30 11:48:34,894 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:48:34,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-30 11:48:34,964 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-30 11:48:36,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-30 11:48:38,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-30 11:48:38,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-30 11:48:38,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:48:40,992 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:48:42,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:48:42,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:48:43,326 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=11.74 vs. limit=15.0 2023-09-30 11:48:44,836 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=702266.6666666666, ans=0.125 2023-09-30 11:48:45,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-30 11:48:45,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-30 11:48:47,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:48:49,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:48:49,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:48:49,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:48:50,488 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.844e+02 2.019e+02 2.293e+02 3.220e+02, threshold=4.037e+02, percent-clipped=0.0 2023-09-30 11:48:50,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:48:50,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-30 11:48:52,213 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-30 11:48:55,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:49:01,498 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:49:03,328 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=702400.0, ans=0.2 2023-09-30 11:49:04,609 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-30 11:49:09,521 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:49:12,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:49:14,768 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 11:49:14,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-30 11:49:16,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:49:16,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-30 11:49:16,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:49:18,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-30 11:49:23,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-30 11:49:27,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-30 11:49:29,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-30 11:49:29,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:49:29,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-30 11:49:30,843 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:49:33,859 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:49:35,352 INFO [train.py:1039] (2/4) Epoch 20, batch 4450, loss[loss=0.1868, simple_loss=0.2682, pruned_loss=0.05272, over 23972.00 frames. ], tot_loss[loss=0.1781, simple_loss=0.2537, pruned_loss=0.05128, over 4708087.05 frames. ], batch size: 86, lr: 5.10e-03, grad_scale: 8.0 2023-09-30 11:49:35,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-30 11:49:38,974 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=702533.3333333334, ans=0.0 2023-09-30 11:49:40,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:49:43,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:49:43,554 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:49:45,998 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.04 vs. limit=15.0 2023-09-30 11:49:50,279 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:49:50,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:49:54,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:49:57,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:49:58,965 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=702600.0, ans=0.125 2023-09-30 11:50:01,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:50:01,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:50:01,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-30 11:50:01,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:50:02,037 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=702600.0, ans=0.05 2023-09-30 11:50:03,388 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:50:03,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:50:03,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-30 11:50:06,518 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 11:50:09,016 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=702666.6666666666, ans=0.125 2023-09-30 11:50:10,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:50:10,191 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:50:11,706 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:50:11,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:50:13,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:50:17,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 11:50:18,522 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-30 11:50:19,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-30 11:50:19,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:50:19,403 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer_ff3.min_abs, batch_count=702666.6666666666, ans=0.2 2023-09-30 11:50:20,891 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=702666.6666666666, ans=0.125 2023-09-30 11:50:22,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:50:22,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-30 11:50:29,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-30 11:50:32,465 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:50:33,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-30 11:50:33,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:50:33,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:50:33,967 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:50:33,978 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:50:36,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:50:41,520 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-30 11:50:41,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-30 11:50:43,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 11:50:44,725 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:50:46,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:50:47,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:50:47,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 11:50:50,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-30 11:50:54,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-30 11:50:55,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:50:57,895 INFO [train.py:1039] (2/4) Epoch 20, batch 4500, loss[loss=0.1748, simple_loss=0.2599, pruned_loss=0.04482, over 24692.00 frames. ], tot_loss[loss=0.1782, simple_loss=0.2541, pruned_loss=0.05109, over 4708578.04 frames. ], batch size: 68, lr: 5.10e-03, grad_scale: 8.0 2023-09-30 11:51:03,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:51:04,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-30 11:51:04,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-30 11:51:05,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:51:10,239 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:51:10,330 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:51:11,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 11:51:11,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:51:11,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:51:13,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:51:21,621 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=702933.3333333334, ans=0.125 2023-09-30 11:51:24,656 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=702933.3333333334, ans=0.125 2023-09-30 11:51:25,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:51:25,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:51:31,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:51:31,264 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:51:33,299 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 11:51:36,931 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.958e+02 2.176e+02 2.653e+02 3.969e+02, threshold=4.352e+02, percent-clipped=0.0 2023-09-30 11:51:38,714 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 11:51:41,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-30 11:51:45,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 11:51:48,304 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:51:48,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-30 11:51:48,465 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:51:49,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:51:50,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:51:51,442 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:51:55,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:51:55,751 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-30 11:51:55,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 11:51:55,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:52:00,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:52:00,453 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:52:06,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:52:06,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-30 11:52:08,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:52:09,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-30 11:52:12,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-30 11:52:12,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-30 11:52:16,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-30 11:52:16,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-30 11:52:17,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:52:20,723 INFO [train.py:1039] (2/4) Epoch 20, batch 4550, loss[loss=0.1701, simple_loss=0.2595, pruned_loss=0.04037, over 24654.00 frames. ], tot_loss[loss=0.1778, simple_loss=0.2533, pruned_loss=0.0512, over 4700233.70 frames. ], batch size: 68, lr: 5.10e-03, grad_scale: 8.0 2023-09-30 11:52:22,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:52:22,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:52:24,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:52:24,335 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=703200.0, ans=0.125 2023-09-30 11:52:28,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:52:30,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:52:34,133 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:52:34,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:52:34,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:52:37,946 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:52:38,018 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:52:41,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:52:44,180 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-30 11:52:45,672 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-30 11:52:47,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:52:48,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-30 11:52:51,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-30 11:52:52,569 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.70 vs. limit=15.0 2023-09-30 11:52:53,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:52:56,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-30 11:52:58,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 11:52:58,305 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=703333.3333333334, ans=0.0 2023-09-30 11:53:02,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:53:03,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:53:03,946 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:53:05,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-30 11:53:09,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:53:11,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:53:11,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:53:11,990 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=703400.0, ans=0.125 2023-09-30 11:53:13,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:53:14,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-30 11:53:14,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-30 11:53:14,751 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:53:16,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-30 11:53:19,182 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-30 11:53:19,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:53:20,734 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:53:20,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:53:22,352 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=703400.0, ans=0.125 2023-09-30 11:53:23,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:53:23,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:53:25,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 11:53:25,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-30 11:53:26,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:53:26,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 11:53:26,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-30 11:53:27,004 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=703466.6666666666, ans=0.125 2023-09-30 11:53:28,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:53:28,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-30 11:53:31,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:53:31,274 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:53:35,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:53:35,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:53:35,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-30 11:53:38,640 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:53:38,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-30 11:53:40,353 INFO [train.py:1039] (2/4) Epoch 20, batch 4600, loss[loss=0.1927, simple_loss=0.2739, pruned_loss=0.05572, over 23953.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.252, pruned_loss=0.05042, over 4706239.04 frames. ], batch size: 80, lr: 5.10e-03, grad_scale: 8.0 2023-09-30 11:53:40,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:53:42,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:53:42,987 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=703533.3333333334, ans=0.125 2023-09-30 11:53:46,956 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:53:46,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 11:53:48,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:53:49,869 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-30 11:53:50,221 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=703533.3333333334, ans=0.2 2023-09-30 11:53:51,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:53:54,625 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=703533.3333333334, ans=0.5 2023-09-30 11:53:56,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-30 11:53:56,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:53:57,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:54:03,395 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.35 vs. limit=15.0 2023-09-30 11:54:04,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-30 11:54:05,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:54:08,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:54:11,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:54:11,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:54:18,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-30 11:54:18,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 11:54:19,794 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.438e+02 1.811e+02 2.166e+02 2.771e+02 4.574e+02, threshold=4.333e+02, percent-clipped=1.0 2023-09-30 11:54:19,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:54:27,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:54:27,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:54:29,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:54:33,551 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-30 11:54:35,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-30 11:54:39,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:54:41,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:54:41,629 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=703733.3333333334, ans=0.125 2023-09-30 11:54:42,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:54:42,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 11:54:42,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:54:44,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-30 11:54:44,414 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:54:45,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:54:47,293 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:54:47,431 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:54:48,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:54:49,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-30 11:54:51,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-30 11:54:51,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-30 11:54:51,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:54:53,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:54:53,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:54:55,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:55:01,773 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=703866.6666666666, ans=10.0 2023-09-30 11:55:01,907 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=703866.6666666666, ans=0.125 2023-09-30 11:55:02,994 INFO [train.py:1039] (2/4) Epoch 20, batch 4650, loss[loss=0.1963, simple_loss=0.2516, pruned_loss=0.07051, over 19603.00 frames. ], tot_loss[loss=0.1758, simple_loss=0.2511, pruned_loss=0.0503, over 4691162.95 frames. ], batch size: 388, lr: 5.10e-03, grad_scale: 8.0 2023-09-30 11:55:06,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-30 11:55:09,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:55:09,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:55:09,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:55:09,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:55:09,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:55:12,473 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:55:15,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-30 11:55:18,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:55:20,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-30 11:55:21,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:55:21,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-30 11:55:23,229 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:55:23,320 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-30 11:55:23,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-30 11:55:25,353 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:55:26,730 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:55:28,608 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 11:55:30,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:55:30,273 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-30 11:55:31,042 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.93 vs. limit=15.0 2023-09-30 11:55:34,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:55:35,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-30 11:55:39,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:55:39,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:55:39,206 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-30 11:55:40,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:55:43,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 11:55:46,874 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:55:52,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:55:54,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:55:56,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:55:56,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 11:56:01,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-30 11:56:01,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-30 11:56:01,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 11:56:01,587 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-30 11:56:03,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:56:03,993 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.12 vs. limit=15.0 2023-09-30 11:56:05,506 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:56:10,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:56:10,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:56:10,130 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-30 11:56:10,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:56:12,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:56:13,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:56:14,601 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:56:17,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:56:17,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:56:19,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:56:21,167 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=704133.3333333334, ans=0.07 2023-09-30 11:56:22,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:56:22,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:56:22,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 11:56:22,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-30 11:56:23,857 INFO [train.py:1039] (2/4) Epoch 20, batch 4700, loss[loss=0.1639, simple_loss=0.2403, pruned_loss=0.04374, over 24451.00 frames. ], tot_loss[loss=0.1761, simple_loss=0.2514, pruned_loss=0.05037, over 4693140.03 frames. ], batch size: 58, lr: 5.10e-03, grad_scale: 8.0 2023-09-30 11:56:24,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:56:25,747 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=704200.0, ans=0.2 2023-09-30 11:56:27,006 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-30 11:56:36,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:56:36,208 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:56:37,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:56:39,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:56:41,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 11:56:44,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-30 11:56:44,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-30 11:56:48,603 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:56:48,733 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:56:50,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:56:53,692 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=704266.6666666666, ans=0.0 2023-09-30 11:56:54,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:57:00,546 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.486e+02 1.959e+02 2.502e+02 2.880e+02 4.077e+02, threshold=5.005e+02, percent-clipped=0.0 2023-09-30 11:57:00,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:57:02,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 11:57:05,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:57:11,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-30 11:57:12,769 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:57:16,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:57:19,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-30 11:57:21,053 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:57:24,192 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:57:24,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-30 11:57:27,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:57:27,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:57:30,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:57:30,420 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 11:57:30,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-30 11:57:31,938 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-30 11:57:33,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:57:35,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:57:35,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:57:36,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-30 11:57:36,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:57:42,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-30 11:57:45,973 INFO [train.py:1039] (2/4) Epoch 20, batch 4750, loss[loss=0.1972, simple_loss=0.2741, pruned_loss=0.0601, over 24352.00 frames. ], tot_loss[loss=0.1762, simple_loss=0.2519, pruned_loss=0.05023, over 4698191.61 frames. ], batch size: 77, lr: 5.10e-03, grad_scale: 8.0 2023-09-30 11:57:46,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:57:46,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:57:48,074 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=704533.3333333334, ans=0.125 2023-09-30 11:57:51,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:57:51,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:57:55,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-30 11:57:55,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:58:00,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-30 11:58:01,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:58:01,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:58:03,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:58:05,269 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=704600.0, ans=0.125 2023-09-30 11:58:09,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-30 11:58:14,068 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-30 11:58:16,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-30 11:58:16,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:58:19,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:58:19,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:58:19,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:58:21,474 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-30 11:58:21,479 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-30 11:58:24,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-30 11:58:28,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:58:28,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:58:31,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 11:58:31,585 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-30 11:58:32,193 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.75 vs. limit=15.0 2023-09-30 11:58:32,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:58:34,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:58:37,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:58:39,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-30 11:58:39,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-30 11:58:39,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:58:39,660 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=704733.3333333334, ans=0.0 2023-09-30 11:58:40,894 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:58:40,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:58:42,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 11:58:43,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-30 11:58:48,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-30 11:58:50,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:58:53,779 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:58:53,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-30 11:58:53,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:58:56,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:58:57,204 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=704800.0, ans=0.0 2023-09-30 11:58:58,423 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:58:58,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:58:59,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:59:03,583 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:59:03,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-30 11:59:03,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-30 11:59:05,258 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-30 11:59:05,785 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=5.99 vs. limit=15.0 2023-09-30 11:59:08,054 INFO [train.py:1039] (2/4) Epoch 20, batch 4800, loss[loss=0.17, simple_loss=0.2494, pruned_loss=0.04524, over 24346.00 frames. ], tot_loss[loss=0.1779, simple_loss=0.2536, pruned_loss=0.05114, over 4687262.24 frames. ], batch size: 61, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 11:59:08,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-30 11:59:09,573 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:59:09,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-30 11:59:15,852 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:59:17,216 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:59:23,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:59:23,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:59:24,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:59:25,133 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=704933.3333333334, ans=0.0 2023-09-30 11:59:26,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-30 11:59:26,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:59:26,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:59:28,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:59:33,252 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:59:34,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:59:34,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-30 11:59:36,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:59:36,327 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 11:59:36,349 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:59:38,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:59:38,657 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=704933.3333333334, ans=0.125 2023-09-30 11:59:41,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:59:42,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:59:43,268 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=705000.0, ans=0.0 2023-09-30 11:59:44,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:59:44,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-30 11:59:45,726 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.894e+02 2.052e+02 2.398e+02 3.146e+02, threshold=4.103e+02, percent-clipped=0.0 2023-09-30 11:59:45,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 11:59:46,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:59:48,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-30 11:59:48,873 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-30 11:59:50,350 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:59:50,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:59:52,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:59:52,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:59:52,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:59:53,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 11:59:55,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:59:59,615 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:00:04,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:00:05,788 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:00:12,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-30 12:00:12,448 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:00:12,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:00:12,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 12:00:14,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:00:17,510 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=705133.3333333334, ans=0.0 2023-09-30 12:00:18,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:00:20,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:00:20,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:00:20,841 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.87 vs. limit=22.5 2023-09-30 12:00:21,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:00:21,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 12:00:23,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 12:00:26,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:00:26,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:00:26,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:00:27,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-30 12:00:29,404 INFO [train.py:1039] (2/4) Epoch 20, batch 4850, loss[loss=0.169, simple_loss=0.249, pruned_loss=0.04448, over 24664.00 frames. ], tot_loss[loss=0.1774, simple_loss=0.2533, pruned_loss=0.05074, over 4707125.09 frames. ], batch size: 65, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:00:29,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-30 12:00:29,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:00:29,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:00:33,095 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:00:33,097 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:00:35,473 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:00:45,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-30 12:00:47,175 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:00:51,840 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:00:51,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 12:00:52,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:00:55,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:00:55,303 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=705266.6666666666, ans=0.2 2023-09-30 12:00:57,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:00:59,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:00:59,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-30 12:01:01,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:01:05,969 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-30 12:01:05,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 12:01:06,713 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 12:01:06,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-30 12:01:10,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:01:10,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:01:14,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:01:15,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-30 12:01:15,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-30 12:01:16,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 12:01:22,258 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=705400.0, ans=0.1 2023-09-30 12:01:23,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:01:23,667 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-30 12:01:24,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:01:24,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:01:26,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-30 12:01:26,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-30 12:01:26,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:01:28,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-30 12:01:28,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:01:29,802 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:01:31,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-30 12:01:39,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:01:42,507 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=705466.6666666666, ans=0.125 2023-09-30 12:01:45,162 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:01:45,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:01:45,372 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=705466.6666666666, ans=0.0 2023-09-30 12:01:49,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-30 12:01:49,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:01:51,941 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=705533.3333333334, ans=0.0 2023-09-30 12:01:53,036 INFO [train.py:1039] (2/4) Epoch 20, batch 4900, loss[loss=0.1599, simple_loss=0.2309, pruned_loss=0.04443, over 23397.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.2524, pruned_loss=0.05048, over 4694945.00 frames. ], batch size: 134, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:01:56,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:01:58,364 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.91 vs. limit=22.5 2023-09-30 12:01:58,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:01:58,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:02:02,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-30 12:02:08,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-30 12:02:12,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-30 12:02:13,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-30 12:02:13,873 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-30 12:02:13,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:02:15,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:02:15,419 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:02:15,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:02:15,558 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-30 12:02:18,392 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.50 vs. limit=15.0 2023-09-30 12:02:20,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-30 12:02:20,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 12:02:23,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:02:25,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-30 12:02:28,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:02:28,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:02:29,516 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.474e+02 1.910e+02 2.109e+02 2.496e+02 4.455e+02, threshold=4.218e+02, percent-clipped=1.0 2023-09-30 12:02:31,690 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:02:31,704 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-30 12:02:33,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:02:33,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:02:33,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-30 12:02:33,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-30 12:02:39,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-30 12:02:41,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-30 12:02:42,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:02:42,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:02:44,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:02:44,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 12:02:44,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:02:44,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-30 12:02:49,403 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:02:51,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-30 12:02:52,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:02:56,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-30 12:02:56,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:02:56,547 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-30 12:02:56,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-30 12:03:03,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:03:05,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 12:03:06,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-30 12:03:06,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 12:03:06,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:03:07,657 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.14 vs. limit=15.0 2023-09-30 12:03:09,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:03:14,407 INFO [train.py:1039] (2/4) Epoch 20, batch 4950, loss[loss=0.1765, simple_loss=0.2438, pruned_loss=0.05459, over 23718.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2504, pruned_loss=0.05006, over 4693531.53 frames. ], batch size: 232, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:03:14,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:03:14,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:03:15,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:03:15,991 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-30 12:03:17,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 12:03:20,151 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=705866.6666666666, ans=0.0 2023-09-30 12:03:21,335 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:03:21,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 12:03:24,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-30 12:03:24,549 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-30 12:03:24,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-30 12:03:24,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-30 12:03:26,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:03:26,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:03:26,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-30 12:03:26,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:03:29,347 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:03:29,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:03:31,514 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:03:32,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:03:34,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:03:34,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:03:36,704 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=705933.3333333334, ans=0.125 2023-09-30 12:03:37,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 12:03:43,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:03:43,585 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=705933.3333333334, ans=0.5 2023-09-30 12:03:46,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 12:03:47,685 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:03:49,061 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:03:51,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:03:52,666 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-30 12:03:52,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-30 12:03:56,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:03:58,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:03:58,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-30 12:03:59,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:03:59,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:04:01,176 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-30 12:04:04,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:04:06,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-30 12:04:09,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:04:10,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:04:10,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:04:11,195 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=706066.6666666666, ans=0.125 2023-09-30 12:04:12,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-30 12:04:12,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:04:12,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 12:04:17,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:04:18,200 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=706066.6666666666, ans=0.0 2023-09-30 12:04:19,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:04:19,323 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:04:20,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:04:20,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:04:22,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:04:23,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:04:23,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:04:23,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:04:26,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-30 12:04:34,039 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:04:37,587 INFO [train.py:1039] (2/4) Epoch 20, batch 5000, loss[loss=0.1867, simple_loss=0.2779, pruned_loss=0.04773, over 24466.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2489, pruned_loss=0.04943, over 4698889.08 frames. ], batch size: 69, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:04:39,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-30 12:04:39,345 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-30 12:04:41,266 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:04:47,081 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:04:47,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:04:47,284 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-30 12:04:47,540 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=706200.0, ans=0.1 2023-09-30 12:04:48,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-30 12:04:50,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:04:52,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-30 12:04:53,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:04:53,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 12:04:54,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-30 12:04:55,530 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:04:55,645 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:04:57,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-30 12:04:57,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:04:57,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:04:59,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-30 12:05:00,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-30 12:05:01,439 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=6.01 vs. limit=12.0 2023-09-30 12:05:02,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:05:02,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-30 12:05:02,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 12:05:02,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:05:03,607 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 12:05:03,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-30 12:05:03,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-30 12:05:03,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-30 12:05:05,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:05:06,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:05:07,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-30 12:05:09,565 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:05:11,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:05:11,321 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:05:11,719 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=706333.3333333334, ans=0.0 2023-09-30 12:05:12,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-30 12:05:14,509 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-30 12:05:14,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:05:15,914 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.770e+02 2.000e+02 2.327e+02 3.746e+02, threshold=4.000e+02, percent-clipped=0.0 2023-09-30 12:05:16,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:05:19,382 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-30 12:05:19,746 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=706333.3333333334, ans=0.125 2023-09-30 12:05:21,250 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=706333.3333333334, ans=0.0 2023-09-30 12:05:23,008 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:05:24,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:05:24,541 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:05:24,760 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=706333.3333333334, ans=0.0 2023-09-30 12:05:28,021 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=706400.0, ans=0.0 2023-09-30 12:05:29,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-30 12:05:29,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:05:29,236 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:05:30,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:05:32,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-30 12:05:32,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:05:35,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:05:37,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:05:43,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-30 12:05:47,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:05:50,358 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.56 vs. limit=15.0 2023-09-30 12:05:57,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:05:59,568 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:06:00,958 INFO [train.py:1039] (2/4) Epoch 20, batch 5050, loss[loss=0.178, simple_loss=0.2645, pruned_loss=0.04571, over 24312.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2504, pruned_loss=0.04982, over 4700551.77 frames. ], batch size: 74, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:06:01,018 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:06:01,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:06:01,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 12:06:01,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:06:01,240 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:06:06,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:06:06,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-30 12:06:07,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:06:09,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:06:11,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-30 12:06:12,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-30 12:06:13,332 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=706533.3333333334, ans=0.125 2023-09-30 12:06:15,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:06:15,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:06:16,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 12:06:18,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 12:06:19,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-30 12:06:30,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-30 12:06:30,423 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-30 12:06:31,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:06:31,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-30 12:06:34,371 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 12:06:37,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:06:37,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:06:37,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:06:37,464 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-30 12:06:38,950 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-30 12:06:39,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:06:40,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:06:43,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:06:45,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-30 12:06:47,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:06:49,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-30 12:06:51,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:06:53,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:06:53,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:06:53,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:06:55,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:06:58,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:06:58,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:06:59,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:06:59,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:06:59,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-30 12:06:59,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:07:01,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 12:07:05,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:07:05,973 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-30 12:07:06,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-30 12:07:07,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:07:09,493 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:07:09,548 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-30 12:07:13,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:07:13,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-30 12:07:13,909 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:07:17,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:07:18,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:07:18,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-30 12:07:20,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-30 12:07:24,211 INFO [train.py:1039] (2/4) Epoch 20, batch 5100, loss[loss=0.1826, simple_loss=0.2741, pruned_loss=0.04561, over 24561.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2512, pruned_loss=0.04982, over 4696986.72 frames. ], batch size: 71, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:07:24,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:07:24,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:07:24,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:07:27,882 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-30 12:07:30,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:07:33,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-30 12:07:33,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-30 12:07:35,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:07:36,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:07:40,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:07:40,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-30 12:07:40,219 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-30 12:07:46,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:07:46,966 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:07:51,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:07:53,305 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=706933.3333333334, ans=0.0 2023-09-30 12:07:54,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-30 12:07:54,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:07:58,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:07:58,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-30 12:08:00,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:08:01,551 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.819e+02 2.005e+02 2.241e+02 3.147e+02, threshold=4.010e+02, percent-clipped=0.0 2023-09-30 12:08:01,675 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:08:01,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-30 12:08:03,924 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-30 12:08:05,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:08:05,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-30 12:08:05,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-30 12:08:10,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:08:11,995 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=707066.6666666666, ans=0.125 2023-09-30 12:08:18,322 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:08:20,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-30 12:08:21,536 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-30 12:08:21,560 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-30 12:08:23,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-30 12:08:23,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:08:24,968 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=707066.6666666666, ans=0.0 2023-09-30 12:08:26,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-30 12:08:30,124 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-30 12:08:33,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 12:08:35,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-30 12:08:36,662 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-30 12:08:40,325 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-30 12:08:40,374 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-30 12:08:43,659 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:08:46,125 INFO [train.py:1039] (2/4) Epoch 20, batch 5150, loss[loss=0.1811, simple_loss=0.251, pruned_loss=0.0556, over 23589.00 frames. ], tot_loss[loss=0.176, simple_loss=0.2516, pruned_loss=0.05015, over 4710606.64 frames. ], batch size: 120, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:08:46,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:08:47,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:08:47,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:08:47,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:08:47,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 12:08:49,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:08:51,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-30 12:08:51,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-30 12:08:51,080 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-30 12:08:51,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:08:51,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-30 12:08:53,351 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:08:53,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 12:08:56,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:08:57,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:08:58,240 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=707200.0, ans=0.0 2023-09-30 12:09:03,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 12:09:03,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-30 12:09:04,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:09:04,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:09:07,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-30 12:09:07,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:09:07,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:09:07,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:09:07,879 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 12:09:10,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-30 12:09:11,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:09:11,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 12:09:13,585 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=707266.6666666666, ans=0.95 2023-09-30 12:09:15,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 12:09:16,965 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-30 12:09:18,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:09:23,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-30 12:09:24,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-30 12:09:29,270 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:09:35,986 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.63 vs. limit=22.5 2023-09-30 12:09:36,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:09:37,600 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=16.81 vs. limit=22.5 2023-09-30 12:09:38,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:09:42,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:09:44,720 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:09:47,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-30 12:09:49,586 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:09:51,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:09:51,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 12:09:56,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:09:56,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:09:56,559 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=707466.6666666666, ans=0.1 2023-09-30 12:09:59,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-30 12:09:59,809 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=707466.6666666666, ans=0.0 2023-09-30 12:10:03,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:10:04,104 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 12:10:08,254 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:10:08,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:10:09,492 INFO [train.py:1039] (2/4) Epoch 20, batch 5200, loss[loss=0.1986, simple_loss=0.2614, pruned_loss=0.06789, over 23846.00 frames. ], tot_loss[loss=0.1776, simple_loss=0.2529, pruned_loss=0.05117, over 4692499.14 frames. ], batch size: 195, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:10:09,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-30 12:10:11,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-30 12:10:11,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:10:11,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:10:11,768 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=5.93 vs. limit=15.0 2023-09-30 12:10:14,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:10:14,402 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=707533.3333333334, ans=0.125 2023-09-30 12:10:16,026 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=707533.3333333334, ans=0.2 2023-09-30 12:10:17,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:10:18,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:10:23,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-30 12:10:24,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:10:25,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:10:28,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:10:30,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:10:30,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:10:31,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-30 12:10:33,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 12:10:33,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:10:36,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-30 12:10:38,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-30 12:10:38,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-30 12:10:40,832 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-30 12:10:40,920 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-30 12:10:44,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-30 12:10:46,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:10:46,109 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-30 12:10:46,120 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:10:46,986 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.67 vs. limit=22.5 2023-09-30 12:10:47,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:10:47,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:10:48,963 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.872e+02 2.079e+02 2.395e+02 3.722e+02, threshold=4.157e+02, percent-clipped=0.0 2023-09-30 12:10:49,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-30 12:10:50,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:10:52,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:10:54,530 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-30 12:10:55,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-30 12:10:55,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-30 12:11:01,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-30 12:11:01,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 12:11:05,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:11:06,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:11:07,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-30 12:11:09,112 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:11:09,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-30 12:11:09,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:11:10,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 12:11:14,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:11:14,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:11:14,954 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=707800.0, ans=0.2 2023-09-30 12:11:16,513 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=707800.0, ans=0.0 2023-09-30 12:11:18,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:11:19,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:11:19,857 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:11:25,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:11:25,307 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-30 12:11:26,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:11:26,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:11:30,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:11:30,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-30 12:11:31,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:11:33,434 INFO [train.py:1039] (2/4) Epoch 20, batch 5250, loss[loss=0.1667, simple_loss=0.232, pruned_loss=0.05072, over 23820.00 frames. ], tot_loss[loss=0.1771, simple_loss=0.2521, pruned_loss=0.0511, over 4676088.88 frames. ], batch size: 195, lr: 5.08e-03, grad_scale: 16.0 2023-09-30 12:11:35,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:11:38,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:11:38,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:11:38,996 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.16 vs. limit=15.0 2023-09-30 12:11:39,635 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:11:47,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:11:49,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:11:50,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:11:52,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 12:11:52,804 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer_na.min_abs, batch_count=707933.3333333334, ans=0.02 2023-09-30 12:11:54,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-30 12:11:54,594 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:11:56,090 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:12:47,652 INFO [train.py:1039] (2/4) Epoch 20, batch 5300, loss[loss=0.1778, simple_loss=0.2669, pruned_loss=0.04433, over 24655.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2514, pruned_loss=0.05076, over 4689458.78 frames. ], batch size: 68, lr: 5.08e-03, grad_scale: 8.0 2023-09-30 12:12:50,705 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=708200.0, ans=0.125 2023-09-30 12:12:54,960 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=708200.0, ans=0.125 2023-09-30 12:13:01,633 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=708266.6666666666, ans=0.125 2023-09-30 12:13:02,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:13:02,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-30 12:13:02,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-30 12:13:02,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:13:03,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:13:03,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:13:03,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:13:03,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:13:03,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:13:03,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:13:03,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-30 12:13:04,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:13:04,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-30 12:13:04,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-30 12:13:04,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-30 12:13:04,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-30 12:13:04,655 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-30 12:13:04,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-30 12:13:05,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:13:05,923 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:13:05,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:13:06,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:13:06,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:13:06,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:13:06,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:13:06,822 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:13:06,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:13:06,989 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:13:06,996 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:13:07,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:13:07,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:13:07,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-30 12:13:07,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:13:08,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:13:08,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-30 12:13:08,549 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-30 12:13:08,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:13:08,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:13:08,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-30 12:13:09,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-30 12:13:09,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-30 12:13:10,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:13:10,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:13:10,648 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-30 12:13:10,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-30 12:13:10,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-30 12:13:10,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:13:11,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-30 12:13:11,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-30 12:13:11,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-30 12:13:11,594 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-30 12:13:17,111 INFO [train.py:1039] (2/4) Epoch 21, batch 0, loss[loss=0.1745, simple_loss=0.2574, pruned_loss=0.04576, over 24649.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2574, pruned_loss=0.04576, over 24649.00 frames. ], batch size: 68, lr: 4.96e-03, grad_scale: 16.0 2023-09-30 12:13:17,112 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-30 12:13:30,284 INFO [train.py:1071] (2/4) Epoch 21, validation: loss=0.2775, simple_loss=0.2715, pruned_loss=0.1418, over 1125622.00 frames. 2023-09-30 12:13:30,285 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21076MB 2023-09-30 12:13:34,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-30 12:13:34,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:13:37,512 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:13:40,678 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:13:41,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:13:42,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:13:42,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-30 12:13:42,450 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:13:43,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-30 12:13:47,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:13:47,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:13:50,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:13:50,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:13:50,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:13:52,056 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.851e+02 2.011e+02 2.315e+02 3.678e+02, threshold=4.022e+02, percent-clipped=0.0 2023-09-30 12:13:52,198 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:13:53,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-30 12:13:56,771 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:14:05,047 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 12:14:05,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:14:07,281 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-30 12:14:12,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:14:12,600 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:14:15,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:14:19,199 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:14:19,722 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=708480.0, ans=0.0 2023-09-30 12:14:22,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:14:27,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-30 12:14:30,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-30 12:14:30,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:14:30,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:14:32,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:14:32,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:14:33,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-30 12:14:37,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:14:37,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:14:41,887 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.65 vs. limit=15.0 2023-09-30 12:14:42,726 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:14:43,051 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=708546.6666666666, ans=0.09899494936611666 2023-09-30 12:14:45,896 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-30 12:14:47,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:14:51,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:14:52,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:14:53,951 INFO [train.py:1039] (2/4) Epoch 21, batch 50, loss[loss=0.1451, simple_loss=0.2212, pruned_loss=0.03453, over 24421.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.2541, pruned_loss=0.04966, over 1073440.32 frames. ], batch size: 58, lr: 4.96e-03, grad_scale: 16.0 2023-09-30 12:14:54,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-30 12:14:54,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 12:14:54,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:14:57,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:14:57,299 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:14:57,599 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=708613.3333333334, ans=0.125 2023-09-30 12:15:00,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:15:00,546 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=708613.3333333334, ans=0.125 2023-09-30 12:15:04,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-30 12:15:04,836 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:15:12,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-30 12:15:13,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-30 12:15:15,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-30 12:15:17,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:15:18,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:15:18,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:15:20,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:15:20,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-30 12:15:21,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 12:15:21,939 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:15:29,399 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=6.05 vs. limit=12.0 2023-09-30 12:15:32,488 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.24 vs. limit=15.0 2023-09-30 12:15:33,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:15:34,867 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:15:34,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 12:15:35,277 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=708746.6666666666, ans=0.125 2023-09-30 12:15:36,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-30 12:15:37,975 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:15:39,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 12:15:39,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-30 12:15:40,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:15:42,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-30 12:15:49,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:15:49,422 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:15:51,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:15:51,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:15:51,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:15:56,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-30 12:15:56,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-30 12:15:59,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:15:59,336 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:16:00,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:16:02,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:16:02,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-30 12:16:04,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-30 12:16:04,442 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-30 12:16:05,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:16:07,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:16:08,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-30 12:16:08,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-30 12:16:10,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:16:11,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:16:13,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-30 12:16:13,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:16:14,846 INFO [train.py:1039] (2/4) Epoch 21, batch 100, loss[loss=0.1936, simple_loss=0.2716, pruned_loss=0.05782, over 23967.00 frames. ], tot_loss[loss=0.1775, simple_loss=0.2554, pruned_loss=0.04983, over 1895224.37 frames. ], batch size: 86, lr: 4.96e-03, grad_scale: 8.0 2023-09-30 12:16:16,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:16:18,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:16:18,797 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.86 vs. limit=15.0 2023-09-30 12:16:22,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:16:24,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-30 12:16:24,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:16:30,685 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:16:30,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:16:30,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:16:30,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:16:30,805 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:16:33,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-30 12:16:35,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-30 12:16:36,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:16:36,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:16:36,647 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:16:38,921 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.524e+02 1.766e+02 1.906e+02 2.268e+02 3.553e+02, threshold=3.812e+02, percent-clipped=0.0 2023-09-30 12:16:40,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-30 12:16:42,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:16:42,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:16:42,435 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-30 12:16:45,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 12:16:48,686 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-30 12:16:48,711 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-30 12:16:50,286 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:16:50,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:16:54,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-30 12:16:56,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:16:58,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:17:03,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:17:05,290 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-30 12:17:06,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-30 12:17:08,620 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=709146.6666666666, ans=0.5 2023-09-30 12:17:09,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:17:12,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:17:12,463 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=709146.6666666666, ans=0.0 2023-09-30 12:17:13,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:17:16,598 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:17:18,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:17:19,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:17:21,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:17:22,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:17:24,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:17:24,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:17:24,246 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:17:24,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-30 12:17:24,363 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-30 12:17:24,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:17:25,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 12:17:26,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:17:26,028 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:17:26,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 12:17:26,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 12:17:27,590 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-30 12:17:27,601 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:17:29,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:17:30,731 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:17:31,730 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=709213.3333333334, ans=0.125 2023-09-30 12:17:32,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:17:32,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:17:34,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:17:36,614 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=709280.0, ans=0.125 2023-09-30 12:17:37,838 INFO [train.py:1039] (2/4) Epoch 21, batch 150, loss[loss=0.1711, simple_loss=0.2463, pruned_loss=0.04791, over 21111.00 frames. ], tot_loss[loss=0.1786, simple_loss=0.2555, pruned_loss=0.0508, over 2519837.52 frames. ], batch size: 46, lr: 4.95e-03, grad_scale: 8.0 2023-09-30 12:17:37,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:17:37,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:17:38,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:17:39,983 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=709280.0, ans=0.125 2023-09-30 12:17:41,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:17:41,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:17:43,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:17:44,965 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:17:49,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-30 12:17:49,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-30 12:17:49,544 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-30 12:17:54,282 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:17:54,290 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 12:17:54,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:17:55,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:17:55,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:17:56,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:17:57,482 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:18:00,455 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-30 12:18:02,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:18:09,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:18:09,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=709413.3333333334, ans=0.0 2023-09-30 12:18:09,650 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=709413.3333333334, ans=0.0 2023-09-30 12:18:11,177 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=709413.3333333334, ans=0.125 2023-09-30 12:18:12,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:18:14,037 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-30 12:18:17,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:18:17,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:18:17,253 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:18:20,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:18:21,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:18:22,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-30 12:18:25,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:18:25,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-30 12:18:31,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:18:31,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:18:33,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:18:33,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:18:33,519 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=709480.0, ans=0.0 2023-09-30 12:18:34,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:18:36,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 12:18:36,753 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=709480.0, ans=0.125 2023-09-30 12:18:37,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:18:40,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 12:18:40,387 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=709480.0, ans=0.09899494936611666 2023-09-30 12:18:42,259 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:18:43,739 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-30 12:18:43,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-30 12:18:43,821 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=709546.6666666666, ans=0.125 2023-09-30 12:18:45,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:18:45,801 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-30 12:18:50,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:18:53,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:18:53,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:18:56,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-30 12:18:56,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:18:58,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:19:00,184 INFO [train.py:1039] (2/4) Epoch 21, batch 200, loss[loss=0.1771, simple_loss=0.2675, pruned_loss=0.04335, over 24040.00 frames. ], tot_loss[loss=0.1807, simple_loss=0.2568, pruned_loss=0.05232, over 2996630.01 frames. ], batch size: 80, lr: 4.95e-03, grad_scale: 8.0 2023-09-30 12:19:00,369 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-30 12:19:00,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-30 12:19:00,709 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=709613.3333333334, ans=0.125 2023-09-30 12:19:02,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:19:03,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:19:08,688 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.47 vs. limit=22.5 2023-09-30 12:19:09,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:19:09,371 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:19:09,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:19:22,966 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.816e+02 2.113e+02 2.431e+02 3.187e+02, threshold=4.227e+02, percent-clipped=0.0 2023-09-30 12:19:27,956 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=709680.0, ans=0.125 2023-09-30 12:19:30,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:19:30,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:19:34,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:19:34,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:19:34,615 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=709746.6666666666, ans=0.025 2023-09-30 12:19:35,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 12:19:35,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 12:19:37,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:19:39,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 12:19:39,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:19:39,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:19:40,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-30 12:19:40,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 12:19:42,233 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:19:45,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:19:53,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:20:02,812 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:20:02,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:20:10,597 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:20:11,484 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.81 vs. limit=15.0 2023-09-30 12:20:14,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-30 12:20:14,562 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:20:15,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-30 12:20:15,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:20:17,396 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:20:20,271 INFO [train.py:1039] (2/4) Epoch 21, batch 250, loss[loss=0.1709, simple_loss=0.2575, pruned_loss=0.04218, over 24491.00 frames. ], tot_loss[loss=0.1783, simple_loss=0.2541, pruned_loss=0.05123, over 3388461.85 frames. ], batch size: 66, lr: 4.95e-03, grad_scale: 8.0 2023-09-30 12:20:20,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-30 12:20:20,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:20:20,512 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-30 12:20:23,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:20:23,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:20:25,084 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:20:26,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:20:29,046 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=709946.6666666666, ans=0.125 2023-09-30 12:20:29,465 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.47 vs. limit=15.0 2023-09-30 12:20:30,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:20:30,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:20:30,512 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=709946.6666666666, ans=0.1 2023-09-30 12:20:32,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:20:36,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:20:39,031 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=710013.3333333334, ans=0.125 2023-09-30 12:20:46,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:20:47,359 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.89 vs. limit=10.0 2023-09-30 12:20:48,151 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:20:49,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:20:50,400 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=710013.3333333334, ans=0.125 2023-09-30 12:20:56,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-30 12:20:56,616 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=710080.0, ans=0.125 2023-09-30 12:20:57,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-30 12:20:57,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:20:57,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:20:59,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 12:20:59,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:21:00,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:21:02,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:21:05,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-30 12:21:06,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:21:08,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:21:09,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-30 12:21:09,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:21:09,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 12:21:12,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 12:21:12,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:21:14,139 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:21:15,661 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:21:15,904 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=710146.6666666666, ans=0.125 2023-09-30 12:21:17,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:21:21,587 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:21:24,447 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.96 vs. limit=12.0 2023-09-30 12:21:26,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:21:29,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:21:36,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:21:38,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:21:40,704 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-30 12:21:42,808 INFO [train.py:1039] (2/4) Epoch 21, batch 300, loss[loss=0.1935, simple_loss=0.2514, pruned_loss=0.06782, over 23757.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.2526, pruned_loss=0.05037, over 3691964.59 frames. ], batch size: 164, lr: 4.95e-03, grad_scale: 8.0 2023-09-30 12:21:43,054 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:21:43,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 12:21:44,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-30 12:21:44,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-30 12:21:46,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:21:46,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-30 12:21:52,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:21:53,685 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:21:56,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:21:58,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-30 12:21:58,504 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:22:00,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 12:22:00,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-30 12:22:00,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:22:05,040 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.476e+02 1.836e+02 2.048e+02 2.213e+02 3.686e+02, threshold=4.095e+02, percent-clipped=0.0 2023-09-30 12:22:05,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-30 12:22:08,378 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 12:22:08,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-30 12:22:14,215 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-30 12:22:14,286 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:22:17,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:22:18,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:22:18,844 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-30 12:22:18,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:22:21,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:22:21,925 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=710413.3333333334, ans=0.0 2023-09-30 12:22:24,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:22:24,848 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:22:29,537 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-30 12:22:29,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-30 12:22:31,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:22:32,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:22:34,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-30 12:22:34,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:22:35,053 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=710480.0, ans=0.125 2023-09-30 12:22:38,120 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:22:41,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:22:41,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-30 12:22:46,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:22:46,739 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 12:22:50,537 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:22:52,072 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-30 12:22:53,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-30 12:22:53,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 12:22:54,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:22:55,320 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=710546.6666666666, ans=0.2 2023-09-30 12:22:56,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-30 12:22:56,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:22:56,611 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:22:58,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:22:59,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:22:59,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:23:04,736 INFO [train.py:1039] (2/4) Epoch 21, batch 350, loss[loss=0.1627, simple_loss=0.2414, pruned_loss=0.04204, over 24452.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2507, pruned_loss=0.04991, over 3915555.00 frames. ], batch size: 58, lr: 4.95e-03, grad_scale: 8.0 2023-09-30 12:23:04,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:23:04,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 12:23:09,332 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:23:11,160 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=710613.3333333334, ans=0.125 2023-09-30 12:23:14,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:23:17,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:23:17,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:23:21,296 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-30 12:23:21,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:23:23,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-30 12:23:26,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:23:26,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-30 12:23:26,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:23:31,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-30 12:23:32,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:23:32,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:23:34,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:23:37,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:23:37,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:23:38,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:23:38,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:23:38,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:23:41,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:23:41,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:23:47,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:23:47,354 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:23:48,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:23:50,181 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:23:56,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-30 12:23:56,296 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:24:01,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:24:01,528 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:24:01,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:24:03,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-30 12:24:04,879 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=710813.3333333334, ans=0.1 2023-09-30 12:24:06,376 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=710813.3333333334, ans=0.0 2023-09-30 12:24:07,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:24:08,914 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-30 12:24:10,452 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-30 12:24:10,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:24:15,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:24:15,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-30 12:24:17,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:24:18,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:24:22,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:24:22,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:24:22,221 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:24:25,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:24:26,556 INFO [train.py:1039] (2/4) Epoch 21, batch 400, loss[loss=0.1761, simple_loss=0.2496, pruned_loss=0.05134, over 23834.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.25, pruned_loss=0.04971, over 4077086.64 frames. ], batch size: 195, lr: 4.95e-03, grad_scale: 16.0 2023-09-30 12:24:28,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:24:30,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-30 12:24:30,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-30 12:24:32,600 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:24:32,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:24:34,928 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=710946.6666666666, ans=0.1 2023-09-30 12:24:36,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:24:36,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:24:39,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:24:40,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:24:42,340 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-30 12:24:42,758 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=711013.3333333334, ans=10.0 2023-09-30 12:24:43,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-30 12:24:43,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:24:44,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-30 12:24:44,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:24:49,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:24:49,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:24:49,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-30 12:24:49,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:24:49,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:24:50,707 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.787e+02 1.990e+02 2.388e+02 3.650e+02, threshold=3.979e+02, percent-clipped=0.0 2023-09-30 12:24:50,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:24:50,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:24:52,517 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-30 12:24:52,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-30 12:24:57,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:24:59,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:24:59,565 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=711080.0, ans=0.1 2023-09-30 12:25:00,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-30 12:25:00,863 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-30 12:25:04,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:25:09,446 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:25:15,600 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-30 12:25:18,872 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-30 12:25:21,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-30 12:25:22,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:25:25,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:25:25,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-30 12:25:30,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:25:31,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 12:25:33,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:25:37,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:25:37,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-30 12:25:37,581 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=711213.3333333334, ans=0.0 2023-09-30 12:25:41,007 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-30 12:25:41,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-30 12:25:44,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 12:25:44,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:25:46,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-30 12:25:47,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 12:25:49,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:25:49,239 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-30 12:25:50,096 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.69 vs. limit=15.0 2023-09-30 12:25:50,701 INFO [train.py:1039] (2/4) Epoch 21, batch 450, loss[loss=0.1928, simple_loss=0.2627, pruned_loss=0.06148, over 23469.00 frames. ], tot_loss[loss=0.1758, simple_loss=0.2512, pruned_loss=0.05023, over 4207185.54 frames. ], batch size: 285, lr: 4.95e-03, grad_scale: 16.0 2023-09-30 12:25:50,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-30 12:25:50,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:25:51,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:25:52,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:25:52,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-30 12:25:54,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:25:56,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 12:25:57,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:26:05,480 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=711346.6666666666, ans=0.125 2023-09-30 12:26:08,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:26:10,424 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:26:12,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-30 12:26:12,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-30 12:26:12,466 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=711346.6666666666, ans=0.125 2023-09-30 12:26:15,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-30 12:26:17,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:26:19,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:26:22,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:26:24,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:26:28,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-30 12:26:28,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-30 12:26:29,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-30 12:26:29,820 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:26:31,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:26:32,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:26:34,442 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-30 12:26:34,456 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-30 12:26:34,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:26:36,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:26:36,426 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=711413.3333333334, ans=0.04949747468305833 2023-09-30 12:26:37,640 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-30 12:26:43,002 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-30 12:26:44,408 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:26:44,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-30 12:26:44,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-30 12:26:49,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:26:51,236 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-30 12:26:51,307 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 12:26:54,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-30 12:26:54,513 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=711480.0, ans=0.0 2023-09-30 12:26:58,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:26:58,274 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=711546.6666666666, ans=0.2 2023-09-30 12:26:59,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-30 12:27:00,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-30 12:27:01,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:27:07,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:27:09,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:27:09,546 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=711546.6666666666, ans=0.125 2023-09-30 12:27:10,841 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:27:10,903 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-30 12:27:13,755 INFO [train.py:1039] (2/4) Epoch 21, batch 500, loss[loss=0.1791, simple_loss=0.2664, pruned_loss=0.04586, over 24419.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.2521, pruned_loss=0.05068, over 4310944.99 frames. ], batch size: 69, lr: 4.95e-03, grad_scale: 16.0 2023-09-30 12:27:15,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:27:17,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:27:17,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:27:17,256 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-30 12:27:19,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-30 12:27:19,329 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:27:21,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 12:27:25,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 12:27:27,994 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-30 12:27:30,962 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:27:30,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:27:32,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:27:33,154 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=711680.0, ans=0.125 2023-09-30 12:27:37,394 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.852e+02 2.048e+02 2.259e+02 3.327e+02, threshold=4.095e+02, percent-clipped=0.0 2023-09-30 12:27:42,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:27:42,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-30 12:27:42,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-30 12:27:42,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:27:44,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-30 12:27:44,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 12:27:46,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:27:47,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:27:48,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:27:48,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:27:49,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-30 12:27:52,589 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-30 12:27:56,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:27:56,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:27:57,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:27:59,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:27:59,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-30 12:28:02,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-30 12:28:05,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:28:07,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:28:07,842 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:28:10,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:28:14,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:28:19,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:28:20,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-30 12:28:20,732 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:28:20,765 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:28:23,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-30 12:28:25,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-30 12:28:27,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:28:31,348 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.05 vs. limit=12.0 2023-09-30 12:28:33,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-30 12:28:35,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-30 12:28:35,108 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:28:35,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-30 12:28:35,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:28:35,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:28:37,325 INFO [train.py:1039] (2/4) Epoch 21, batch 550, loss[loss=0.1593, simple_loss=0.2402, pruned_loss=0.03926, over 24291.00 frames. ], tot_loss[loss=0.1775, simple_loss=0.2531, pruned_loss=0.05097, over 4404169.84 frames. ], batch size: 61, lr: 4.95e-03, grad_scale: 16.0 2023-09-30 12:28:37,424 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:28:37,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:28:37,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:28:37,742 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=711946.6666666666, ans=0.125 2023-09-30 12:28:39,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:28:42,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:28:43,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-30 12:28:43,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:28:48,734 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:28:48,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:28:52,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:28:55,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:28:56,974 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=712013.3333333334, ans=0.125 2023-09-30 12:28:59,717 WARNING [train.py:1197] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-30 12:29:00,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-30 12:29:02,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:29:03,844 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=712013.3333333334, ans=0.5 2023-09-30 12:29:08,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:29:08,242 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:29:10,014 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=712080.0, ans=0.125 2023-09-30 12:29:11,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-30 12:29:15,067 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:29:15,076 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-30 12:29:15,206 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:29:16,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 12:29:19,680 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:29:19,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 12:29:21,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-30 12:29:21,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:29:23,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-30 12:29:26,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-30 12:29:27,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:29:27,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:29:29,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:29:29,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:29:33,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:29:35,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:29:36,441 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=712146.6666666666, ans=0.0 2023-09-30 12:29:37,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:29:37,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:29:39,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 12:29:41,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 12:29:42,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:29:44,327 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:29:45,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:29:46,067 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=712213.3333333334, ans=0.0 2023-09-30 12:29:47,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-30 12:29:47,366 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-30 12:29:52,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-30 12:29:54,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-30 12:29:54,706 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=712213.3333333334, ans=0.0 2023-09-30 12:29:58,033 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:29:58,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 12:29:58,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:29:59,455 INFO [train.py:1039] (2/4) Epoch 21, batch 600, loss[loss=0.161, simple_loss=0.2248, pruned_loss=0.04862, over 22821.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.2526, pruned_loss=0.05029, over 4481523.76 frames. ], batch size: 322, lr: 4.94e-03, grad_scale: 16.0 2023-09-30 12:30:05,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:30:07,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 12:30:09,114 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-30 12:30:11,236 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-30 12:30:12,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:30:14,400 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:30:17,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-30 12:30:17,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:30:21,803 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.803e+02 1.996e+02 2.235e+02 3.480e+02, threshold=3.991e+02, percent-clipped=0.0 2023-09-30 12:30:24,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-30 12:30:27,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:30:27,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:30:27,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:30:34,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:30:34,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:30:34,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:30:40,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:30:44,455 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:30:44,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:30:44,474 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:30:53,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-30 12:31:00,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-30 12:31:00,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:31:04,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-30 12:31:05,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:31:08,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-30 12:31:08,572 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:31:08,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 12:31:15,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 12:31:15,705 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-30 12:31:17,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:31:19,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:31:21,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:31:22,618 INFO [train.py:1039] (2/4) Epoch 21, batch 650, loss[loss=0.1898, simple_loss=0.2698, pruned_loss=0.05489, over 23863.00 frames. ], tot_loss[loss=0.1758, simple_loss=0.2512, pruned_loss=0.05021, over 4524739.85 frames. ], batch size: 86, lr: 4.94e-03, grad_scale: 16.0 2023-09-30 12:31:24,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-30 12:31:25,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:31:31,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:31:31,175 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:31:36,362 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:31:40,896 WARNING [train.py:1197] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-30 12:31:41,117 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=712680.0, ans=0.125 2023-09-30 12:31:43,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:31:43,942 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:31:47,257 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=712680.0, ans=0.0 2023-09-30 12:31:49,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:31:50,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 12:31:54,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:31:54,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:31:56,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 12:31:57,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:31:59,272 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:32:01,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:32:01,065 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-30 12:32:01,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:32:01,125 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:32:04,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:32:05,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:32:07,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:32:07,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-30 12:32:08,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-30 12:32:08,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:32:08,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:32:11,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-30 12:32:11,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:32:14,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 12:32:14,226 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-30 12:32:15,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-30 12:32:15,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:32:15,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:32:15,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:32:15,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:32:17,772 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=712813.3333333334, ans=0.0 2023-09-30 12:32:18,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:32:20,510 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=712813.3333333334, ans=0.0 2023-09-30 12:32:27,349 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:32:27,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:32:28,914 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:32:31,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:32:31,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 12:32:33,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:32:38,392 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=712880.0, ans=0.05 2023-09-30 12:32:41,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 12:32:41,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:32:42,656 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:32:42,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:32:45,051 INFO [train.py:1039] (2/4) Epoch 21, batch 700, loss[loss=0.1609, simple_loss=0.2399, pruned_loss=0.04094, over 23093.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2506, pruned_loss=0.04973, over 4584467.80 frames. ], batch size: 50, lr: 4.94e-03, grad_scale: 16.0 2023-09-30 12:32:45,581 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=712946.6666666666, ans=0.0 2023-09-30 12:32:45,717 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=712946.6666666666, ans=0.125 2023-09-30 12:32:46,923 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-30 12:32:48,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-30 12:32:51,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-30 12:32:51,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:32:52,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:32:54,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-30 12:32:57,982 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=712946.6666666666, ans=0.1 2023-09-30 12:32:59,849 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:33:02,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:33:04,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:33:06,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-30 12:33:07,182 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 1.845e+02 2.008e+02 2.196e+02 3.321e+02, threshold=4.016e+02, percent-clipped=0.0 2023-09-30 12:33:07,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:33:10,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:33:12,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 12:33:12,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:33:13,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-30 12:33:17,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-30 12:33:21,820 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-30 12:33:21,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:33:22,243 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=713080.0, ans=0.125 2023-09-30 12:33:23,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:33:26,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:33:26,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-30 12:33:31,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:33:31,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:33:34,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-30 12:33:35,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:33:37,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:33:38,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:33:43,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:33:43,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-30 12:33:49,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-30 12:33:49,389 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-30 12:33:52,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:33:54,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:33:55,554 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:33:58,629 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:33:58,640 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-30 12:34:02,493 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=713213.3333333334, ans=0.2 2023-09-30 12:34:03,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-30 12:34:03,649 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-30 12:34:03,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-30 12:34:05,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-30 12:34:06,590 INFO [train.py:1039] (2/4) Epoch 21, batch 750, loss[loss=0.1775, simple_loss=0.2619, pruned_loss=0.04653, over 24326.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2505, pruned_loss=0.04993, over 4601867.95 frames. ], batch size: 77, lr: 4.94e-03, grad_scale: 16.0 2023-09-30 12:34:06,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-30 12:34:06,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:34:08,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-30 12:34:10,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:34:11,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-30 12:34:12,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:34:15,012 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:34:16,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-30 12:34:16,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:34:19,828 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:34:19,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 12:34:21,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:34:21,977 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=713346.6666666666, ans=0.2 2023-09-30 12:34:26,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:34:26,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:34:27,656 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-30 12:34:29,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:34:29,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:34:30,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:34:33,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-30 12:34:35,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-30 12:34:35,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:34:39,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-30 12:34:39,177 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-30 12:34:40,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-30 12:34:40,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:34:40,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 12:34:43,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:34:51,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-30 12:34:51,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:34:51,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 12:34:52,427 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=713413.3333333334, ans=0.0 2023-09-30 12:34:53,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:34:53,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:34:53,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-30 12:34:55,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:34:56,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-30 12:34:57,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:35:02,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:35:04,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-30 12:35:05,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:35:10,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:35:12,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:35:12,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:35:14,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:35:18,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-30 12:35:18,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:35:18,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:35:23,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:35:23,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:35:25,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:35:25,684 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-30 12:35:28,085 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.06 vs. limit=15.0 2023-09-30 12:35:30,166 INFO [train.py:1039] (2/4) Epoch 21, batch 800, loss[loss=0.1608, simple_loss=0.2305, pruned_loss=0.0455, over 23739.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2512, pruned_loss=0.04987, over 4618035.71 frames. ], batch size: 135, lr: 4.94e-03, grad_scale: 32.0 2023-09-30 12:35:36,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:35:36,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:35:40,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:35:40,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:35:40,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:35:41,623 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:35:43,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:35:48,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:35:49,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 12:35:52,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-30 12:35:52,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:35:53,661 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.427e+02 1.844e+02 2.013e+02 2.212e+02 3.409e+02, threshold=4.025e+02, percent-clipped=0.0 2023-09-30 12:35:53,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:35:53,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:35:53,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:35:55,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-30 12:35:55,446 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:35:55,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-30 12:36:00,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:36:02,103 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:36:05,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:36:05,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:36:06,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:36:06,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:36:13,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:36:13,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 12:36:13,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-30 12:36:15,747 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-30 12:36:15,794 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-30 12:36:15,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 12:36:15,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:36:18,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:36:18,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:36:24,024 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-30 12:36:25,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-30 12:36:26,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:36:27,276 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=713813.3333333334, ans=0.125 2023-09-30 12:36:28,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:36:33,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:36:36,838 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:36:38,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-30 12:36:38,403 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:36:42,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-30 12:36:47,761 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=713880.0, ans=0.125 2023-09-30 12:36:51,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 12:36:53,263 INFO [train.py:1039] (2/4) Epoch 21, batch 850, loss[loss=0.2336, simple_loss=0.2949, pruned_loss=0.08611, over 19361.00 frames. ], tot_loss[loss=0.1761, simple_loss=0.2521, pruned_loss=0.05004, over 4648536.46 frames. ], batch size: 388, lr: 4.94e-03, grad_scale: 32.0 2023-09-30 12:36:53,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:36:53,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-30 12:36:54,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:36:54,966 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:36:57,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-30 12:36:57,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:36:58,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:36:58,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:37:00,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 12:37:01,899 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:37:03,564 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=713946.6666666666, ans=0.125 2023-09-30 12:37:04,783 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-30 12:37:04,851 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-30 12:37:04,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-30 12:37:06,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 12:37:06,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:37:08,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:37:09,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:37:09,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:37:12,285 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=714013.3333333334, ans=0.1 2023-09-30 12:37:15,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:37:15,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:37:15,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-30 12:37:19,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-30 12:37:23,443 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:37:24,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-30 12:37:28,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-30 12:37:29,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-30 12:37:33,300 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-30 12:37:33,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:37:33,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:37:33,339 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 12:37:36,409 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:37:37,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:37:37,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-30 12:37:39,768 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=714080.0, ans=0.125 2023-09-30 12:37:40,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:37:42,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:37:42,655 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:37:44,181 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-30 12:37:44,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:37:46,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-30 12:37:46,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-30 12:37:48,500 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=714146.6666666666, ans=0.1 2023-09-30 12:37:51,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:37:51,404 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:37:52,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:37:52,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:37:52,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:37:53,420 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=714146.6666666666, ans=0.125 2023-09-30 12:37:54,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:37:54,907 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=714146.6666666666, ans=0.05 2023-09-30 12:37:56,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:37:58,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-30 12:37:59,412 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.72 vs. limit=15.0 2023-09-30 12:37:59,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:38:01,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-30 12:38:09,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-30 12:38:09,735 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=714213.3333333334, ans=0.2 2023-09-30 12:38:11,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:38:11,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-30 12:38:12,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:38:12,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:38:14,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-30 12:38:15,645 INFO [train.py:1039] (2/4) Epoch 21, batch 900, loss[loss=0.1643, simple_loss=0.2385, pruned_loss=0.04501, over 24453.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2518, pruned_loss=0.04971, over 4667208.31 frames. ], batch size: 58, lr: 4.94e-03, grad_scale: 32.0 2023-09-30 12:38:19,802 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:38:22,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:38:24,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-30 12:38:24,584 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=714280.0, ans=0.125 2023-09-30 12:38:29,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 12:38:29,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-30 12:38:31,314 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-30 12:38:32,079 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.41 vs. limit=15.0 2023-09-30 12:38:32,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:38:32,846 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:38:32,928 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 12:38:32,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:38:39,494 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.478e+02 1.822e+02 2.031e+02 2.211e+02 2.952e+02, threshold=4.063e+02, percent-clipped=0.0 2023-09-30 12:38:42,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:38:42,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:38:42,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 12:38:44,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:38:51,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-30 12:38:53,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:39:00,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-30 12:39:00,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-30 12:39:01,945 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-30 12:39:02,070 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-30 12:39:02,276 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=714413.3333333334, ans=0.125 2023-09-30 12:39:09,304 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-30 12:39:09,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:39:09,677 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=714480.0, ans=0.1 2023-09-30 12:39:10,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 12:39:19,055 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:39:19,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:39:19,370 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=714480.0, ans=0.0 2023-09-30 12:39:21,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-30 12:39:21,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:39:24,284 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.81 vs. limit=15.0 2023-09-30 12:39:25,097 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-30 12:39:26,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-30 12:39:26,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:39:28,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:39:29,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:39:35,122 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-30 12:39:35,183 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-30 12:39:36,777 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-30 12:39:36,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-30 12:39:38,195 INFO [train.py:1039] (2/4) Epoch 21, batch 950, loss[loss=0.218, simple_loss=0.2779, pruned_loss=0.07909, over 19719.00 frames. ], tot_loss[loss=0.176, simple_loss=0.2518, pruned_loss=0.05011, over 4674606.46 frames. ], batch size: 388, lr: 4.94e-03, grad_scale: 32.0 2023-09-30 12:39:39,833 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:39:45,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-30 12:39:48,264 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.78 vs. limit=10.0 2023-09-30 12:39:48,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:39:52,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:39:52,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:39:54,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 12:39:57,188 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-30 12:40:01,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:40:01,954 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:40:02,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:40:02,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:40:03,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-30 12:40:03,679 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-30 12:40:05,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:40:06,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-30 12:40:07,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:40:11,089 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=714746.6666666666, ans=0.125 2023-09-30 12:40:12,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:40:13,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:40:13,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:40:15,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-30 12:40:17,593 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 12:40:19,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:40:21,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 12:40:23,857 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=714746.6666666666, ans=0.0 2023-09-30 12:40:26,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:40:26,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:40:29,720 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-30 12:40:31,385 WARNING [train.py:1197] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 12:40:31,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 12:40:33,172 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=714813.3333333334, ans=0.0 2023-09-30 12:40:34,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:40:34,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:40:34,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:40:37,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-30 12:40:38,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:40:40,643 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:40:42,702 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:40:42,730 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-30 12:40:42,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:40:42,757 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:40:42,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-30 12:40:43,227 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:40:47,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:40:51,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:40:54,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:40:56,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-30 12:40:56,289 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-30 12:40:59,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:41:02,680 INFO [train.py:1039] (2/4) Epoch 21, batch 1000, loss[loss=0.1578, simple_loss=0.2212, pruned_loss=0.04718, over 23645.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2507, pruned_loss=0.05003, over 4678009.35 frames. ], batch size: 232, lr: 4.93e-03, grad_scale: 16.0 2023-09-30 12:41:02,908 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-30 12:41:02,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:41:08,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:41:10,879 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-30 12:41:10,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-30 12:41:15,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:41:15,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:41:18,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:41:18,715 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=715013.3333333334, ans=0.1 2023-09-30 12:41:19,303 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.84 vs. limit=22.5 2023-09-30 12:41:21,959 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-30 12:41:25,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-30 12:41:26,124 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=715013.3333333334, ans=0.125 2023-09-30 12:41:27,770 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.831e+02 2.095e+02 2.362e+02 3.753e+02, threshold=4.190e+02, percent-clipped=0.0 2023-09-30 12:41:27,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-30 12:41:29,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:41:30,938 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-30 12:41:33,889 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-30 12:41:33,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-30 12:41:34,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:41:35,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:41:44,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:41:45,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:41:46,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:41:48,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:41:48,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-30 12:41:48,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:41:50,053 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:41:51,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:41:51,659 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-30 12:41:54,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-30 12:41:56,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-30 12:41:59,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-30 12:42:02,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:42:08,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:42:08,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:42:10,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:42:10,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:42:11,094 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.65 vs. limit=12.0 2023-09-30 12:42:12,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-30 12:42:13,515 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:42:13,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-30 12:42:13,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-30 12:42:15,249 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:42:15,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:42:18,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:42:21,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 12:42:23,379 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:42:25,002 INFO [train.py:1039] (2/4) Epoch 21, batch 1050, loss[loss=0.1845, simple_loss=0.2512, pruned_loss=0.05887, over 23719.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2489, pruned_loss=0.04972, over 4674405.11 frames. ], batch size: 179, lr: 4.93e-03, grad_scale: 16.0 2023-09-30 12:42:25,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:42:26,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 12:42:28,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 12:42:29,688 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:42:31,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:42:35,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 12:42:36,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-30 12:42:39,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:42:40,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-30 12:42:40,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-30 12:42:41,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:42:43,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-30 12:42:43,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:42:43,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-30 12:42:44,928 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:42:44,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-30 12:42:46,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-30 12:42:52,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:42:52,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-30 12:42:52,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:42:57,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-30 12:42:57,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-30 12:42:58,040 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=715413.3333333334, ans=0.2 2023-09-30 12:42:59,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:43:00,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-30 12:43:05,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-30 12:43:06,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:43:10,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 12:43:11,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-30 12:43:13,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:43:13,259 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:43:16,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:43:19,704 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-30 12:43:21,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-30 12:43:21,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-30 12:43:22,050 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.88 vs. limit=15.0 2023-09-30 12:43:22,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:43:22,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:43:24,334 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-30 12:43:29,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:43:32,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:43:32,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:43:32,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:43:32,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:43:37,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:43:37,785 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-30 12:43:40,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:43:40,060 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-30 12:43:41,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-30 12:43:42,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:43:45,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:43:47,960 INFO [train.py:1039] (2/4) Epoch 21, batch 1100, loss[loss=0.1646, simple_loss=0.2551, pruned_loss=0.037, over 24491.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2486, pruned_loss=0.04936, over 4664524.51 frames. ], batch size: 69, lr: 4.93e-03, grad_scale: 8.0 2023-09-30 12:43:51,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:43:57,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:43:58,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 12:43:58,757 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:44:00,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-30 12:44:01,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:44:05,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:44:07,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:44:07,780 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.13 vs. limit=15.0 2023-09-30 12:44:10,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:44:10,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-30 12:44:11,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 12:44:13,692 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.517e+02 1.860e+02 2.075e+02 2.430e+02 4.755e+02, threshold=4.150e+02, percent-clipped=2.0 2023-09-30 12:44:13,897 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:44:13,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:44:17,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:44:19,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:44:24,362 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:44:28,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-30 12:44:28,966 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-30 12:44:30,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:44:33,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:44:33,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-30 12:44:33,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=715746.6666666666, ans=0.0 2023-09-30 12:44:33,964 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:44:35,279 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:44:38,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-30 12:44:38,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:44:38,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:44:38,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:44:38,520 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:44:40,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-30 12:44:46,832 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:44:46,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-30 12:44:50,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:44:53,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 12:44:56,600 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-30 12:44:56,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-30 12:44:57,070 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=715880.0, ans=0.1 2023-09-30 12:44:58,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:45:00,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:45:00,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:45:00,675 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=715880.0, ans=0.0 2023-09-30 12:45:03,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-30 12:45:03,632 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=715880.0, ans=0.0 2023-09-30 12:45:04,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:45:04,901 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:45:05,158 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=715880.0, ans=0.125 2023-09-30 12:45:06,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-30 12:45:06,406 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:45:07,327 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.37 vs. limit=15.0 2023-09-30 12:45:07,819 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-30 12:45:07,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:45:08,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:45:09,407 INFO [train.py:1039] (2/4) Epoch 21, batch 1150, loss[loss=0.1707, simple_loss=0.2447, pruned_loss=0.04836, over 23362.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2494, pruned_loss=0.04945, over 4685639.95 frames. ], batch size: 134, lr: 4.93e-03, grad_scale: 8.0 2023-09-30 12:45:09,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-30 12:45:16,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:45:18,114 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=715946.6666666666, ans=0.125 2023-09-30 12:45:19,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:45:20,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:45:20,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:45:21,009 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-30 12:45:22,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:45:25,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-30 12:45:26,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:45:26,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 12:45:28,596 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=716013.3333333334, ans=0.125 2023-09-30 12:45:31,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-30 12:45:33,829 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:45:38,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:45:38,413 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:45:38,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-30 12:45:38,499 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-30 12:45:38,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:45:44,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-30 12:45:45,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:45:48,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:45:56,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:46:01,512 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:46:01,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-30 12:46:03,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:46:03,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:46:11,661 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=716146.6666666666, ans=0.04949747468305833 2023-09-30 12:46:12,679 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-30 12:46:14,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:46:22,551 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-30 12:46:24,780 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.37 vs. limit=12.0 2023-09-30 12:46:25,725 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:46:27,256 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-30 12:46:27,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-30 12:46:27,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:46:32,262 INFO [train.py:1039] (2/4) Epoch 21, batch 1200, loss[loss=0.1468, simple_loss=0.2225, pruned_loss=0.03556, over 24462.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2495, pruned_loss=0.04946, over 4692016.43 frames. ], batch size: 58, lr: 4.93e-03, grad_scale: 16.0 2023-09-30 12:46:32,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:46:34,851 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=716280.0, ans=0.125 2023-09-30 12:46:37,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-30 12:46:37,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-30 12:46:40,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:46:40,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:46:40,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:46:42,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:46:44,004 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 12:46:46,477 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.86 vs. limit=6.0 2023-09-30 12:46:47,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:46:47,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:46:50,443 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-30 12:46:52,156 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-30 12:46:57,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:46:57,562 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=716346.6666666666, ans=0.2 2023-09-30 12:46:58,568 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.836e+02 2.061e+02 2.415e+02 4.765e+02, threshold=4.121e+02, percent-clipped=1.0 2023-09-30 12:47:00,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:47:01,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:47:03,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:47:03,330 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-30 12:47:04,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:47:12,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-30 12:47:12,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:47:13,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-30 12:47:13,781 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=716413.3333333334, ans=0.2 2023-09-30 12:47:14,953 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:47:19,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-30 12:47:24,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-30 12:47:24,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:47:25,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:47:27,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:47:28,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-30 12:47:30,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:47:30,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:47:32,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:47:32,707 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-30 12:47:34,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 12:47:34,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:47:34,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 12:47:37,423 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:47:37,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:47:41,256 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-30 12:47:42,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:47:46,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-30 12:47:49,760 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-30 12:47:52,623 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:47:53,992 INFO [train.py:1039] (2/4) Epoch 21, batch 1250, loss[loss=0.1723, simple_loss=0.2507, pruned_loss=0.04698, over 24645.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2511, pruned_loss=0.04997, over 4704511.37 frames. ], batch size: 65, lr: 4.93e-03, grad_scale: 16.0 2023-09-30 12:47:54,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:47:57,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:47:59,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:48:03,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-30 12:48:05,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:48:07,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:48:07,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-30 12:48:10,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:48:12,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 12:48:15,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 12:48:17,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:48:17,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 12:48:17,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:48:21,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-30 12:48:21,327 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=716680.0, ans=0.0 2023-09-30 12:48:25,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 12:48:25,577 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-30 12:48:25,585 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:48:27,188 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:48:28,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:48:31,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:48:33,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-30 12:48:38,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-30 12:48:39,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-30 12:48:42,362 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=716813.3333333334, ans=0.125 2023-09-30 12:48:43,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:48:44,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-30 12:48:45,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:48:45,080 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-30 12:48:45,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:48:45,116 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:48:50,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:48:52,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:48:52,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:48:53,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-30 12:48:53,755 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-30 12:48:55,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-30 12:48:58,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:49:00,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-30 12:49:00,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:49:03,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-30 12:49:03,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:49:05,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-30 12:49:05,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-30 12:49:06,765 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:49:06,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-30 12:49:06,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:49:07,102 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=716880.0, ans=0.125 2023-09-30 12:49:10,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-30 12:49:12,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:49:12,855 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=716880.0, ans=0.1 2023-09-30 12:49:14,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:49:14,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:49:17,277 INFO [train.py:1039] (2/4) Epoch 21, batch 1300, loss[loss=0.189, simple_loss=0.2728, pruned_loss=0.05261, over 23730.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2513, pruned_loss=0.05006, over 4704562.97 frames. ], batch size: 85, lr: 4.93e-03, grad_scale: 16.0 2023-09-30 12:49:17,389 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-30 12:49:20,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:49:21,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-30 12:49:22,202 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=716946.6666666666, ans=0.125 2023-09-30 12:49:27,108 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:49:28,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-30 12:49:30,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:49:31,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:49:31,795 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:49:33,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-30 12:49:37,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:49:38,284 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=717013.3333333334, ans=0.2 2023-09-30 12:49:39,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-30 12:49:39,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-30 12:49:43,774 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.630e+02 1.834e+02 2.043e+02 2.344e+02 3.785e+02, threshold=4.086e+02, percent-clipped=0.0 2023-09-30 12:49:45,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 12:49:50,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:49:50,373 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:49:51,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:49:53,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:49:54,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 12:49:56,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-30 12:49:56,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-30 12:50:02,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:50:02,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 12:50:05,168 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-30 12:50:05,259 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 12:50:06,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:50:08,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:50:08,719 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:50:09,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-30 12:50:10,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:50:11,492 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-30 12:50:13,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:50:16,810 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:50:16,814 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:50:22,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-30 12:50:22,155 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-30 12:50:25,145 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-30 12:50:27,110 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=717213.3333333334, ans=0.125 2023-09-30 12:50:28,405 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:50:31,439 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-30 12:50:33,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:50:35,559 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=717213.3333333334, ans=0.0 2023-09-30 12:50:39,720 INFO [train.py:1039] (2/4) Epoch 21, batch 1350, loss[loss=0.1846, simple_loss=0.2647, pruned_loss=0.05219, over 23692.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2508, pruned_loss=0.04976, over 4711780.52 frames. ], batch size: 85, lr: 4.93e-03, grad_scale: 16.0 2023-09-30 12:50:39,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-30 12:50:41,637 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=717280.0, ans=0.125 2023-09-30 12:50:42,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:50:44,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:50:46,904 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:50:48,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:50:50,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:50:51,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:50:55,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:50:56,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-30 12:50:58,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-30 12:50:59,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:51:02,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-30 12:51:04,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:51:05,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:51:05,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-30 12:51:06,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-30 12:51:09,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-30 12:51:12,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:51:12,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-30 12:51:24,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:51:27,991 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=717480.0, ans=0.0 2023-09-30 12:51:33,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:51:35,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:51:35,090 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-30 12:51:38,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:51:40,580 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=717480.0, ans=0.125 2023-09-30 12:51:41,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-30 12:51:41,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-30 12:51:41,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:51:45,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:51:47,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-30 12:51:48,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:51:54,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-30 12:51:55,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-30 12:51:56,137 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=717546.6666666666, ans=0.1 2023-09-30 12:52:01,860 INFO [train.py:1039] (2/4) Epoch 21, batch 1400, loss[loss=0.1789, simple_loss=0.2584, pruned_loss=0.04964, over 23704.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2487, pruned_loss=0.04939, over 4706858.64 frames. ], batch size: 85, lr: 4.93e-03, grad_scale: 8.0 2023-09-30 12:52:02,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-30 12:52:04,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:52:07,142 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:52:08,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:52:12,137 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-30 12:52:13,697 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-30 12:52:23,740 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=717680.0, ans=0.025 2023-09-30 12:52:24,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:52:28,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:52:30,111 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.457e+02 1.890e+02 2.143e+02 2.435e+02 3.256e+02, threshold=4.286e+02, percent-clipped=0.0 2023-09-30 12:52:30,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:52:30,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-30 12:52:32,273 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:52:35,082 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:52:36,584 WARNING [train.py:1197] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 12:52:45,930 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=717746.6666666666, ans=10.0 2023-09-30 12:52:48,765 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:52:48,851 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:52:55,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-30 12:52:55,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:52:55,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:52:57,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:52:57,156 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:52:58,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:52:58,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:52:58,723 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:53:01,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-30 12:53:01,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:53:06,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:53:10,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:53:17,321 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-30 12:53:17,504 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=717880.0, ans=0.125 2023-09-30 12:53:18,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 12:53:19,003 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=717880.0, ans=0.125 2023-09-30 12:53:20,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:53:21,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 12:53:23,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:53:25,611 INFO [train.py:1039] (2/4) Epoch 21, batch 1450, loss[loss=0.1777, simple_loss=0.2479, pruned_loss=0.05376, over 23686.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.2484, pruned_loss=0.04909, over 4714065.34 frames. ], batch size: 232, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 12:53:25,751 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:53:28,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-30 12:53:31,932 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:53:31,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:53:31,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-30 12:53:37,029 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=717946.6666666666, ans=10.0 2023-09-30 12:53:38,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:53:39,726 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 12:53:41,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:53:41,357 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-30 12:53:42,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 12:53:44,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-30 12:53:46,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:53:46,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:53:46,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-30 12:53:48,058 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:53:49,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-30 12:53:49,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 12:53:49,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:53:51,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:53:53,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:53:56,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:54:00,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:54:00,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:54:03,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:54:03,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:54:04,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:54:04,878 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:54:04,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:54:06,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:54:10,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-30 12:54:12,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:54:17,845 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-30 12:54:19,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:54:20,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-30 12:54:22,370 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:54:22,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-30 12:54:23,283 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.13 vs. limit=15.0 2023-09-30 12:54:28,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:54:28,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-30 12:54:31,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-30 12:54:31,496 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:54:34,538 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=718213.3333333334, ans=0.125 2023-09-30 12:54:35,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:54:35,951 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:54:38,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-30 12:54:41,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-30 12:54:41,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-30 12:54:42,731 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:54:44,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 12:54:44,381 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=718213.3333333334, ans=0.1 2023-09-30 12:54:47,160 INFO [train.py:1039] (2/4) Epoch 21, batch 1500, loss[loss=0.1719, simple_loss=0.2634, pruned_loss=0.04014, over 24294.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2497, pruned_loss=0.04931, over 4724305.37 frames. ], batch size: 74, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 12:54:55,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-30 12:54:55,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:54:55,858 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:54:57,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:54:57,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:54:57,870 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=718280.0, ans=0.125 2023-09-30 12:54:59,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:54:59,183 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-30 12:55:01,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 12:55:01,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-30 12:55:02,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:55:03,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:55:05,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:55:06,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:55:13,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:55:13,335 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-30 12:55:13,627 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=718346.6666666666, ans=0.0 2023-09-30 12:55:14,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-30 12:55:16,112 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.411e+02 1.806e+02 1.957e+02 2.271e+02 3.905e+02, threshold=3.913e+02, percent-clipped=0.0 2023-09-30 12:55:16,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:55:17,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:55:20,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-30 12:55:24,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-30 12:55:26,188 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:55:26,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-30 12:55:29,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-30 12:55:30,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:55:32,327 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:55:32,349 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:55:32,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-30 12:55:32,789 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=718413.3333333334, ans=0.1 2023-09-30 12:55:34,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:55:34,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:55:36,106 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-30 12:55:36,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:55:36,549 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=718480.0, ans=0.125 2023-09-30 12:55:41,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:55:41,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-30 12:55:48,224 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:55:49,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 12:55:52,942 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-30 12:55:53,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:55:53,020 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-30 12:55:53,503 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=718546.6666666666, ans=0.125 2023-09-30 12:55:54,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:55:56,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:55:56,414 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=718546.6666666666, ans=0.2 2023-09-30 12:55:58,044 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-30 12:55:58,477 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=718546.6666666666, ans=0.125 2023-09-30 12:55:59,578 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-30 12:56:01,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-30 12:56:02,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:56:06,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:56:07,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:56:08,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:56:09,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:56:09,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:56:09,715 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-30 12:56:11,623 INFO [train.py:1039] (2/4) Epoch 21, batch 1550, loss[loss=0.1854, simple_loss=0.2559, pruned_loss=0.05743, over 23737.00 frames. ], tot_loss[loss=0.1758, simple_loss=0.251, pruned_loss=0.05029, over 4712141.91 frames. ], batch size: 179, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 12:56:11,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-30 12:56:11,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:56:13,213 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-30 12:56:14,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-30 12:56:16,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:56:18,052 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:56:19,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:56:19,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:56:21,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:56:23,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:56:25,266 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-30 12:56:25,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:56:26,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:56:28,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:56:29,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-30 12:56:29,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-30 12:56:30,734 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.39 vs. limit=15.0 2023-09-30 12:56:31,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:56:33,480 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-30 12:56:33,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-30 12:56:35,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-30 12:56:35,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:56:38,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:56:41,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:56:44,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-30 12:56:44,909 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-30 12:56:54,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:56:57,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:56:57,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:56:57,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:56:59,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-30 12:56:59,553 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=718813.3333333334, ans=0.1 2023-09-30 12:56:59,943 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.70 vs. limit=22.5 2023-09-30 12:57:04,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 12:57:06,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:57:06,513 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=718813.3333333334, ans=0.125 2023-09-30 12:57:06,522 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=718813.3333333334, ans=0.1 2023-09-30 12:57:11,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:57:14,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:57:14,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:57:15,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-30 12:57:15,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 12:57:16,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:57:17,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:57:17,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-30 12:57:17,616 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-30 12:57:22,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:57:27,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-30 12:57:32,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:57:34,721 INFO [train.py:1039] (2/4) Epoch 21, batch 1600, loss[loss=0.1628, simple_loss=0.2482, pruned_loss=0.03873, over 24351.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2506, pruned_loss=0.05002, over 4712778.23 frames. ], batch size: 74, lr: 4.92e-03, grad_scale: 16.0 2023-09-30 12:57:34,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:57:34,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-30 12:57:36,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 12:57:37,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:57:37,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:57:39,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:57:39,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:57:41,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:57:43,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-30 12:57:44,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-30 12:57:47,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-30 12:57:50,839 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:57:52,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-30 12:57:52,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:57:53,426 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.80 vs. limit=15.0 2023-09-30 12:57:56,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:57:59,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:58:01,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-30 12:58:04,558 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.831e+02 2.011e+02 2.218e+02 3.597e+02, threshold=4.022e+02, percent-clipped=0.0 2023-09-30 12:58:04,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:58:06,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-30 12:58:06,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:58:07,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-30 12:58:11,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-30 12:58:20,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:58:21,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-30 12:58:21,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:58:23,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:58:23,117 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:58:24,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-30 12:58:28,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 12:58:30,222 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:58:30,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:58:31,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:58:33,785 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:58:34,002 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:58:36,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:58:37,680 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:58:43,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:58:43,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:58:47,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-30 12:58:47,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:58:48,797 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-30 12:58:53,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:58:56,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:58:56,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:58:56,478 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-30 12:58:56,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-30 12:58:56,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-30 12:58:56,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-30 12:58:57,950 INFO [train.py:1039] (2/4) Epoch 21, batch 1650, loss[loss=0.1829, simple_loss=0.2666, pruned_loss=0.04966, over 24427.00 frames. ], tot_loss[loss=0.176, simple_loss=0.2517, pruned_loss=0.05016, over 4723828.17 frames. ], batch size: 77, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 12:59:01,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:59:01,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:59:01,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:59:03,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-30 12:59:06,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:59:08,998 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-30 12:59:13,458 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:59:13,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:59:13,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:59:13,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:59:15,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-30 12:59:15,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-30 12:59:21,874 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 12:59:24,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-30 12:59:32,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-30 12:59:34,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:59:36,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-30 12:59:40,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:59:42,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:59:44,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:59:46,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:59:47,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:59:47,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:59:48,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:59:49,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:59:49,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:59:50,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:59:52,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:59:52,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 12:59:56,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:59:57,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-30 13:00:00,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:00:00,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-30 13:00:02,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-30 13:00:02,314 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-30 13:00:02,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:00:03,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:00:03,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:00:05,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:00:05,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-30 13:00:08,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:00:11,282 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=719546.6666666666, ans=0.2 2023-09-30 13:00:12,297 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:00:12,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:00:15,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-30 13:00:20,674 INFO [train.py:1039] (2/4) Epoch 21, batch 1700, loss[loss=0.1667, simple_loss=0.2331, pruned_loss=0.05018, over 23598.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2503, pruned_loss=0.04985, over 4723583.03 frames. ], batch size: 256, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 13:00:20,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:00:20,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:00:20,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-30 13:00:20,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:00:20,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:00:20,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:00:24,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:00:24,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:00:25,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-30 13:00:27,621 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 13:00:37,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:00:40,202 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:00:44,360 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=719680.0, ans=0.125 2023-09-30 13:00:45,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-30 13:00:45,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:00:45,597 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:00:47,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:00:49,899 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.892e+02 2.034e+02 2.348e+02 3.587e+02, threshold=4.068e+02, percent-clipped=0.0 2023-09-30 13:00:50,062 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-30 13:00:50,445 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=719680.0, ans=0.0 2023-09-30 13:00:51,672 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:00:51,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:00:55,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-30 13:00:55,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-30 13:00:58,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-30 13:00:59,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-30 13:01:00,580 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:01:02,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-30 13:01:03,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:01:14,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:01:16,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:01:16,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:01:18,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-30 13:01:18,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-30 13:01:18,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:01:21,672 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:01:21,673 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-30 13:01:21,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:01:21,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:01:23,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:01:23,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:01:24,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:01:24,816 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:01:26,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:01:26,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:01:26,442 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:01:33,174 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:01:34,743 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-30 13:01:36,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:01:38,416 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:01:41,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-30 13:01:42,778 INFO [train.py:1039] (2/4) Epoch 21, batch 1750, loss[loss=0.175, simple_loss=0.258, pruned_loss=0.04599, over 24576.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.2495, pruned_loss=0.04962, over 4712820.91 frames. ], batch size: 71, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 13:01:46,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:01:47,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:01:47,915 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=719946.6666666666, ans=0.125 2023-09-30 13:01:49,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-30 13:01:49,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-30 13:01:50,769 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:01:54,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:01:54,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:01:54,691 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=719946.6666666666, ans=0.025 2023-09-30 13:02:02,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-30 13:02:02,566 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=720013.3333333334, ans=0.125 2023-09-30 13:02:04,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:02:06,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-30 13:02:06,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:02:08,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:02:10,383 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=720013.3333333334, ans=0.04949747468305833 2023-09-30 13:02:11,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 13:02:13,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-30 13:02:15,188 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:02:16,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-30 13:02:24,531 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-30 13:02:28,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:02:28,258 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:02:31,169 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:02:31,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:02:32,785 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:02:34,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:02:36,002 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:02:36,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:02:37,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-30 13:02:39,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:02:41,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-30 13:02:41,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:02:43,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:02:45,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:02:48,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 13:02:48,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-30 13:02:48,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:02:51,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:02:56,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:02:59,173 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.03 vs. limit=15.0 2023-09-30 13:02:59,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:03:00,282 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=720213.3333333334, ans=0.0 2023-09-30 13:03:01,390 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:03:01,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-30 13:03:01,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:03:03,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-30 13:03:03,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:03:03,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:03:03,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:03:03,363 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=720213.3333333334, ans=0.0 2023-09-30 13:03:04,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-30 13:03:08,035 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=720280.0, ans=0.0 2023-09-30 13:03:09,632 INFO [train.py:1039] (2/4) Epoch 21, batch 1800, loss[loss=0.1775, simple_loss=0.2493, pruned_loss=0.05283, over 23647.00 frames. ], tot_loss[loss=0.174, simple_loss=0.249, pruned_loss=0.04947, over 4701401.04 frames. ], batch size: 149, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 13:03:09,702 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:03:09,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:03:11,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 13:03:13,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:03:18,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 13:03:18,468 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:03:21,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:03:23,581 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=720280.0, ans=0.125 2023-09-30 13:03:24,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:03:24,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:03:26,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:03:26,594 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=720346.6666666666, ans=0.125 2023-09-30 13:03:27,819 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:03:27,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-30 13:03:29,363 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:03:32,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:03:37,705 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-30 13:03:39,072 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 1.961e+02 2.256e+02 2.662e+02 3.514e+02, threshold=4.513e+02, percent-clipped=0.0 2023-09-30 13:03:39,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-30 13:03:40,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-30 13:03:40,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:03:41,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:03:41,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:03:43,189 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:03:49,591 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=720413.3333333334, ans=0.125 2023-09-30 13:03:51,539 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-30 13:03:54,456 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-30 13:03:56,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:03:58,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-30 13:03:58,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-30 13:03:59,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:04:01,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:04:02,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 13:04:07,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-30 13:04:14,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:04:14,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-30 13:04:16,246 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:04:16,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:04:17,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:04:17,806 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-30 13:04:20,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:04:20,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:04:23,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-30 13:04:23,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:04:26,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:04:26,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-30 13:04:26,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:04:27,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:04:28,409 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.47 vs. limit=15.0 2023-09-30 13:04:29,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 13:04:30,806 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:04:30,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:04:32,758 INFO [train.py:1039] (2/4) Epoch 21, batch 1850, loss[loss=0.1806, simple_loss=0.2507, pruned_loss=0.0553, over 23793.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.25, pruned_loss=0.04961, over 4701172.81 frames. ], batch size: 212, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 13:04:34,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:04:36,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:04:45,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:04:45,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-30 13:04:47,394 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=720680.0, ans=0.0 2023-09-30 13:04:50,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-30 13:04:50,742 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=720680.0, ans=0.1 2023-09-30 13:04:52,135 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=720680.0, ans=0.05 2023-09-30 13:04:55,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-30 13:04:58,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:04:58,825 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=720680.0, ans=0.0 2023-09-30 13:04:59,307 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.72 vs. limit=22.5 2023-09-30 13:05:00,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-30 13:05:00,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 13:05:02,233 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=720680.0, ans=0.2 2023-09-30 13:05:07,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:05:08,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-30 13:05:12,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:05:12,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:05:17,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-30 13:05:17,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:05:17,858 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 13:05:19,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:05:20,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:05:24,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:05:27,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:05:27,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:05:27,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 13:05:27,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:05:30,941 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:05:32,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:05:36,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-30 13:05:36,341 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:05:41,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-30 13:05:41,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 13:05:41,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-30 13:05:41,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-30 13:05:44,633 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-30 13:05:44,778 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-30 13:05:47,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 13:05:47,736 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:05:47,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:05:47,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:05:49,266 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-30 13:05:49,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:05:50,757 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:05:50,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-30 13:05:53,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 13:05:54,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:05:54,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-30 13:05:56,053 INFO [train.py:1039] (2/4) Epoch 21, batch 1900, loss[loss=0.1803, simple_loss=0.2524, pruned_loss=0.05409, over 23669.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.2511, pruned_loss=0.05029, over 4711226.94 frames. ], batch size: 256, lr: 4.91e-03, grad_scale: 8.0 2023-09-30 13:05:56,460 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=720946.6666666666, ans=0.0 2023-09-30 13:05:57,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:05:57,850 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-30 13:05:57,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:05:58,074 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=720946.6666666666, ans=0.125 2023-09-30 13:05:59,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:06:04,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:06:07,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:06:07,304 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-30 13:06:09,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-30 13:06:11,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:06:11,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:06:12,872 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-30 13:06:12,929 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-30 13:06:16,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-30 13:06:18,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:06:22,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-30 13:06:24,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-30 13:06:26,418 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.558e+02 1.817e+02 1.990e+02 2.367e+02 3.522e+02, threshold=3.980e+02, percent-clipped=0.0 2023-09-30 13:06:35,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-30 13:06:40,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-30 13:06:40,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:06:41,627 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-30 13:06:41,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-30 13:06:41,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-30 13:06:41,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-30 13:06:41,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:06:47,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-30 13:06:50,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:06:55,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:06:55,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-30 13:06:55,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:07:01,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-30 13:07:01,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-30 13:07:08,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:07:08,694 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:07:08,717 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:07:08,950 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=721213.3333333334, ans=0.0 2023-09-30 13:07:10,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:07:10,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 13:07:10,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-30 13:07:11,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:07:14,944 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:07:14,947 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:07:16,733 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:07:16,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:07:16,805 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-30 13:07:18,221 INFO [train.py:1039] (2/4) Epoch 21, batch 1950, loss[loss=0.2364, simple_loss=0.2985, pruned_loss=0.08716, over 19133.00 frames. ], tot_loss[loss=0.1772, simple_loss=0.2524, pruned_loss=0.05097, over 4711099.92 frames. ], batch size: 388, lr: 4.91e-03, grad_scale: 8.0 2023-09-30 13:07:18,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:07:22,062 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:07:24,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:07:25,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:07:25,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 13:07:26,757 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=721280.0, ans=0.1 2023-09-30 13:07:28,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-30 13:07:28,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 13:07:28,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:07:29,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:07:33,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:07:33,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:07:34,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:07:36,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:07:41,169 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:07:41,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 13:07:41,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:07:41,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:07:44,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:07:47,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:07:47,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:07:47,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-30 13:07:47,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-30 13:07:49,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 13:07:50,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:07:50,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:07:53,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:07:56,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:08:00,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 13:08:03,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:08:03,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-30 13:08:03,738 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-30 13:08:03,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:08:12,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:08:12,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:08:13,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-30 13:08:21,960 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:08:23,500 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:08:25,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:08:29,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:08:31,767 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=721546.6666666666, ans=0.07 2023-09-30 13:08:33,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:08:33,581 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:08:35,089 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-30 13:08:35,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 13:08:36,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:08:38,034 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-30 13:08:39,448 INFO [train.py:1039] (2/4) Epoch 21, batch 2000, loss[loss=0.1727, simple_loss=0.2637, pruned_loss=0.04086, over 24629.00 frames. ], tot_loss[loss=0.177, simple_loss=0.2522, pruned_loss=0.05089, over 4711086.14 frames. ], batch size: 73, lr: 4.91e-03, grad_scale: 16.0 2023-09-30 13:08:39,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:08:44,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-30 13:08:44,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:08:44,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:08:46,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:08:49,658 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:08:52,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-30 13:08:52,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-30 13:08:55,030 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.03 vs. limit=15.0 2023-09-30 13:08:55,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:08:57,544 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-30 13:08:59,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 13:08:59,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:08:59,441 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=721680.0, ans=0.125 2023-09-30 13:09:00,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:09:02,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-30 13:09:02,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:09:03,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:09:04,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:09:06,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-30 13:09:07,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 13:09:08,949 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.918e+02 2.130e+02 2.425e+02 4.087e+02, threshold=4.260e+02, percent-clipped=1.0 2023-09-30 13:09:10,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-30 13:09:10,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:09:15,160 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:09:15,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-30 13:09:15,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:09:16,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:09:18,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:09:18,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-30 13:09:20,716 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.06 vs. limit=6.0 2023-09-30 13:09:21,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-30 13:09:21,309 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:09:21,321 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:09:25,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:09:27,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:09:28,923 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 13:09:29,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:09:29,872 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.whiten.whitening_limit, batch_count=721813.3333333334, ans=15.0 2023-09-30 13:09:31,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:09:31,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:09:33,395 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 13:09:33,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:09:34,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:09:38,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:09:38,199 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-30 13:09:45,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 13:09:45,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:09:51,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:09:51,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:09:54,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:09:56,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:09:56,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:09:57,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 13:09:57,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 13:10:00,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:10:01,256 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=9.17 vs. limit=15.0 2023-09-30 13:10:02,018 INFO [train.py:1039] (2/4) Epoch 21, batch 2050, loss[loss=0.1611, simple_loss=0.2182, pruned_loss=0.05205, over 23448.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.2515, pruned_loss=0.05082, over 4709426.90 frames. ], batch size: 285, lr: 4.91e-03, grad_scale: 16.0 2023-09-30 13:10:02,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:10:05,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:10:06,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:10:10,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:10:11,692 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:10:13,872 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:10:14,707 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.80 vs. limit=15.0 2023-09-30 13:10:15,223 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:10:17,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-30 13:10:17,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:10:19,203 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=722013.3333333334, ans=0.125 2023-09-30 13:10:20,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:10:20,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-30 13:10:26,189 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=722013.3333333334, ans=0.0 2023-09-30 13:10:30,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-30 13:10:30,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:10:33,854 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-30 13:10:35,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:10:37,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-30 13:10:38,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-30 13:10:41,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:10:43,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:10:43,524 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-30 13:10:43,690 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=722080.0, ans=0.1 2023-09-30 13:10:44,895 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:10:46,543 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:10:48,070 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:10:49,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:10:51,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:10:53,512 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 13:10:55,745 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:10:57,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:11:02,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 13:11:08,478 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:11:09,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-30 13:11:16,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:11:16,272 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:11:18,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:11:20,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-30 13:11:21,263 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=722213.3333333334, ans=0.0 2023-09-30 13:11:24,569 INFO [train.py:1039] (2/4) Epoch 21, batch 2100, loss[loss=0.1772, simple_loss=0.2559, pruned_loss=0.04925, over 24475.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2502, pruned_loss=0.05028, over 4711413.43 frames. ], batch size: 66, lr: 4.91e-03, grad_scale: 16.0 2023-09-30 13:11:24,720 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-30 13:11:24,720 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:11:24,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:11:25,002 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=722280.0, ans=0.125 2023-09-30 13:11:26,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 13:11:26,998 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:11:27,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-30 13:11:28,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-30 13:11:29,233 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 13:11:32,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:11:32,512 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=722280.0, ans=0.125 2023-09-30 13:11:33,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:11:33,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:11:35,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:11:35,425 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-30 13:11:37,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:11:38,459 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-30 13:11:38,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-30 13:11:41,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:11:41,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:11:41,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-30 13:11:42,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 13:11:46,217 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-30 13:11:46,219 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 13:11:49,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:11:50,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:11:53,683 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.414e+02 1.855e+02 2.014e+02 2.188e+02 4.712e+02, threshold=4.028e+02, percent-clipped=1.0 2023-09-30 13:11:53,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:11:54,049 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=722346.6666666666, ans=0.0 2023-09-30 13:11:56,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-30 13:11:56,340 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=722413.3333333334, ans=0.125 2023-09-30 13:11:58,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:11:58,041 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 13:11:59,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-30 13:12:01,646 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:12:01,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-30 13:12:01,730 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-30 13:12:03,176 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-30 13:12:05,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-30 13:12:06,803 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:12:09,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 13:12:10,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 13:12:11,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:12:13,123 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:12:13,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-30 13:12:13,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:12:13,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:12:14,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:12:14,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-30 13:12:16,398 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-30 13:12:16,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-30 13:12:22,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:12:25,386 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:12:26,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-30 13:12:30,464 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.28 vs. limit=15.0 2023-09-30 13:12:32,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:12:36,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:12:36,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:12:36,405 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:12:36,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-30 13:12:36,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 13:12:38,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:12:38,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-30 13:12:40,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:12:40,182 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:12:41,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-30 13:12:43,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-30 13:12:43,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:12:45,445 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.25 vs. limit=22.5 2023-09-30 13:12:46,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:12:46,368 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:12:47,693 INFO [train.py:1039] (2/4) Epoch 21, batch 2150, loss[loss=0.1396, simple_loss=0.2129, pruned_loss=0.03315, over 24306.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.25, pruned_loss=0.05028, over 4707526.47 frames. ], batch size: 56, lr: 4.91e-03, grad_scale: 16.0 2023-09-30 13:12:47,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:12:47,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:12:51,644 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.47 vs. limit=15.0 2023-09-30 13:12:52,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 13:12:54,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:12:55,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:12:57,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:12:57,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:12:57,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:13:02,082 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:13:04,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:13:04,068 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:13:07,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:13:07,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-30 13:13:13,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:13:13,209 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:13:14,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:13:14,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:13:16,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:13:16,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-30 13:13:16,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:13:17,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:13:17,883 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:13:19,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-30 13:13:20,856 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-30 13:13:20,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:13:21,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:13:24,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 13:13:24,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:13:26,081 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:13:27,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:13:29,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:13:29,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-30 13:13:29,141 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-30 13:13:32,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:13:32,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:13:34,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:13:37,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 13:13:39,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:13:41,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:13:41,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-30 13:13:43,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-30 13:13:43,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:13:43,430 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-30 13:13:43,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:13:44,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:13:45,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-30 13:13:45,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:13:45,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-30 13:13:45,167 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-30 13:13:45,168 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-30 13:13:46,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-30 13:13:49,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:13:50,981 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:13:51,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:13:51,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:13:52,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 13:13:54,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:13:54,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:14:03,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:14:03,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-30 13:14:08,162 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:14:09,585 INFO [train.py:1039] (2/4) Epoch 21, batch 2200, loss[loss=0.1751, simple_loss=0.2586, pruned_loss=0.04585, over 23998.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.2502, pruned_loss=0.04996, over 4716497.64 frames. ], batch size: 86, lr: 4.91e-03, grad_scale: 16.0 2023-09-30 13:14:10,635 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.91 vs. limit=12.0 2023-09-30 13:14:11,454 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=722946.6666666666, ans=0.0 2023-09-30 13:14:13,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:14:15,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:14:15,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:14:16,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-30 13:14:20,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:14:20,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:14:20,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-30 13:14:25,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-30 13:14:26,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 13:14:31,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-30 13:14:34,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:14:36,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:14:36,189 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:14:37,888 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:14:39,241 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.525e+02 1.806e+02 1.948e+02 2.214e+02 3.228e+02, threshold=3.896e+02, percent-clipped=0.0 2023-09-30 13:14:39,368 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-30 13:14:42,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-30 13:14:44,573 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:14:46,689 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-30 13:14:50,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:14:52,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:14:55,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:14:56,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:14:58,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-30 13:14:59,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:15:01,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-30 13:15:03,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:15:03,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-30 13:15:04,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:15:06,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:15:06,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:15:06,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:15:07,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:15:09,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-30 13:15:09,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:15:10,222 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.69 vs. limit=10.0 2023-09-30 13:15:10,860 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 13:15:15,264 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 13:15:15,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:15:17,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-30 13:15:17,187 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-30 13:15:21,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 13:15:21,615 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-30 13:15:24,888 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=12.24 vs. limit=15.0 2023-09-30 13:15:25,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-30 13:15:25,703 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-30 13:15:26,652 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.23 vs. limit=10.0 2023-09-30 13:15:27,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:15:27,253 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-30 13:15:28,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:15:31,741 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-30 13:15:33,192 INFO [train.py:1039] (2/4) Epoch 21, batch 2250, loss[loss=0.1891, simple_loss=0.2603, pruned_loss=0.059, over 23660.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.2503, pruned_loss=0.04991, over 4716349.74 frames. ], batch size: 256, lr: 4.91e-03, grad_scale: 16.0 2023-09-30 13:15:33,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:15:34,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:15:36,902 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=723280.0, ans=0.1 2023-09-30 13:15:40,058 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=723280.0, ans=0.2 2023-09-30 13:15:40,074 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-30 13:15:41,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:15:41,374 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:15:45,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:15:46,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 13:15:46,265 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=723280.0, ans=0.125 2023-09-30 13:15:47,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:15:47,902 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=723346.6666666666, ans=0.1 2023-09-30 13:15:50,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-30 13:15:50,602 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:15:50,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:15:52,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-30 13:15:54,592 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:15:54,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:15:57,494 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 13:16:03,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:16:04,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 13:16:04,980 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-30 13:16:06,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-30 13:16:07,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:16:09,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:16:15,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:16:17,406 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=723413.3333333334, ans=0.2 2023-09-30 13:16:18,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:16:19,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:16:19,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:16:22,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:16:24,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:16:28,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:16:31,493 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-30 13:16:38,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 13:16:38,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-30 13:16:38,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:16:43,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 13:16:45,269 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=5.92 vs. limit=15.0 2023-09-30 13:16:46,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-30 13:16:46,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-30 13:16:47,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:16:47,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:16:50,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-30 13:16:51,049 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=723546.6666666666, ans=0.2 2023-09-30 13:16:53,627 INFO [train.py:1039] (2/4) Epoch 21, batch 2300, loss[loss=0.1671, simple_loss=0.2382, pruned_loss=0.04805, over 23460.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.2505, pruned_loss=0.04956, over 4716195.61 frames. ], batch size: 134, lr: 4.91e-03, grad_scale: 16.0 2023-09-30 13:16:53,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:16:53,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:16:59,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:16:59,862 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:17:03,537 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-30 13:17:05,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:17:05,208 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=723613.3333333334, ans=0.1 2023-09-30 13:17:13,814 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:17:13,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-30 13:17:15,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:17:16,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:17:16,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-30 13:17:16,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:17:18,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:17:19,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:17:22,901 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.842e+02 2.058e+02 2.392e+02 4.261e+02, threshold=4.115e+02, percent-clipped=2.0 2023-09-30 13:17:24,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:17:27,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:17:31,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:17:35,293 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=723746.6666666666, ans=0.0 2023-09-30 13:17:37,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:17:38,052 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:17:41,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:17:43,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=723813.3333333334, ans=0.1 2023-09-30 13:17:44,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:17:47,804 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=723813.3333333334, ans=0.125 2023-09-30 13:17:50,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:17:50,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 13:17:52,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:17:52,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-30 13:17:56,969 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 13:17:56,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:17:57,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:17:57,080 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:17:58,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:17:58,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 13:17:58,610 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-30 13:18:00,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-30 13:18:00,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:18:00,055 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:18:00,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-30 13:18:03,386 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=723880.0, ans=0.0 2023-09-30 13:18:06,366 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:18:09,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:18:09,785 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=723880.0, ans=0.2 2023-09-30 13:18:13,897 INFO [train.py:1039] (2/4) Epoch 21, batch 2350, loss[loss=0.1429, simple_loss=0.2202, pruned_loss=0.03279, over 24455.00 frames. ], tot_loss[loss=0.1758, simple_loss=0.2517, pruned_loss=0.04996, over 4718663.50 frames. ], batch size: 58, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:18:16,152 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:18:16,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:18:16,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-30 13:18:17,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 13:18:17,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:18:19,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 13:18:20,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-30 13:18:28,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:18:28,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-30 13:18:33,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-30 13:18:37,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:18:39,038 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:18:39,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:18:39,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:18:39,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:18:40,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-30 13:18:42,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:18:42,994 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=724013.3333333334, ans=0.125 2023-09-30 13:18:49,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-30 13:18:50,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:18:54,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 13:18:54,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:18:56,507 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:18:58,531 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-30 13:18:59,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:19:01,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:19:01,665 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:19:03,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:19:03,462 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=724146.6666666666, ans=0.1 2023-09-30 13:19:04,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:19:05,101 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=724146.6666666666, ans=0.0 2023-09-30 13:19:07,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-30 13:19:07,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:19:12,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:19:12,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:19:13,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-30 13:19:14,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:19:17,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-30 13:19:17,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:19:21,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-30 13:19:26,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-30 13:19:26,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:19:26,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-30 13:19:26,174 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-30 13:19:27,644 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-30 13:19:29,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-30 13:19:34,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:19:37,902 INFO [train.py:1039] (2/4) Epoch 21, batch 2400, loss[loss=0.1836, simple_loss=0.2638, pruned_loss=0.05169, over 23339.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2513, pruned_loss=0.0497, over 4709536.96 frames. ], batch size: 93, lr: 4.90e-03, grad_scale: 32.0 2023-09-30 13:19:38,069 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:19:41,331 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:19:44,328 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:19:45,832 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-30 13:19:45,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-30 13:19:54,181 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 13:19:54,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:19:55,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-30 13:19:55,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:19:55,885 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:19:56,179 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=724346.6666666666, ans=0.125 2023-09-30 13:19:57,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-30 13:20:03,234 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:20:03,545 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=724346.6666666666, ans=0.5 2023-09-30 13:20:06,049 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-30 13:20:07,652 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.507e+02 1.817e+02 2.030e+02 2.238e+02 3.635e+02, threshold=4.061e+02, percent-clipped=0.0 2023-09-30 13:20:11,305 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=724413.3333333334, ans=0.0 2023-09-30 13:20:12,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-30 13:20:17,119 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-30 13:20:20,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:20:20,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:20:22,834 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=14.02 vs. limit=22.5 2023-09-30 13:20:24,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:20:25,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-30 13:20:26,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 13:20:27,007 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=724480.0, ans=0.1 2023-09-30 13:20:28,648 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=724480.0, ans=0.2 2023-09-30 13:20:34,042 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:20:35,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:20:39,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:20:40,819 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:20:40,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-30 13:20:40,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:20:40,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:20:40,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:20:40,986 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 13:20:47,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:20:47,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 13:20:47,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-30 13:20:49,111 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-30 13:20:52,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:20:52,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:20:52,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-30 13:20:53,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-30 13:20:53,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-30 13:20:53,897 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-30 13:20:55,352 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-30 13:20:55,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:20:58,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:20:58,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:20:58,967 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=724613.3333333334, ans=0.125 2023-09-30 13:20:59,986 INFO [train.py:1039] (2/4) Epoch 21, batch 2450, loss[loss=0.1728, simple_loss=0.2609, pruned_loss=0.04232, over 24484.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2505, pruned_loss=0.04925, over 4699963.56 frames. ], batch size: 69, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:21:00,115 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-30 13:21:00,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:21:02,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-30 13:21:04,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:21:06,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:21:09,499 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:21:09,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:21:11,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-30 13:21:18,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:21:18,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:21:21,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:21:21,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:21:21,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:21:22,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-30 13:21:27,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:21:27,915 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=724680.0, ans=0.125 2023-09-30 13:21:29,075 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 13:21:30,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:21:33,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-30 13:21:33,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:21:35,453 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=724746.6666666666, ans=0.2 2023-09-30 13:21:35,504 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 13:21:36,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:21:36,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:21:36,912 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=724746.6666666666, ans=0.0 2023-09-30 13:21:38,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-30 13:21:38,505 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=724746.6666666666, ans=0.0 2023-09-30 13:21:38,618 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=724746.6666666666, ans=0.125 2023-09-30 13:21:40,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:21:48,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:21:49,224 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=724813.3333333334, ans=0.1 2023-09-30 13:21:50,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:21:50,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:21:52,329 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:21:52,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:21:53,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:21:55,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-30 13:21:56,999 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=724813.3333333334, ans=0.0 2023-09-30 13:21:57,379 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.33 vs. limit=22.5 2023-09-30 13:21:58,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:21:58,479 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:22:02,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:22:02,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:22:04,937 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=724880.0, ans=0.125 2023-09-30 13:22:09,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:22:09,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-30 13:22:10,445 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:22:10,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:22:10,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-30 13:22:10,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:22:14,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:22:16,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:22:19,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:22:21,671 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:22:22,990 INFO [train.py:1039] (2/4) Epoch 21, batch 2500, loss[loss=0.1804, simple_loss=0.2527, pruned_loss=0.05407, over 23524.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2497, pruned_loss=0.04906, over 4699035.90 frames. ], batch size: 134, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:22:24,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-30 13:22:24,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:22:27,142 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=724946.6666666666, ans=0.1 2023-09-30 13:22:31,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:22:39,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 13:22:40,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:22:42,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:22:42,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-30 13:22:49,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 13:22:49,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:22:51,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-30 13:22:51,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 13:22:52,845 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-30 13:22:53,153 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 13:22:54,205 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.323e+02 2.019e+02 2.359e+02 2.829e+02 4.327e+02, threshold=4.718e+02, percent-clipped=1.0 2023-09-30 13:22:54,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:22:56,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:22:56,348 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-30 13:22:56,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:22:57,772 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-30 13:22:57,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:23:03,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:23:04,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:23:06,411 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=725080.0, ans=0.125 2023-09-30 13:23:07,604 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 13:23:09,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-30 13:23:09,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:23:09,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:23:13,786 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:23:14,467 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.20 vs. limit=15.0 2023-09-30 13:23:18,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:23:19,004 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=725146.6666666666, ans=0.0 2023-09-30 13:23:21,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:23:25,719 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=725146.6666666666, ans=0.125 2023-09-30 13:23:28,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-30 13:23:30,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-30 13:23:30,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:23:32,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-30 13:23:33,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:23:33,740 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 13:23:35,283 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-30 13:23:35,284 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-30 13:23:35,292 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-30 13:23:38,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:23:41,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-30 13:23:41,814 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-30 13:23:41,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:23:44,788 INFO [train.py:1039] (2/4) Epoch 21, batch 2550, loss[loss=0.1803, simple_loss=0.2667, pruned_loss=0.04697, over 24469.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.25, pruned_loss=0.04925, over 4706998.70 frames. ], batch size: 69, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:23:44,892 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-30 13:23:46,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-30 13:23:49,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:23:49,849 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=725280.0, ans=0.125 2023-09-30 13:23:51,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:23:51,415 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:23:54,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:23:56,424 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-30 13:23:56,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-30 13:24:00,072 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-30 13:24:00,766 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.39 vs. limit=15.0 2023-09-30 13:24:03,479 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:24:05,102 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:24:05,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:24:06,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 13:24:06,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 13:24:08,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:24:08,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:24:11,745 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:24:11,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-30 13:24:11,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-30 13:24:11,840 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:24:11,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-30 13:24:22,746 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=725413.3333333334, ans=0.04949747468305833 2023-09-30 13:24:24,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:24:32,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:24:32,750 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:24:32,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:24:32,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 13:24:39,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:24:41,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 13:24:41,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 13:24:43,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:24:43,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-30 13:24:43,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-30 13:24:46,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:24:48,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:24:51,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:24:51,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-30 13:24:51,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:24:51,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:24:52,862 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-30 13:24:54,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 13:24:55,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:24:55,982 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=725546.6666666666, ans=0.125 2023-09-30 13:25:04,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:25:07,611 INFO [train.py:1039] (2/4) Epoch 21, batch 2600, loss[loss=0.1844, simple_loss=0.2588, pruned_loss=0.05497, over 23359.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2509, pruned_loss=0.0496, over 4719204.65 frames. ], batch size: 105, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:25:07,689 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:25:09,941 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-30 13:25:11,632 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=725613.3333333334, ans=0.125 2023-09-30 13:25:12,749 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-30 13:25:12,775 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:25:12,827 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-30 13:25:12,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-30 13:25:14,397 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-30 13:25:16,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:25:16,258 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-30 13:25:17,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-30 13:25:19,290 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-30 13:25:20,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:25:21,053 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=725613.3333333334, ans=0.0 2023-09-30 13:25:24,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-30 13:25:25,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-30 13:25:25,994 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-30 13:25:27,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-30 13:25:29,518 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-30 13:25:29,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-30 13:25:37,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:25:37,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:25:37,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:25:37,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-30 13:25:39,011 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.842e+02 2.050e+02 2.225e+02 3.222e+02, threshold=4.100e+02, percent-clipped=0.0 2023-09-30 13:25:42,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:25:48,995 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-30 13:25:53,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:25:55,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:25:55,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-30 13:25:55,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:25:55,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:25:57,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-30 13:26:00,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-30 13:26:00,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:26:02,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:26:07,508 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-30 13:26:07,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:26:07,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:26:12,332 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=725880.0, ans=0.2 2023-09-30 13:26:14,371 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=725880.0, ans=0.1 2023-09-30 13:26:15,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:26:15,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:26:15,765 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-30 13:26:17,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:26:19,518 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:26:21,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:26:22,803 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=725880.0, ans=0.0 2023-09-30 13:26:27,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-30 13:26:27,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:26:28,822 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 13:26:30,213 INFO [train.py:1039] (2/4) Epoch 21, batch 2650, loss[loss=0.226, simple_loss=0.2837, pruned_loss=0.08418, over 19769.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.251, pruned_loss=0.04972, over 4719463.32 frames. ], batch size: 388, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:26:33,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-30 13:26:33,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:26:34,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 13:26:35,541 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-30 13:26:36,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:26:39,511 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=725946.6666666666, ans=0.0 2023-09-30 13:26:40,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:26:42,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 13:26:44,478 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.38 vs. limit=15.0 2023-09-30 13:26:45,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:26:46,949 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:26:48,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-30 13:26:48,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 13:26:48,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:26:50,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-30 13:26:54,466 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-30 13:26:56,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:26:58,052 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 13:26:59,224 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-30 13:26:59,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:27:00,787 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-30 13:27:02,444 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=726080.0, ans=0.125 2023-09-30 13:27:05,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:27:05,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-30 13:27:06,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:27:06,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:27:13,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-30 13:27:13,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-30 13:27:16,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-30 13:27:19,942 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-30 13:27:19,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:27:21,521 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:27:21,585 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-30 13:27:22,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:27:23,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:27:25,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:27:26,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:27:26,979 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=726146.6666666666, ans=0.0 2023-09-30 13:27:28,236 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:27:29,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-30 13:27:29,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:27:32,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:27:33,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 13:27:34,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:27:37,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:27:37,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-30 13:27:40,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:27:41,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:27:41,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:27:41,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-30 13:27:41,478 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=726213.3333333334, ans=0.125 2023-09-30 13:27:46,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:27:46,342 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:27:47,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:27:48,400 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.45 vs. limit=22.5 2023-09-30 13:27:49,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:27:51,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-30 13:27:51,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:27:51,682 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=726280.0, ans=0.0 2023-09-30 13:27:52,797 INFO [train.py:1039] (2/4) Epoch 21, batch 2700, loss[loss=0.1821, simple_loss=0.2643, pruned_loss=0.04997, over 24083.00 frames. ], tot_loss[loss=0.1758, simple_loss=0.252, pruned_loss=0.04981, over 4726176.99 frames. ], batch size: 80, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:27:53,293 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=726280.0, ans=0.125 2023-09-30 13:27:55,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:27:55,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-30 13:27:59,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:28:01,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 13:28:02,798 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=726280.0, ans=0.125 2023-09-30 13:28:04,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:28:04,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:28:04,143 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:28:06,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:28:06,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:28:06,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:28:06,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-30 13:28:06,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-30 13:28:07,662 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:28:09,441 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=726346.6666666666, ans=0.1 2023-09-30 13:28:10,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-30 13:28:10,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 13:28:10,863 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:28:15,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:28:16,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-30 13:28:16,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:28:21,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:28:21,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:28:23,276 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.895e+02 2.064e+02 2.321e+02 3.195e+02, threshold=4.129e+02, percent-clipped=0.0 2023-09-30 13:28:27,205 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:28:27,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:28:27,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:28:27,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:28:31,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:28:34,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:28:34,147 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-30 13:28:34,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:28:39,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:28:39,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:28:39,512 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=726413.3333333334, ans=0.1 2023-09-30 13:28:45,853 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=726480.0, ans=0.0 2023-09-30 13:28:47,409 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=726480.0, ans=0.0 2023-09-30 13:28:48,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:28:48,694 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:28:50,642 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=726480.0, ans=0.125 2023-09-30 13:28:51,804 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:28:51,807 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:28:57,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:28:57,815 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.25 vs. limit=15.0 2023-09-30 13:28:58,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:28:58,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:28:58,735 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:29:01,041 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:29:01,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:29:04,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:29:04,466 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=726546.6666666666, ans=0.1 2023-09-30 13:29:05,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:29:05,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:29:09,831 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.59 vs. limit=15.0 2023-09-30 13:29:10,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-30 13:29:12,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:29:15,388 INFO [train.py:1039] (2/4) Epoch 21, batch 2750, loss[loss=0.1913, simple_loss=0.27, pruned_loss=0.05636, over 23780.00 frames. ], tot_loss[loss=0.1763, simple_loss=0.2525, pruned_loss=0.05001, over 4726734.25 frames. ], batch size: 85, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:29:15,592 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:29:15,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-30 13:29:15,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-30 13:29:15,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:29:17,589 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=726613.3333333334, ans=0.0 2023-09-30 13:29:20,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:29:21,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:29:23,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:29:23,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-30 13:29:23,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:29:28,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:29:28,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 13:29:28,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:29:30,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:29:30,378 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-30 13:29:30,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:29:30,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:29:36,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-30 13:29:37,338 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.16 vs. limit=6.0 2023-09-30 13:29:38,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:29:38,070 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:29:39,507 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:29:39,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-30 13:29:41,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:29:42,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:29:42,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:29:42,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:29:47,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 13:29:47,935 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 13:29:49,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 13:29:51,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:29:52,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 13:30:00,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:30:03,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 13:30:03,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:30:08,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:30:08,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-30 13:30:08,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 13:30:14,393 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=726813.3333333334, ans=0.125 2023-09-30 13:30:15,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:30:15,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:30:15,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-30 13:30:15,926 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=726813.3333333334, ans=0.2 2023-09-30 13:30:19,471 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=726813.3333333334, ans=0.2 2023-09-30 13:30:21,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:30:22,323 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 13:30:22,388 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=726880.0, ans=0.125 2023-09-30 13:30:23,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-30 13:30:26,839 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-30 13:30:29,899 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:30:29,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-30 13:30:31,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:30:32,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:30:34,954 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-30 13:30:35,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:30:37,940 INFO [train.py:1039] (2/4) Epoch 21, batch 2800, loss[loss=0.1687, simple_loss=0.2466, pruned_loss=0.04545, over 24352.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2515, pruned_loss=0.04965, over 4722356.22 frames. ], batch size: 61, lr: 4.89e-03, grad_scale: 32.0 2023-09-30 13:30:38,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-30 13:30:39,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:30:39,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:30:41,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-30 13:30:41,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:30:42,615 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:30:44,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:30:44,228 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-30 13:30:44,229 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-30 13:30:47,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:30:49,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 13:30:49,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:30:53,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:30:56,129 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-30 13:30:57,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-30 13:30:59,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-30 13:31:00,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:31:00,892 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:31:02,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:31:05,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:31:05,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:31:05,540 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-30 13:31:07,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:31:10,730 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.859e+02 2.240e+02 2.757e+02 3.972e+02, threshold=4.479e+02, percent-clipped=0.0 2023-09-30 13:31:15,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:31:17,114 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:31:21,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:31:22,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:31:24,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:31:28,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:31:28,101 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-30 13:31:28,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:31:29,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:31:29,719 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:31:34,171 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:31:35,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:31:37,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:31:39,884 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.76 vs. limit=6.0 2023-09-30 13:31:40,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:31:40,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:31:40,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 13:31:42,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 13:31:42,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 13:31:44,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:31:44,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-30 13:31:44,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:31:46,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:31:46,929 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:31:47,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-30 13:31:47,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:31:47,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:31:48,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:31:49,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-30 13:31:56,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:31:57,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 13:31:58,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:32:00,190 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=727280.0, ans=0.2 2023-09-30 13:32:01,339 INFO [train.py:1039] (2/4) Epoch 21, batch 2850, loss[loss=0.173, simple_loss=0.2595, pruned_loss=0.04322, over 24017.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2497, pruned_loss=0.04908, over 4705609.17 frames. ], batch size: 80, lr: 4.89e-03, grad_scale: 16.0 2023-09-30 13:32:01,446 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:32:06,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:32:06,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:32:06,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:32:09,199 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:32:09,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:32:10,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-30 13:32:12,284 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-30 13:32:20,274 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-30 13:32:20,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:32:21,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-30 13:32:23,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:32:25,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-30 13:32:27,136 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-30 13:32:29,250 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:32:40,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:32:42,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:32:42,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:32:43,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 13:32:43,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 13:32:43,940 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:32:45,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 13:32:45,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-30 13:32:47,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-30 13:32:47,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:32:48,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:32:48,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:32:52,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:32:52,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:32:53,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:32:55,267 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:32:57,442 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:32:58,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:32:59,282 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=727480.0, ans=0.125 2023-09-30 13:33:00,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:33:03,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:33:08,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:33:10,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-30 13:33:10,175 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-30 13:33:11,765 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 13:33:13,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:33:13,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-30 13:33:13,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:33:14,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:33:14,886 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:33:14,932 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:33:14,933 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-30 13:33:16,419 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-30 13:33:16,425 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:33:16,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:33:21,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-30 13:33:21,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:33:22,476 INFO [train.py:1039] (2/4) Epoch 21, batch 2900, loss[loss=0.1786, simple_loss=0.2552, pruned_loss=0.05098, over 23185.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2499, pruned_loss=0.0489, over 4719879.89 frames. ], batch size: 93, lr: 4.89e-03, grad_scale: 16.0 2023-09-30 13:33:22,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:33:24,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-30 13:33:29,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:33:29,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-30 13:33:30,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-30 13:33:32,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:33:32,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-30 13:33:34,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:33:36,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:33:37,468 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=727613.3333333334, ans=0.125 2023-09-30 13:33:40,202 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:33:40,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:33:43,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-30 13:33:43,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-30 13:33:44,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-30 13:33:46,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:33:48,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-30 13:33:48,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-30 13:33:51,146 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:33:51,150 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-30 13:33:51,189 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:33:54,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:33:54,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-30 13:33:55,589 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.882e+02 2.103e+02 2.412e+02 3.503e+02, threshold=4.205e+02, percent-clipped=0.0 2023-09-30 13:33:57,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:33:58,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:34:04,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:34:05,745 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.58 vs. limit=15.0 2023-09-30 13:34:07,835 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:34:09,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-30 13:34:09,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-30 13:34:09,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:34:12,334 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=727813.3333333334, ans=0.125 2023-09-30 13:34:12,379 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=727813.3333333334, ans=0.1 2023-09-30 13:34:13,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 13:34:15,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-30 13:34:16,667 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:34:21,526 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:34:32,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:34:32,042 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-30 13:34:32,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-30 13:34:37,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:34:39,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-30 13:34:39,117 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:34:39,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-30 13:34:46,020 INFO [train.py:1039] (2/4) Epoch 21, batch 2950, loss[loss=0.1571, simple_loss=0.2431, pruned_loss=0.03555, over 24481.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.2512, pruned_loss=0.04918, over 4723128.75 frames. ], batch size: 66, lr: 4.89e-03, grad_scale: 16.0 2023-09-30 13:34:46,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:34:48,436 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-30 13:34:48,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:34:48,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:34:49,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:34:52,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:34:53,072 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-30 13:34:54,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-30 13:34:55,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 13:34:55,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:34:57,704 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 13:34:57,882 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=727946.6666666666, ans=0.1 2023-09-30 13:35:02,402 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:35:04,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:35:05,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:35:07,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:35:11,124 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=728013.3333333334, ans=0.125 2023-09-30 13:35:12,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:35:12,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:35:14,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:35:15,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:35:15,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:35:17,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-30 13:35:23,879 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-30 13:35:24,602 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-30 13:35:25,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:35:27,439 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-30 13:35:29,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-30 13:35:29,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:35:29,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:35:29,135 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-30 13:35:29,141 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-30 13:35:32,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-30 13:35:33,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:35:33,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:35:35,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:35:38,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:35:38,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:35:38,511 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-30 13:35:38,578 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:35:40,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-30 13:35:45,333 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:35:46,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:35:46,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-30 13:35:46,989 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:35:49,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-30 13:35:52,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:35:52,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:35:54,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:35:55,889 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:35:55,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 13:35:58,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:35:59,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:35:59,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-30 13:35:59,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-30 13:36:01,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:36:01,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:36:02,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:36:02,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-30 13:36:04,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:36:06,178 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:36:06,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-30 13:36:09,254 INFO [train.py:1039] (2/4) Epoch 21, batch 3000, loss[loss=0.1812, simple_loss=0.2504, pruned_loss=0.05596, over 23801.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2518, pruned_loss=0.04978, over 4728010.30 frames. ], batch size: 212, lr: 4.89e-03, grad_scale: 16.0 2023-09-30 13:36:09,255 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-30 13:36:24,051 INFO [train.py:1071] (2/4) Epoch 21, validation: loss=0.3084, simple_loss=0.2796, pruned_loss=0.1686, over 1125622.00 frames. 2023-09-30 13:36:24,052 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21076MB 2023-09-30 13:36:25,717 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-30 13:36:25,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-30 13:36:27,996 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=728280.0, ans=0.1 2023-09-30 13:36:28,136 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=728280.0, ans=0.05 2023-09-30 13:36:30,615 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:36:30,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:36:30,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-30 13:36:32,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:36:38,631 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 13:36:46,870 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=728346.6666666666, ans=0.2 2023-09-30 13:36:47,951 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:36:54,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-30 13:36:54,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:36:58,418 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.897e+02 2.038e+02 2.302e+02 2.961e+02, threshold=4.076e+02, percent-clipped=0.0 2023-09-30 13:37:01,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 13:37:01,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:37:01,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:37:04,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:37:04,164 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-30 13:37:05,896 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-30 13:37:08,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:37:08,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 13:37:11,804 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:37:11,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:37:11,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:37:11,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:37:14,147 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.83 vs. limit=6.0 2023-09-30 13:37:17,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 13:37:17,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:37:18,000 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:37:20,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:37:22,633 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-30 13:37:24,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:37:24,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:37:24,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:37:26,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:37:28,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:37:28,498 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=728546.6666666666, ans=0.125 2023-09-30 13:37:28,581 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=728546.6666666666, ans=0.0 2023-09-30 13:37:29,839 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-30 13:37:29,901 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-30 13:37:31,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:37:31,875 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-30 13:37:31,945 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 13:37:35,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-30 13:37:39,050 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-30 13:37:40,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 13:37:40,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-30 13:37:40,821 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-30 13:37:40,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 13:37:40,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:37:42,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:37:42,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-30 13:37:42,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:37:43,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:37:44,436 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=728546.6666666666, ans=0.2 2023-09-30 13:37:45,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-30 13:37:46,983 INFO [train.py:1039] (2/4) Epoch 21, batch 3050, loss[loss=0.1678, simple_loss=0.256, pruned_loss=0.03974, over 24433.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2522, pruned_loss=0.04953, over 4735612.23 frames. ], batch size: 69, lr: 4.89e-03, grad_scale: 8.0 2023-09-30 13:37:47,258 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:37:47,449 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=728613.3333333334, ans=0.0 2023-09-30 13:37:50,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:37:51,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:37:56,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:37:59,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-30 13:38:04,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-30 13:38:06,210 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-30 13:38:06,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:38:11,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:38:16,310 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:38:16,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:38:16,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:38:19,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:38:20,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-30 13:38:20,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:38:22,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:38:22,349 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:38:23,886 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:38:25,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:38:27,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:38:27,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-30 13:38:28,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:38:28,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 13:38:33,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:38:34,708 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 13:38:34,799 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:38:34,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:38:40,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:38:41,462 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:38:43,290 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=728813.3333333334, ans=0.125 2023-09-30 13:38:48,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:38:49,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:38:49,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:38:51,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:38:52,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 13:38:52,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:38:54,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-30 13:38:55,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:38:55,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:38:57,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-30 13:38:58,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:39:06,242 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:39:07,746 INFO [train.py:1039] (2/4) Epoch 21, batch 3100, loss[loss=0.1551, simple_loss=0.2302, pruned_loss=0.04006, over 20728.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2516, pruned_loss=0.04924, over 4717153.69 frames. ], batch size: 45, lr: 4.89e-03, grad_scale: 8.0 2023-09-30 13:39:09,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:39:10,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 13:39:11,148 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=728946.6666666666, ans=0.125 2023-09-30 13:39:13,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-30 13:39:14,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-30 13:39:17,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-30 13:39:17,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:39:21,744 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:39:21,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:39:25,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-30 13:39:30,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:39:34,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-30 13:39:39,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 13:39:39,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:39:39,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:39:39,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:39:40,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-30 13:39:42,282 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.906e+02 2.121e+02 2.536e+02 4.254e+02, threshold=4.242e+02, percent-clipped=1.0 2023-09-30 13:39:43,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:39:43,888 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-30 13:39:43,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:39:45,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:39:47,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-30 13:39:48,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:39:52,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:39:54,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-30 13:39:56,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-30 13:39:56,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:39:57,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:40:01,258 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:40:01,759 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.56 vs. limit=10.0 2023-09-30 13:40:02,673 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:40:02,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:40:04,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-30 13:40:04,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:40:05,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:40:05,824 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:40:05,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:40:05,837 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 13:40:12,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:40:12,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-30 13:40:15,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:40:15,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-30 13:40:15,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:40:16,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:40:16,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-30 13:40:22,311 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=729213.3333333334, ans=0.125 2023-09-30 13:40:28,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-30 13:40:29,336 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=5.95 vs. limit=15.0 2023-09-30 13:40:30,073 INFO [train.py:1039] (2/4) Epoch 21, batch 3150, loss[loss=0.1898, simple_loss=0.2741, pruned_loss=0.05271, over 24365.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2507, pruned_loss=0.04912, over 4726481.61 frames. ], batch size: 77, lr: 4.89e-03, grad_scale: 8.0 2023-09-30 13:40:30,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:40:32,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:40:34,735 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:40:35,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:40:36,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-30 13:40:37,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:40:37,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-30 13:40:39,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-30 13:40:40,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:40:42,351 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-30 13:40:44,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-30 13:40:44,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:40:45,875 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-30 13:40:45,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-30 13:40:47,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-30 13:40:48,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-30 13:40:48,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-30 13:40:48,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:40:48,993 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:40:50,490 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:40:50,703 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-30 13:40:52,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:40:54,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:40:54,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:40:57,907 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-30 13:41:02,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-30 13:41:02,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:41:05,285 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=729413.3333333334, ans=0.125 2023-09-30 13:41:07,072 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-30 13:41:07,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:41:07,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-30 13:41:10,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-30 13:41:11,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:41:12,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 13:41:12,021 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 13:41:13,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:41:13,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 13:41:13,760 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=729413.3333333334, ans=0.025 2023-09-30 13:41:14,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-30 13:41:15,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-30 13:41:16,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-30 13:41:18,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 13:41:18,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:41:19,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:41:19,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:41:21,213 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-30 13:41:21,365 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=729480.0, ans=0.125 2023-09-30 13:41:22,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:41:24,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-30 13:41:24,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:41:26,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-30 13:41:26,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-30 13:41:28,171 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:41:29,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:41:31,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-30 13:41:31,154 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 13:41:32,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:41:35,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:41:37,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:41:37,612 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:41:42,923 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:41:43,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:41:45,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-30 13:41:50,689 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=729546.6666666666, ans=0.1 2023-09-30 13:41:51,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:41:51,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-30 13:41:52,259 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=729613.3333333334, ans=0.0 2023-09-30 13:41:53,353 INFO [train.py:1039] (2/4) Epoch 21, batch 3200, loss[loss=0.1977, simple_loss=0.2634, pruned_loss=0.06603, over 23812.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.2489, pruned_loss=0.04883, over 4712135.27 frames. ], batch size: 179, lr: 4.89e-03, grad_scale: 16.0 2023-09-30 13:41:56,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:41:58,020 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:41:58,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-30 13:42:00,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:42:06,630 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-30 13:42:10,364 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:42:18,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:42:28,690 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.936e+02 2.183e+02 2.546e+02 4.680e+02, threshold=4.365e+02, percent-clipped=1.0 2023-09-30 13:42:28,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-30 13:42:30,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:42:34,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-30 13:42:35,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 13:42:35,871 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=729746.6666666666, ans=0.1 2023-09-30 13:42:40,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:42:40,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 13:42:41,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:42:47,382 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-30 13:42:48,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-30 13:42:50,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-30 13:42:50,798 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=729813.3333333334, ans=0.125 2023-09-30 13:42:53,990 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-30 13:42:55,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-30 13:42:56,132 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 13:43:01,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:43:03,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:43:03,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:43:03,380 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-30 13:43:03,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 13:43:06,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:43:08,204 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-30 13:43:10,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-30 13:43:10,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-30 13:43:13,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-30 13:43:14,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:43:16,201 INFO [train.py:1039] (2/4) Epoch 21, batch 3250, loss[loss=0.1822, simple_loss=0.2724, pruned_loss=0.04602, over 24485.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2492, pruned_loss=0.04887, over 4711077.10 frames. ], batch size: 69, lr: 4.88e-03, grad_scale: 16.0 2023-09-30 13:43:17,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-30 13:43:17,770 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-30 13:43:17,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:43:19,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:43:20,050 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-30 13:43:20,357 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=729946.6666666666, ans=0.125 2023-09-30 13:43:23,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 13:43:23,758 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.19 vs. limit=15.0 2023-09-30 13:43:26,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:43:34,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:43:34,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-30 13:43:36,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:43:36,239 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:43:36,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:43:39,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:43:39,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 13:43:42,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:43:42,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:43:42,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:43:44,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:43:44,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:43:44,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:43:47,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:43:49,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:43:50,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:43:50,799 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:43:52,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:43:53,220 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=730080.0, ans=0.09899494936611666 2023-09-30 13:43:54,256 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:43:54,272 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:43:59,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-30 13:44:00,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:44:00,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:44:01,260 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=730080.0, ans=0.0 2023-09-30 13:44:02,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:44:02,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-30 13:44:08,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 13:44:14,655 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:44:16,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:44:16,040 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-30 13:44:16,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:44:16,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 13:44:16,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:44:16,271 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=730146.6666666666, ans=0.125 2023-09-30 13:44:16,451 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=730146.6666666666, ans=0.0 2023-09-30 13:44:19,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-30 13:44:19,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-30 13:44:20,165 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=730213.3333333334, ans=0.0 2023-09-30 13:44:21,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:44:21,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:44:22,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:44:22,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-30 13:44:24,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:44:24,663 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=730213.3333333334, ans=0.0 2023-09-30 13:44:28,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:44:28,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:44:29,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-30 13:44:29,621 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:44:33,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:44:33,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-30 13:44:37,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:44:37,171 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-30 13:44:38,545 INFO [train.py:1039] (2/4) Epoch 21, batch 3300, loss[loss=0.2224, simple_loss=0.2787, pruned_loss=0.08306, over 19182.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2497, pruned_loss=0.04909, over 4700960.75 frames. ], batch size: 388, lr: 4.88e-03, grad_scale: 16.0 2023-09-30 13:44:38,889 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-30 13:44:40,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-30 13:44:41,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:44:46,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:44:47,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:44:47,919 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:44:50,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 13:44:50,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 13:44:53,533 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=730346.6666666666, ans=0.0 2023-09-30 13:44:54,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:44:56,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:45:01,015 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-30 13:45:01,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:45:01,138 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:45:03,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:45:03,318 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-30 13:45:04,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:45:06,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 13:45:06,966 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.35 vs. limit=10.0 2023-09-30 13:45:07,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 13:45:07,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:45:09,809 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-30 13:45:13,420 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.837e+02 2.019e+02 2.374e+02 3.048e+02, threshold=4.038e+02, percent-clipped=0.0 2023-09-30 13:45:13,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:45:13,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-30 13:45:16,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:45:16,605 WARNING [train.py:1197] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-30 13:45:18,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-30 13:45:18,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:45:19,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:45:22,541 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-30 13:45:24,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-30 13:45:25,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:45:27,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-30 13:45:28,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:45:32,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-30 13:45:34,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:45:35,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:45:37,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:45:37,180 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:45:37,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-30 13:45:40,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:45:40,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:45:41,084 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=730480.0, ans=0.0 2023-09-30 13:45:42,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:45:43,852 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-30 13:45:45,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-30 13:45:45,715 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=730546.6666666666, ans=0.0 2023-09-30 13:45:47,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-30 13:45:47,160 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:45:47,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:45:49,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:45:49,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:45:50,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 13:45:51,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:45:52,404 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-30 13:45:53,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:45:54,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 13:45:56,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-30 13:45:57,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:45:58,478 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:45:59,868 INFO [train.py:1039] (2/4) Epoch 21, batch 3350, loss[loss=0.1781, simple_loss=0.2481, pruned_loss=0.05403, over 23220.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.2502, pruned_loss=0.04934, over 4703924.36 frames. ], batch size: 119, lr: 4.88e-03, grad_scale: 16.0 2023-09-30 13:46:00,825 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.37 vs. limit=15.0 2023-09-30 13:46:01,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 13:46:01,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:46:03,130 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=730613.3333333334, ans=0.0 2023-09-30 13:46:04,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:46:04,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:46:04,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:46:09,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:46:11,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:46:12,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:46:13,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:46:16,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:46:18,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:46:20,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:46:20,477 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=730680.0, ans=0.125 2023-09-30 13:46:21,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-30 13:46:23,802 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-30 13:46:23,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:46:27,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-30 13:46:27,084 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-30 13:46:28,678 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:46:28,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:46:30,450 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=730680.0, ans=0.125 2023-09-30 13:46:31,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:46:31,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-30 13:46:31,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:46:33,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:46:33,499 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=730746.6666666666, ans=0.2 2023-09-30 13:46:34,742 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:46:36,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:46:37,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:46:37,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:46:40,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:46:41,377 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=730746.6666666666, ans=0.1 2023-09-30 13:46:43,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:46:43,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:46:47,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:46:47,630 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.47 vs. limit=22.5 2023-09-30 13:46:48,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:46:51,294 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:46:51,310 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:46:52,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:46:56,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-30 13:46:56,573 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 13:46:56,625 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-30 13:46:56,687 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:46:58,349 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-30 13:46:58,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:47:00,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:47:00,392 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=730813.3333333334, ans=0.0 2023-09-30 13:47:05,295 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.47 vs. limit=6.0 2023-09-30 13:47:07,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:47:09,395 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-30 13:47:10,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 13:47:12,235 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:47:12,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:47:17,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:47:20,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-30 13:47:20,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 13:47:20,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:47:22,731 INFO [train.py:1039] (2/4) Epoch 21, batch 3400, loss[loss=0.1867, simple_loss=0.271, pruned_loss=0.05121, over 24347.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2512, pruned_loss=0.04971, over 4707082.54 frames. ], batch size: 77, lr: 4.88e-03, grad_scale: 8.0 2023-09-30 13:47:24,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:47:25,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-30 13:47:27,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:47:27,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-30 13:47:29,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:47:29,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:47:29,617 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-30 13:47:31,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:47:31,132 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-30 13:47:36,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-30 13:47:36,435 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-30 13:47:36,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:47:40,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:47:40,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 13:47:42,433 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:47:43,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-30 13:47:49,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:47:50,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-30 13:47:58,451 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.479e+02 1.830e+02 2.005e+02 2.277e+02 4.714e+02, threshold=4.010e+02, percent-clipped=1.0 2023-09-30 13:47:58,558 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:47:58,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:48:00,181 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:48:00,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-30 13:48:00,513 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=731080.0, ans=0.125 2023-09-30 13:48:07,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:48:12,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-30 13:48:16,898 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:48:16,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:48:17,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-30 13:48:17,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:48:18,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:48:19,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:48:19,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:48:20,329 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=731146.6666666666, ans=0.125 2023-09-30 13:48:23,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:48:26,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 13:48:26,877 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:48:30,684 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:48:34,206 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-30 13:48:39,087 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=731213.3333333334, ans=0.0 2023-09-30 13:48:40,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 13:48:45,282 INFO [train.py:1039] (2/4) Epoch 21, batch 3450, loss[loss=0.1518, simple_loss=0.2307, pruned_loss=0.03649, over 20688.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.2517, pruned_loss=0.05002, over 4707266.48 frames. ], batch size: 45, lr: 4.88e-03, grad_scale: 8.0 2023-09-30 13:48:45,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-30 13:48:50,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-30 13:48:50,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:48:51,708 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:48:51,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-30 13:48:53,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:48:55,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-30 13:49:00,395 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=731346.6666666666, ans=0.0 2023-09-30 13:49:01,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:49:02,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:49:03,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-30 13:49:03,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:49:05,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:49:11,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-30 13:49:18,637 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-30 13:49:20,040 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 13:49:20,115 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:49:21,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:49:26,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-30 13:49:26,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 13:49:29,719 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=731413.3333333334, ans=0.2 2023-09-30 13:49:32,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:49:32,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:49:33,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-30 13:49:36,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:49:37,066 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=731480.0, ans=0.95 2023-09-30 13:49:39,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-30 13:49:39,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:49:41,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:49:41,350 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=731480.0, ans=0.0 2023-09-30 13:49:44,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:49:45,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-30 13:49:49,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:49:54,373 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=731546.6666666666, ans=0.0 2023-09-30 13:49:55,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:49:57,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:49:58,863 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:50:03,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:50:03,516 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:50:04,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:50:05,030 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:50:07,250 INFO [train.py:1039] (2/4) Epoch 21, batch 3500, loss[loss=0.1749, simple_loss=0.2396, pruned_loss=0.05511, over 23785.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2501, pruned_loss=0.05001, over 4704372.90 frames. ], batch size: 164, lr: 4.88e-03, grad_scale: 8.0 2023-09-30 13:50:10,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:50:13,902 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:50:13,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-30 13:50:15,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 13:50:18,620 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-30 13:50:21,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:50:21,636 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-30 13:50:25,478 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=731680.0, ans=0.0 2023-09-30 13:50:28,264 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:50:29,724 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:50:29,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:50:29,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:50:29,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-30 13:50:31,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:50:31,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:50:31,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-30 13:50:34,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:50:35,933 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-30 13:50:37,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:50:41,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:50:43,333 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.924e+02 2.163e+02 2.586e+02 4.135e+02, threshold=4.325e+02, percent-clipped=1.0 2023-09-30 13:50:43,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-30 13:50:43,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:50:46,390 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:50:47,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:50:48,092 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:50:49,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:50:51,058 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:50:51,267 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-30 13:50:52,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-30 13:50:54,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-30 13:50:54,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:50:55,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:50:57,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:50:57,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 13:50:59,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 13:51:01,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:51:07,260 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:51:07,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-30 13:51:07,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-30 13:51:07,432 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:51:12,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:51:12,496 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:51:14,700 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:51:17,637 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-30 13:51:19,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:51:20,714 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:51:20,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-30 13:51:21,499 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.82 vs. limit=15.0 2023-09-30 13:51:23,820 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-30 13:51:25,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:51:25,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:51:25,708 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=731880.0, ans=0.04949747468305833 2023-09-30 13:51:26,870 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:51:26,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:51:28,356 INFO [train.py:1039] (2/4) Epoch 21, batch 3550, loss[loss=0.1546, simple_loss=0.1995, pruned_loss=0.05486, over 19190.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2481, pruned_loss=0.04889, over 4696179.99 frames. ], batch size: 389, lr: 4.88e-03, grad_scale: 8.0 2023-09-30 13:51:30,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:51:41,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:51:43,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 13:51:45,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:51:45,552 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=732013.3333333334, ans=0.125 2023-09-30 13:51:48,656 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:51:50,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:51:50,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:51:50,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 13:51:54,164 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:51:55,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:51:57,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:51:57,353 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-30 13:51:58,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 13:52:03,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:52:03,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:52:06,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:52:06,858 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:52:06,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:52:08,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-30 13:52:08,905 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:52:10,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:52:12,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 13:52:16,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:52:18,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:52:18,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:52:21,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-30 13:52:21,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:52:23,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-30 13:52:25,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:52:26,860 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.42 vs. limit=15.0 2023-09-30 13:52:27,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:52:27,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:52:30,654 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-30 13:52:31,373 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.86 vs. limit=15.0 2023-09-30 13:52:32,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:52:38,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:52:39,929 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-30 13:52:40,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:52:44,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:52:46,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-30 13:52:51,313 INFO [train.py:1039] (2/4) Epoch 21, batch 3600, loss[loss=0.1744, simple_loss=0.2409, pruned_loss=0.05393, over 21107.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2485, pruned_loss=0.04867, over 4703006.23 frames. ], batch size: 46, lr: 4.88e-03, grad_scale: 16.0 2023-09-30 13:52:51,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-30 13:52:52,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:52:53,135 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=732280.0, ans=0.125 2023-09-30 13:52:54,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:52:56,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:52:57,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:52:58,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:53:02,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:53:03,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:53:04,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:53:04,852 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=732280.0, ans=10.0 2023-09-30 13:53:05,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:53:07,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:53:07,015 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-30 13:53:10,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 13:53:11,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:53:16,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:53:19,951 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:53:21,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 13:53:21,559 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:53:21,589 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-30 13:53:23,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:53:24,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:53:26,140 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:53:27,436 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.948e+02 2.290e+02 2.674e+02 4.312e+02, threshold=4.579e+02, percent-clipped=0.0 2023-09-30 13:53:27,653 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:53:31,297 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:53:32,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:53:32,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-30 13:53:33,089 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 13:53:40,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:53:41,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 13:53:43,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-30 13:53:45,316 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=732480.0, ans=0.1 2023-09-30 13:53:46,795 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=732480.0, ans=0.2 2023-09-30 13:53:48,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:53:54,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:53:59,172 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:54:02,468 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=732546.6666666666, ans=0.05 2023-09-30 13:54:05,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-30 13:54:06,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 13:54:06,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-30 13:54:06,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-30 13:54:08,201 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-30 13:54:10,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:54:10,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:54:12,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-30 13:54:14,037 INFO [train.py:1039] (2/4) Epoch 21, batch 3650, loss[loss=0.1525, simple_loss=0.2313, pruned_loss=0.03681, over 24317.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2493, pruned_loss=0.04852, over 4713043.30 frames. ], batch size: 61, lr: 4.88e-03, grad_scale: 16.0 2023-09-30 13:54:14,139 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:54:14,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:54:14,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:54:14,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-30 13:54:15,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-30 13:54:18,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:54:20,301 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-30 13:54:20,494 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=732613.3333333334, ans=0.0 2023-09-30 13:54:23,839 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=732613.3333333334, ans=0.1 2023-09-30 13:54:25,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-30 13:54:26,722 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:54:29,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-30 13:54:31,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-30 13:54:36,461 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:54:36,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:54:36,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 13:54:38,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-30 13:54:38,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:54:40,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-30 13:54:42,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-30 13:54:42,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:54:43,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-30 13:54:45,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 13:54:45,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:54:45,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:54:48,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:54:49,177 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=732746.6666666666, ans=0.0 2023-09-30 13:54:50,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-30 13:54:52,030 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-30 13:54:52,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:54:53,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-30 13:54:55,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:54:55,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-30 13:55:01,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:55:03,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:55:03,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-30 13:55:06,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:55:06,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:55:09,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:55:12,797 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:55:12,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:55:12,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:55:16,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 13:55:16,551 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:55:16,642 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:55:23,790 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-30 13:55:26,915 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:55:26,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:55:27,076 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-30 13:55:28,495 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:55:29,368 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1.whitening_limit, batch_count=732880.0, ans=10.0 2023-09-30 13:55:29,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:55:31,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:55:34,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-30 13:55:34,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:55:35,780 INFO [train.py:1039] (2/4) Epoch 21, batch 3700, loss[loss=0.1907, simple_loss=0.2779, pruned_loss=0.05172, over 24550.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2502, pruned_loss=0.04914, over 4720655.00 frames. ], batch size: 71, lr: 4.87e-03, grad_scale: 16.0 2023-09-30 13:55:37,596 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 13:55:41,171 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:55:41,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:55:44,299 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:55:44,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-30 13:55:44,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:55:45,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 13:55:45,865 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 13:55:49,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 13:55:54,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:55:55,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:55:56,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:55:56,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:55:57,652 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 13:56:00,745 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:56:00,909 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-30 13:56:08,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:56:08,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 13:56:10,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 13:56:10,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-30 13:56:10,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:56:11,740 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.508e+02 1.839e+02 1.995e+02 2.258e+02 3.801e+02, threshold=3.991e+02, percent-clipped=0.0 2023-09-30 13:56:15,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:56:16,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-30 13:56:18,641 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:56:18,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:56:22,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:56:22,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 13:56:25,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 13:56:29,771 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:56:29,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-30 13:56:29,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:56:29,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-30 13:56:31,851 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=733146.6666666666, ans=0.125 2023-09-30 13:56:36,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:56:37,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-30 13:56:39,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:56:39,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-30 13:56:42,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:56:42,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-30 13:56:43,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:56:43,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:56:44,146 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=733213.3333333334, ans=0.125 2023-09-30 13:56:47,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:56:47,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-30 13:56:48,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-30 13:56:50,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:56:50,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:56:51,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:56:51,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:56:52,089 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=733213.3333333334, ans=0.125 2023-09-30 13:56:54,199 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=733213.3333333334, ans=0.125 2023-09-30 13:56:57,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:56:59,030 INFO [train.py:1039] (2/4) Epoch 21, batch 3750, loss[loss=0.2322, simple_loss=0.2947, pruned_loss=0.08487, over 19592.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2521, pruned_loss=0.04957, over 4723066.79 frames. ], batch size: 388, lr: 4.87e-03, grad_scale: 16.0 2023-09-30 13:56:59,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:56:59,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:57:02,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-30 13:57:03,285 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=733280.0, ans=0.125 2023-09-30 13:57:04,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 13:57:07,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-30 13:57:07,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-30 13:57:08,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:57:09,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:57:10,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:57:12,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:57:15,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:57:18,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:57:18,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 13:57:18,767 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=733346.6666666666, ans=0.125 2023-09-30 13:57:21,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:57:24,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:57:24,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-30 13:57:24,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:57:27,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:57:27,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:57:27,517 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=733346.6666666666, ans=0.2 2023-09-30 13:57:32,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-30 13:57:35,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-30 13:57:37,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:57:38,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:57:40,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:57:44,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:57:46,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-30 13:57:50,119 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=733480.0, ans=0.2 2023-09-30 13:57:52,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-30 13:57:55,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:57:56,326 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=733480.0, ans=0.0 2023-09-30 13:57:59,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:57:59,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:58:02,945 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=733546.6666666666, ans=0.125 2023-09-30 13:58:04,301 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:58:06,583 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten.whitening_limit, batch_count=733546.6666666666, ans=22.5 2023-09-30 13:58:08,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 13:58:09,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-30 13:58:11,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 13:58:13,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:58:15,019 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=733546.6666666666, ans=0.0 2023-09-30 13:58:17,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-30 13:58:20,406 INFO [train.py:1039] (2/4) Epoch 21, batch 3800, loss[loss=0.1847, simple_loss=0.2709, pruned_loss=0.04923, over 24694.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2521, pruned_loss=0.04927, over 4736194.61 frames. ], batch size: 73, lr: 4.87e-03, grad_scale: 8.0 2023-09-30 13:58:25,160 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-30 13:58:28,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:58:29,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 13:58:31,326 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-30 13:58:32,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:58:36,826 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:58:36,951 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-30 13:58:40,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 13:58:40,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:58:40,655 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 13:58:42,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:58:43,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:58:43,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:58:43,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-30 13:58:47,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-30 13:58:48,962 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:58:50,791 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=733680.0, ans=0.125 2023-09-30 13:58:53,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:58:55,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:58:56,640 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 13:58:58,104 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 1.864e+02 2.186e+02 2.612e+02 3.955e+02, threshold=4.372e+02, percent-clipped=0.0 2023-09-30 13:58:58,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-30 13:58:58,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:58:59,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:59:01,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:59:06,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 13:59:06,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-30 13:59:06,270 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=733746.6666666666, ans=0.125 2023-09-30 13:59:07,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:59:15,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:59:21,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:59:22,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-30 13:59:24,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-30 13:59:25,660 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:59:25,874 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=733880.0, ans=0.125 2023-09-30 13:59:28,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:59:30,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:59:31,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-30 13:59:34,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-30 13:59:34,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-30 13:59:34,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:59:36,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:59:41,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:59:41,133 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 13:59:43,081 INFO [train.py:1039] (2/4) Epoch 21, batch 3850, loss[loss=0.1809, simple_loss=0.2689, pruned_loss=0.04649, over 24442.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2511, pruned_loss=0.04915, over 4731041.25 frames. ], batch size: 69, lr: 4.87e-03, grad_scale: 8.0 2023-09-30 13:59:48,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:59:50,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-30 13:59:50,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 13:59:52,287 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:59:55,369 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 13:59:59,673 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:00:01,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-30 14:00:01,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-30 14:00:03,272 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=734013.3333333334, ans=0.0 2023-09-30 14:00:09,083 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:00:10,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:00:14,227 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=734080.0, ans=10.0 2023-09-30 14:00:15,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:00:15,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:00:18,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:00:19,088 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:00:21,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:00:21,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:00:23,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:00:24,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:00:26,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:00:26,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:00:26,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-30 14:00:26,457 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-30 14:00:27,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:00:27,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:00:31,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:00:31,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:00:31,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-30 14:00:34,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-30 14:00:36,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:00:37,308 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.50 vs. limit=22.5 2023-09-30 14:00:39,307 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-30 14:00:40,245 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.30 vs. limit=6.0 2023-09-30 14:00:40,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-30 14:00:46,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:00:47,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:00:50,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:00:52,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-30 14:00:54,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-30 14:00:59,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:00:59,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:01:03,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 14:01:03,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 14:01:03,246 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:01:03,960 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.45 vs. limit=15.0 2023-09-30 14:01:04,716 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:01:04,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:01:04,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-30 14:01:04,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:01:06,259 INFO [train.py:1039] (2/4) Epoch 21, batch 3900, loss[loss=0.1795, simple_loss=0.264, pruned_loss=0.04749, over 24643.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2503, pruned_loss=0.04874, over 4725099.60 frames. ], batch size: 68, lr: 4.87e-03, grad_scale: 8.0 2023-09-30 14:01:07,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-30 14:01:07,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:01:07,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:01:09,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:01:09,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:01:11,063 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:01:11,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:01:11,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:01:12,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:01:12,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-30 14:01:12,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:01:14,424 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:01:15,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 14:01:16,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:01:17,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:01:22,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 14:01:22,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:01:22,621 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=734346.6666666666, ans=0.0 2023-09-30 14:01:23,975 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-30 14:01:26,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-30 14:01:27,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:01:28,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-30 14:01:28,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:01:29,723 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=734346.6666666666, ans=0.125 2023-09-30 14:01:31,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-30 14:01:31,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-30 14:01:38,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:01:39,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:01:39,633 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:01:41,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:01:44,191 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.861e+02 2.057e+02 2.419e+02 3.679e+02, threshold=4.115e+02, percent-clipped=0.0 2023-09-30 14:01:44,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:01:47,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:01:48,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:01:48,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:01:50,452 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:01:50,676 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=734413.3333333334, ans=0.0 2023-09-30 14:01:57,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:01:57,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:01:57,771 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=734480.0, ans=0.125 2023-09-30 14:02:04,913 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=734480.0, ans=0.125 2023-09-30 14:02:05,493 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.05 vs. limit=15.0 2023-09-30 14:02:06,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 14:02:06,664 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:02:17,967 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:02:19,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:02:21,223 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-30 14:02:21,293 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-30 14:02:22,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:02:22,906 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-30 14:02:24,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:02:25,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-30 14:02:28,978 INFO [train.py:1039] (2/4) Epoch 21, batch 3950, loss[loss=0.1708, simple_loss=0.2572, pruned_loss=0.04222, over 24291.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.25, pruned_loss=0.04833, over 4722968.84 frames. ], batch size: 74, lr: 4.87e-03, grad_scale: 8.0 2023-09-30 14:02:29,375 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:02:31,417 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=734613.3333333334, ans=0.125 2023-09-30 14:02:32,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:02:34,267 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-30 14:02:34,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:02:38,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:02:41,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:02:45,364 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-30 14:02:46,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:02:46,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-30 14:02:47,665 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=734680.0, ans=0.05 2023-09-30 14:02:48,837 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-30 14:02:48,877 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:02:51,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:02:51,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:02:51,924 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:02:55,060 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-30 14:02:56,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:02:58,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:02:58,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:02:58,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:02:59,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:03:11,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:03:11,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:03:16,725 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-30 14:03:16,896 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=734746.6666666666, ans=0.125 2023-09-30 14:03:24,784 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-30 14:03:24,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-30 14:03:24,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:03:27,771 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:03:35,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-30 14:03:35,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-30 14:03:37,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:03:37,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:03:37,550 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-30 14:03:40,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:03:42,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:03:47,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-30 14:03:52,526 INFO [train.py:1039] (2/4) Epoch 21, batch 4000, loss[loss=0.1763, simple_loss=0.2517, pruned_loss=0.0504, over 23678.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2502, pruned_loss=0.04858, over 4708863.03 frames. ], batch size: 149, lr: 4.87e-03, grad_scale: 16.0 2023-09-30 14:03:53,650 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.74 vs. limit=10.0 2023-09-30 14:03:56,609 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=734946.6666666666, ans=0.125 2023-09-30 14:03:59,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:04:04,299 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=734946.6666666666, ans=0.0 2023-09-30 14:04:05,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:04:12,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:04:12,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:04:13,530 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:04:13,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-30 14:04:15,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-30 14:04:15,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-30 14:04:15,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:04:15,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-30 14:04:18,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:04:21,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:04:21,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:04:21,394 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:04:21,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:04:21,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-30 14:04:24,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:04:26,328 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-30 14:04:28,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:04:28,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:04:29,857 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.867e+02 2.046e+02 2.259e+02 3.289e+02, threshold=4.093e+02, percent-clipped=0.0 2023-09-30 14:04:32,340 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-30 14:04:33,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 14:04:33,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:04:38,479 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-30 14:04:40,031 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:04:41,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:04:43,149 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-30 14:04:44,581 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:04:44,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-30 14:04:44,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:04:46,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:04:48,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:04:49,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:04:49,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:04:49,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:04:51,557 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=735146.6666666666, ans=0.1 2023-09-30 14:04:52,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-30 14:04:52,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:04:55,977 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-30 14:04:59,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:05:03,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 14:05:06,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 14:05:07,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:05:08,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:05:09,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:05:14,503 INFO [train.py:1039] (2/4) Epoch 21, batch 4050, loss[loss=0.1687, simple_loss=0.2399, pruned_loss=0.04871, over 23288.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2503, pruned_loss=0.04844, over 4724020.62 frames. ], batch size: 119, lr: 4.87e-03, grad_scale: 8.0 2023-09-30 14:05:16,127 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:05:17,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-30 14:05:19,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-30 14:05:20,779 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 14:05:22,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:05:22,906 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:05:24,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-30 14:05:26,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:05:29,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:05:32,215 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:05:33,694 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 14:05:35,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:05:35,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:05:40,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:05:43,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-30 14:05:46,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 14:05:46,698 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:05:47,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-30 14:05:48,002 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-30 14:05:49,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:05:57,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-30 14:05:57,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:06:00,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:06:03,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:06:05,468 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:06:05,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:06:08,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:06:14,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-30 14:06:14,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 14:06:16,415 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:06:17,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-30 14:06:21,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:06:28,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-30 14:06:30,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:06:30,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 14:06:31,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-30 14:06:31,994 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-30 14:06:31,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:06:35,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:06:36,982 INFO [train.py:1039] (2/4) Epoch 21, batch 4100, loss[loss=0.1596, simple_loss=0.2374, pruned_loss=0.04088, over 24656.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2509, pruned_loss=0.04908, over 4718411.91 frames. ], batch size: 65, lr: 4.87e-03, grad_scale: 8.0 2023-09-30 14:06:37,068 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:06:37,113 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:06:44,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-30 14:06:47,824 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-30 14:06:48,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-30 14:06:49,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-30 14:06:49,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:06:51,456 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:06:51,505 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:06:52,957 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:06:53,090 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-30 14:06:53,252 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=735680.0, ans=0.1 2023-09-30 14:06:57,586 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:06:57,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:06:57,741 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:06:57,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:07:01,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 14:07:02,865 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:07:04,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:07:04,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-30 14:07:04,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:07:04,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:07:06,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:07:06,449 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:07:07,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-30 14:07:09,601 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:07:11,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-30 14:07:12,759 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:07:12,927 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=735746.6666666666, ans=0.2 2023-09-30 14:07:16,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:07:16,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-30 14:07:18,229 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.830e+02 1.986e+02 2.444e+02 3.912e+02, threshold=3.973e+02, percent-clipped=0.0 2023-09-30 14:07:19,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:07:19,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:07:21,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:07:22,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-30 14:07:24,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:07:26,574 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:07:29,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-30 14:07:29,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:07:29,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-30 14:07:34,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:07:36,583 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.00 vs. limit=12.0 2023-09-30 14:07:40,275 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:07:43,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:07:45,422 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:07:52,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:07:52,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:07:56,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:07:56,372 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=735880.0, ans=0.1 2023-09-30 14:07:57,665 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:08:00,530 INFO [train.py:1039] (2/4) Epoch 21, batch 4150, loss[loss=0.1883, simple_loss=0.262, pruned_loss=0.05728, over 23347.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.2514, pruned_loss=0.04936, over 4714038.61 frames. ], batch size: 93, lr: 4.86e-03, grad_scale: 8.0 2023-09-30 14:08:02,834 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-30 14:08:04,312 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 14:08:05,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:08:05,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:08:08,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-30 14:08:08,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:08:10,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-30 14:08:10,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-30 14:08:11,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-30 14:08:12,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:08:17,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:08:17,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:08:21,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:08:22,095 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:08:23,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-30 14:08:23,667 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=736013.3333333334, ans=0.05 2023-09-30 14:08:25,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 14:08:25,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:08:27,293 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-30 14:08:30,740 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:08:31,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:08:35,659 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:08:38,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-30 14:08:40,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-30 14:08:40,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:08:42,319 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.13 vs. limit=22.5 2023-09-30 14:08:42,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-30 14:08:42,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:08:42,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:08:44,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:08:46,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:08:50,553 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.20 vs. limit=15.0 2023-09-30 14:08:51,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-30 14:08:54,469 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-30 14:08:55,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:08:57,398 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-30 14:08:57,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:08:59,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-30 14:09:02,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:09:02,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:09:04,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:09:05,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-30 14:09:05,770 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:09:05,773 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-30 14:09:08,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 14:09:10,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-30 14:09:11,073 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:09:11,080 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 14:09:11,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 14:09:12,493 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-30 14:09:12,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:09:12,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 14:09:14,064 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:09:14,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:09:14,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-30 14:09:14,547 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=736213.3333333334, ans=0.125 2023-09-30 14:09:15,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-30 14:09:20,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:09:22,186 INFO [train.py:1039] (2/4) Epoch 21, batch 4200, loss[loss=0.1592, simple_loss=0.2093, pruned_loss=0.05453, over 19542.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2508, pruned_loss=0.04911, over 4711073.46 frames. ], batch size: 388, lr: 4.86e-03, grad_scale: 8.0 2023-09-30 14:09:22,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-30 14:09:22,682 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=736280.0, ans=0.1 2023-09-30 14:09:23,999 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 14:09:26,999 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:09:28,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:09:28,640 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:09:28,643 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:09:32,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-30 14:09:35,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-30 14:09:37,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:09:38,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:09:42,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:09:45,568 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=736346.6666666666, ans=0.125 2023-09-30 14:09:46,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-30 14:09:46,958 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:09:47,009 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:09:48,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-30 14:09:48,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:09:50,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:09:51,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:09:51,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 14:09:53,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:09:54,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-30 14:09:54,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:10:00,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-30 14:10:00,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:10:01,430 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.517e+02 1.766e+02 1.995e+02 2.283e+02 3.415e+02, threshold=3.990e+02, percent-clipped=0.0 2023-09-30 14:10:01,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:10:04,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:10:06,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:10:06,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-30 14:10:07,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:10:09,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:10:11,226 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=736480.0, ans=0.2 2023-09-30 14:10:12,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:10:14,778 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:10:22,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-30 14:10:23,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-30 14:10:26,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:10:27,113 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=736546.6666666666, ans=0.125 2023-09-30 14:10:29,686 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=9.23 vs. limit=15.0 2023-09-30 14:10:33,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 14:10:33,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:10:34,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-30 14:10:41,676 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-30 14:10:45,489 INFO [train.py:1039] (2/4) Epoch 21, batch 4250, loss[loss=0.1556, simple_loss=0.24, pruned_loss=0.03556, over 24435.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2499, pruned_loss=0.04891, over 4716064.78 frames. ], batch size: 63, lr: 4.86e-03, grad_scale: 8.0 2023-09-30 14:10:47,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:10:47,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-30 14:10:50,017 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.54 vs. limit=15.0 2023-09-30 14:10:51,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:10:56,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:10:56,693 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-30 14:10:56,750 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:11:01,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:11:06,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:11:09,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:11:09,400 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:11:12,333 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:11:12,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:11:13,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:11:16,088 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:11:17,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:11:19,224 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:11:19,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:11:19,667 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:11:21,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-30 14:11:25,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-30 14:11:25,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:11:25,309 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:11:26,756 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:11:26,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:11:26,903 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:11:26,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:11:27,349 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=736746.6666666666, ans=0.1 2023-09-30 14:11:30,097 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=736746.6666666666, ans=0.125 2023-09-30 14:11:31,339 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-30 14:11:32,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:11:33,231 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=736813.3333333334, ans=0.0 2023-09-30 14:11:37,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:11:39,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:11:40,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-30 14:11:41,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:11:41,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-30 14:11:42,622 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-30 14:11:44,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:11:45,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:11:45,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:11:49,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-30 14:11:51,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 14:11:52,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-30 14:11:57,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:11:59,665 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=736880.0, ans=0.125 2023-09-30 14:12:00,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:12:02,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:12:02,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:12:03,988 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:12:05,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:12:06,787 INFO [train.py:1039] (2/4) Epoch 21, batch 4300, loss[loss=0.1779, simple_loss=0.2581, pruned_loss=0.04888, over 24022.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.249, pruned_loss=0.04884, over 4709862.83 frames. ], batch size: 80, lr: 4.86e-03, grad_scale: 8.0 2023-09-30 14:12:06,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:12:06,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-30 14:12:09,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:12:12,419 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=736946.6666666666, ans=0.2 2023-09-30 14:12:13,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:12:15,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:12:19,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:12:29,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:12:29,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-30 14:12:29,864 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:12:33,424 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:12:33,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:12:33,493 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-30 14:12:36,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 14:12:38,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 14:12:41,217 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-30 14:12:41,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 14:12:42,580 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-30 14:12:44,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 14:12:44,482 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=737080.0, ans=0.1 2023-09-30 14:12:45,600 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.504e+02 1.887e+02 2.170e+02 2.542e+02 3.657e+02, threshold=4.340e+02, percent-clipped=0.0 2023-09-30 14:12:45,823 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:12:49,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:12:49,438 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:12:51,013 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:12:52,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:12:52,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:12:52,791 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=737080.0, ans=0.125 2023-09-30 14:12:54,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-30 14:12:54,326 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-30 14:12:57,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:12:59,250 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=737146.6666666666, ans=0.1 2023-09-30 14:13:01,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:13:01,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 14:13:01,074 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:13:01,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:13:01,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-30 14:13:02,504 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-30 14:13:02,610 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-30 14:13:04,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:13:04,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-30 14:13:06,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-30 14:13:06,957 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=737146.6666666666, ans=0.125 2023-09-30 14:13:09,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:13:11,273 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-30 14:13:11,355 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:13:14,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:13:14,396 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:13:15,947 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-30 14:13:17,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 14:13:17,443 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:13:19,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:13:19,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:13:20,983 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:13:22,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:13:25,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:13:25,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:13:25,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:13:28,545 INFO [train.py:1039] (2/4) Epoch 21, batch 4350, loss[loss=0.1917, simple_loss=0.2599, pruned_loss=0.06175, over 23577.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.2505, pruned_loss=0.04914, over 4724529.33 frames. ], batch size: 232, lr: 4.86e-03, grad_scale: 8.0 2023-09-30 14:13:31,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-30 14:13:31,906 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-30 14:13:37,199 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:13:41,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:13:44,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:13:44,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:13:49,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:13:54,472 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:13:54,889 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=737346.6666666666, ans=0.125 2023-09-30 14:13:56,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:13:57,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:14:00,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:14:02,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:14:02,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:14:07,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-30 14:14:08,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:14:08,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:14:13,079 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=737413.3333333334, ans=0.125 2023-09-30 14:14:15,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:14:18,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-30 14:14:22,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:14:23,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 14:14:28,127 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-30 14:14:31,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:14:31,799 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-30 14:14:33,240 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-30 14:14:33,338 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-30 14:14:33,347 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:14:34,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:14:35,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:14:36,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:14:38,072 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:14:38,128 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:14:41,233 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-30 14:14:41,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:14:41,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:14:41,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:14:42,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-30 14:14:44,312 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-30 14:14:44,319 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-30 14:14:44,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-30 14:14:46,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:14:48,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:14:48,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:14:48,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:14:49,841 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=737546.6666666666, ans=0.05 2023-09-30 14:14:52,187 INFO [train.py:1039] (2/4) Epoch 21, batch 4400, loss[loss=0.1793, simple_loss=0.2496, pruned_loss=0.05451, over 23800.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2513, pruned_loss=0.04967, over 4718463.22 frames. ], batch size: 150, lr: 4.86e-03, grad_scale: 16.0 2023-09-30 14:14:52,239 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-30 14:14:53,707 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-30 14:14:53,718 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:14:58,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:14:58,567 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:15:00,102 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:15:01,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-30 14:15:01,856 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-30 14:15:03,251 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-30 14:15:03,284 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-30 14:15:03,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 14:15:04,180 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.86 vs. limit=15.0 2023-09-30 14:15:05,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:15:07,039 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-30 14:15:08,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:15:10,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:15:10,216 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-30 14:15:14,751 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:15:14,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-30 14:15:14,830 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-30 14:15:18,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-30 14:15:19,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-30 14:15:19,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-30 14:15:20,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:15:20,158 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:15:21,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:15:21,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:15:23,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-30 14:15:23,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-30 14:15:24,683 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=737746.6666666666, ans=0.0 2023-09-30 14:15:25,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:15:27,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:15:27,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:15:30,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:15:30,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:15:30,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-30 14:15:30,458 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=737746.6666666666, ans=0.125 2023-09-30 14:15:31,466 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.966e+02 2.237e+02 2.534e+02 3.532e+02, threshold=4.474e+02, percent-clipped=0.0 2023-09-30 14:15:31,651 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-30 14:15:34,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:15:36,546 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=737746.6666666666, ans=0.0 2023-09-30 14:15:36,579 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=737746.6666666666, ans=0.125 2023-09-30 14:15:38,781 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=737746.6666666666, ans=0.1 2023-09-30 14:15:38,873 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=737746.6666666666, ans=0.125 2023-09-30 14:15:43,118 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:15:44,733 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-30 14:15:46,677 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=737813.3333333334, ans=0.025 2023-09-30 14:15:48,656 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.96 vs. limit=22.5 2023-09-30 14:15:49,317 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:15:50,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:15:56,249 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:15:56,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-30 14:15:58,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:15:58,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:15:58,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:15:58,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-30 14:16:02,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-30 14:16:05,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-30 14:16:07,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=737880.0, ans=0.125 2023-09-30 14:16:08,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-30 14:16:08,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:16:08,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-30 14:16:08,548 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:16:11,727 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:16:13,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-30 14:16:15,282 INFO [train.py:1039] (2/4) Epoch 21, batch 4450, loss[loss=0.1513, simple_loss=0.2261, pruned_loss=0.03828, over 17161.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2522, pruned_loss=0.04949, over 4725759.04 frames. ], batch size: 36, lr: 4.86e-03, grad_scale: 16.0 2023-09-30 14:16:17,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:16:18,926 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=737946.6666666666, ans=0.125 2023-09-30 14:16:20,105 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:16:20,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:16:25,143 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:16:25,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:16:30,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:16:33,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:16:34,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:16:35,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:16:38,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-30 14:16:38,508 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:16:38,763 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=738013.3333333334, ans=0.125 2023-09-30 14:16:39,997 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:16:40,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:16:40,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:16:41,722 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 14:16:48,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:16:48,396 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:16:49,912 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:16:50,160 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=738080.0, ans=0.125 2023-09-30 14:16:51,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:16:53,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:16:57,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 14:16:58,875 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-30 14:17:00,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-30 14:17:00,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:17:02,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:17:03,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-30 14:17:07,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-30 14:17:10,776 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:17:12,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-30 14:17:14,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:17:14,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:17:14,213 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:17:14,225 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:17:14,499 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=738146.6666666666, ans=0.0 2023-09-30 14:17:15,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:17:20,928 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-30 14:17:21,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-30 14:17:22,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 14:17:24,281 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:17:25,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:17:27,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:17:27,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 14:17:30,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-30 14:17:32,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-30 14:17:33,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:17:37,158 INFO [train.py:1039] (2/4) Epoch 21, batch 4500, loss[loss=0.1851, simple_loss=0.2582, pruned_loss=0.05605, over 23273.00 frames. ], tot_loss[loss=0.1762, simple_loss=0.2531, pruned_loss=0.04969, over 4720877.39 frames. ], batch size: 105, lr: 4.86e-03, grad_scale: 8.0 2023-09-30 14:17:39,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:17:41,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-30 14:17:41,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-30 14:17:41,408 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=738280.0, ans=0.0 2023-09-30 14:17:43,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:17:48,930 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:17:50,448 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:17:52,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 14:17:52,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:17:53,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:17:54,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:17:57,650 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=738346.6666666666, ans=0.07 2023-09-30 14:18:05,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:18:06,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:18:09,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:18:10,916 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:18:11,061 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:18:16,928 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 14:18:18,455 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.886e+02 2.149e+02 2.495e+02 4.486e+02, threshold=4.299e+02, percent-clipped=1.0 2023-09-30 14:18:20,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:18:25,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 14:18:26,514 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.90 vs. limit=15.0 2023-09-30 14:18:28,976 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:18:30,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-30 14:18:31,969 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:18:32,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:18:34,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:18:34,913 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:18:35,291 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=738480.0, ans=0.0 2023-09-30 14:18:36,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:18:36,654 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-30 14:18:36,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 14:18:36,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:18:41,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:18:41,220 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:18:45,108 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:18:47,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:18:47,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:18:48,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-30 14:18:52,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-30 14:18:52,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-30 14:18:52,707 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.97 vs. limit=22.5 2023-09-30 14:18:55,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-30 14:18:59,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-30 14:19:00,977 INFO [train.py:1039] (2/4) Epoch 21, batch 4550, loss[loss=0.1572, simple_loss=0.2159, pruned_loss=0.04928, over 22713.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2522, pruned_loss=0.04956, over 4722988.71 frames. ], batch size: 322, lr: 4.86e-03, grad_scale: 8.0 2023-09-30 14:19:01,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:19:05,659 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:19:05,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:19:08,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:19:13,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:19:13,851 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=738613.3333333334, ans=0.125 2023-09-30 14:19:14,047 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=738613.3333333334, ans=0.125 2023-09-30 14:19:15,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:19:16,932 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:19:16,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:19:16,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:19:21,937 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:19:21,996 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:19:25,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:19:28,160 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-30 14:19:28,251 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-30 14:19:29,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:19:32,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-30 14:19:37,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-30 14:19:37,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:19:38,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-30 14:19:40,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 14:19:40,910 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=738746.6666666666, ans=0.125 2023-09-30 14:19:43,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:19:43,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:19:43,583 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-30 14:19:46,527 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-30 14:19:48,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:19:49,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:19:51,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:19:52,102 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=738813.3333333334, ans=0.125 2023-09-30 14:19:53,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:19:56,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-30 14:19:57,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-30 14:19:57,055 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:19:58,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-30 14:19:58,651 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=738813.3333333334, ans=0.0 2023-09-30 14:20:00,082 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-30 14:20:01,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:20:01,604 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:20:01,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:20:03,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:20:03,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:20:05,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 14:20:06,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-30 14:20:08,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:20:08,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 14:20:08,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-30 14:20:08,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:20:08,821 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=738880.0, ans=0.0 2023-09-30 14:20:09,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-30 14:20:13,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 14:20:13,071 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:20:16,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:20:17,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:20:17,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-30 14:20:19,171 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:20:20,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-30 14:20:20,934 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=738946.6666666666, ans=0.035 2023-09-30 14:20:22,176 INFO [train.py:1039] (2/4) Epoch 21, batch 4600, loss[loss=0.1746, simple_loss=0.2438, pruned_loss=0.05271, over 23816.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2503, pruned_loss=0.04915, over 4716471.98 frames. ], batch size: 164, lr: 4.85e-03, grad_scale: 8.0 2023-09-30 14:20:23,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:20:23,924 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:20:27,491 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:20:27,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:20:29,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:20:31,190 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-30 14:20:34,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:20:35,995 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=738946.6666666666, ans=0.1 2023-09-30 14:20:37,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:20:37,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:20:39,193 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=739013.3333333334, ans=0.0 2023-09-30 14:20:40,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:20:47,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-30 14:20:48,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:20:50,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:20:52,853 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.78 vs. limit=22.5 2023-09-30 14:20:55,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:20:55,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:21:00,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-30 14:21:00,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 14:21:02,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:21:03,435 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.612e+02 1.897e+02 2.077e+02 2.452e+02 3.334e+02, threshold=4.153e+02, percent-clipped=0.0 2023-09-30 14:21:08,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:21:08,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:21:10,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:21:17,122 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-30 14:21:18,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-30 14:21:21,222 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.99 vs. limit=15.0 2023-09-30 14:21:21,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:21:23,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:21:26,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:21:26,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 14:21:26,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:21:28,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-30 14:21:28,062 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:21:28,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:21:29,679 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:21:31,069 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:21:31,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:21:32,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-30 14:21:32,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-30 14:21:34,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-30 14:21:34,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:21:34,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:21:36,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:21:37,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:21:39,774 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=739213.3333333334, ans=0.125 2023-09-30 14:21:44,503 INFO [train.py:1039] (2/4) Epoch 21, batch 4650, loss[loss=0.161, simple_loss=0.235, pruned_loss=0.04348, over 23562.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2498, pruned_loss=0.04878, over 4716180.51 frames. ], batch size: 134, lr: 4.85e-03, grad_scale: 8.0 2023-09-30 14:21:48,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:21:51,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:21:51,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:21:52,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:21:52,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:21:52,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:21:52,968 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:21:57,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-30 14:22:00,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:22:00,685 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=739346.6666666666, ans=0.125 2023-09-30 14:22:02,114 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-30 14:22:02,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:22:03,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-30 14:22:03,691 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:22:05,170 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-30 14:22:05,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-30 14:22:05,222 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:22:05,321 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:22:10,414 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 14:22:11,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:22:11,977 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-30 14:22:14,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:22:15,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-30 14:22:19,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:22:19,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:22:19,591 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-30 14:22:22,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:22:25,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:22:25,939 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=739413.3333333334, ans=0.0 2023-09-30 14:22:28,900 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:22:35,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:22:38,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:22:39,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:22:41,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:22:44,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-30 14:22:44,844 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-30 14:22:46,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 14:22:46,333 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-30 14:22:46,584 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=739480.0, ans=0.035 2023-09-30 14:22:47,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:22:55,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-30 14:22:55,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:22:56,694 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-30 14:22:56,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:22:58,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:22:58,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:22:59,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:23:00,334 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=739546.6666666666, ans=0.125 2023-09-30 14:23:03,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:23:03,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:23:05,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:23:06,496 INFO [train.py:1039] (2/4) Epoch 21, batch 4700, loss[loss=0.1763, simple_loss=0.2481, pruned_loss=0.05226, over 23405.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2503, pruned_loss=0.04912, over 4719034.17 frames. ], batch size: 134, lr: 4.85e-03, grad_scale: 8.0 2023-09-30 14:23:08,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:23:09,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:23:09,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 14:23:09,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-30 14:23:10,018 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=739613.3333333334, ans=0.125 2023-09-30 14:23:11,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-30 14:23:12,880 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-30 14:23:16,766 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.26 vs. limit=15.0 2023-09-30 14:23:19,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:23:21,107 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:23:21,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:23:22,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:23:24,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 14:23:26,216 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.35 vs. limit=15.0 2023-09-30 14:23:30,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-30 14:23:30,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-30 14:23:35,009 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:23:36,497 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:23:36,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:23:39,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:23:40,091 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=739746.6666666666, ans=0.125 2023-09-30 14:23:46,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 14:23:46,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 14:23:46,449 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=739746.6666666666, ans=0.125 2023-09-30 14:23:47,469 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.530e+02 1.882e+02 2.021e+02 2.237e+02 3.621e+02, threshold=4.043e+02, percent-clipped=0.0 2023-09-30 14:23:49,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:23:56,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-30 14:23:56,166 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:23:57,895 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=739813.3333333334, ans=0.2 2023-09-30 14:23:57,907 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=739813.3333333334, ans=0.0 2023-09-30 14:23:59,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:24:04,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-30 14:24:05,904 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:24:11,102 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:24:12,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-30 14:24:14,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:24:14,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:24:15,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:24:17,300 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:24:17,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-30 14:24:18,898 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-30 14:24:20,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:24:20,769 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=739880.0, ans=0.125 2023-09-30 14:24:21,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:24:21,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:24:21,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-30 14:24:23,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:24:28,522 INFO [train.py:1039] (2/4) Epoch 21, batch 4750, loss[loss=0.1725, simple_loss=0.2573, pruned_loss=0.04381, over 24476.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2505, pruned_loss=0.04881, over 4727693.71 frames. ], batch size: 63, lr: 4.85e-03, grad_scale: 8.0 2023-09-30 14:24:28,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-30 14:24:30,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:24:30,506 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=739946.6666666666, ans=0.125 2023-09-30 14:24:30,618 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=739946.6666666666, ans=0.0 2023-09-30 14:24:31,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:24:35,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:24:35,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:24:37,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-30 14:24:37,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:24:43,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-30 14:24:44,754 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=740013.3333333334, ans=0.125 2023-09-30 14:24:45,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:24:45,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:24:47,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:24:50,819 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=740013.3333333334, ans=0.0 2023-09-30 14:24:52,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-30 14:24:58,134 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:24:59,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-30 14:24:59,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:25:05,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:25:05,011 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:25:05,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:25:06,530 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-30 14:25:06,534 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-30 14:25:06,836 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=740080.0, ans=0.0 2023-09-30 14:25:12,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-30 14:25:14,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:25:18,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:25:18,832 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=740146.6666666666, ans=0.125 2023-09-30 14:25:20,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 14:25:20,678 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-30 14:25:20,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:25:23,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-30 14:25:25,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:25:26,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-30 14:25:26,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-30 14:25:28,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:25:28,329 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:25:28,566 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=740146.6666666666, ans=0.1 2023-09-30 14:25:29,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:25:29,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 14:25:29,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-30 14:25:33,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-30 14:25:36,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:25:38,614 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:25:39,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-30 14:25:40,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:25:40,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:25:41,771 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:25:43,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:25:44,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 14:25:46,394 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:25:46,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-30 14:25:47,385 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=740213.3333333334, ans=0.1 2023-09-30 14:25:48,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-30 14:25:49,975 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-30 14:25:50,351 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=740280.0, ans=0.0 2023-09-30 14:25:51,997 INFO [train.py:1039] (2/4) Epoch 21, batch 4800, loss[loss=0.1592, simple_loss=0.2401, pruned_loss=0.03911, over 24473.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2517, pruned_loss=0.04948, over 4710968.22 frames. ], batch size: 63, lr: 4.85e-03, grad_scale: 16.0 2023-09-30 14:25:53,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-30 14:25:53,610 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:25:55,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-30 14:26:00,022 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:26:01,523 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:26:06,432 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:26:07,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:26:08,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:26:08,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-30 14:26:11,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:26:11,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:26:11,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:26:15,034 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:26:16,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:26:16,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:26:18,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:26:18,916 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 14:26:18,947 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:26:20,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:26:22,997 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=740346.6666666666, ans=0.2 2023-09-30 14:26:24,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:26:27,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:26:27,486 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=740413.3333333334, ans=0.0 2023-09-30 14:26:28,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:26:28,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:26:30,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 14:26:31,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:26:32,878 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.899e+02 2.131e+02 2.497e+02 3.417e+02, threshold=4.262e+02, percent-clipped=0.0 2023-09-30 14:26:34,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-30 14:26:34,506 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-30 14:26:35,970 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:26:36,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:26:36,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:26:36,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:26:36,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:26:37,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:26:39,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:26:43,034 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:26:47,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:26:48,934 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:26:55,486 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-30 14:26:55,552 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:26:55,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:26:55,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 14:26:57,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:27:00,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:27:02,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:27:02,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:27:02,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:27:03,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:27:05,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:27:08,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:27:08,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:27:08,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:27:10,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-30 14:27:12,841 INFO [train.py:1039] (2/4) Epoch 21, batch 4850, loss[loss=0.1546, simple_loss=0.2322, pruned_loss=0.03849, over 24625.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2521, pruned_loss=0.04916, over 4724368.92 frames. ], batch size: 60, lr: 4.85e-03, grad_scale: 16.0 2023-09-30 14:27:12,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-30 14:27:12,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:27:12,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:27:13,104 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:27:13,106 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:27:16,740 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:27:26,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-30 14:27:27,648 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:27:31,564 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.01 vs. limit=15.0 2023-09-30 14:27:32,932 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:27:33,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 14:27:34,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:27:38,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:27:39,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 14:27:41,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:27:41,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-30 14:27:43,245 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=740680.0, ans=0.1 2023-09-30 14:27:43,259 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=740680.0, ans=0.2 2023-09-30 14:27:45,275 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.90 vs. limit=22.5 2023-09-30 14:27:46,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:27:47,732 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:27:47,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 14:27:49,215 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 14:27:49,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-30 14:27:52,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:27:52,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:27:56,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:27:56,275 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-30 14:27:56,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-30 14:27:56,556 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:27:57,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 14:28:06,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:28:07,562 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-30 14:28:07,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:28:07,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:28:07,810 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=740813.3333333334, ans=0.0 2023-09-30 14:28:10,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-30 14:28:13,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-30 14:28:13,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:28:14,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-30 14:28:14,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:28:16,304 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:28:16,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-30 14:28:27,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:28:32,115 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:28:32,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:28:35,180 INFO [train.py:1039] (2/4) Epoch 21, batch 4900, loss[loss=0.1599, simple_loss=0.2414, pruned_loss=0.03925, over 24343.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.2508, pruned_loss=0.04898, over 4728851.63 frames. ], batch size: 61, lr: 4.85e-03, grad_scale: 16.0 2023-09-30 14:28:37,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-30 14:28:37,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:28:41,179 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=740946.6666666666, ans=0.125 2023-09-30 14:28:42,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:28:44,061 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:28:44,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:28:47,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-30 14:28:52,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-30 14:28:56,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-30 14:28:57,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-30 14:28:57,786 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:28:57,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:28:57,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:28:57,890 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:28:57,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:28:59,365 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-30 14:29:03,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-30 14:29:05,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 14:29:05,616 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=741013.3333333334, ans=0.2 2023-09-30 14:29:06,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:29:08,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:29:10,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:29:12,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:29:12,270 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:29:12,284 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-30 14:29:15,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:29:16,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:29:16,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-30 14:29:16,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-30 14:29:17,286 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.865e+02 2.105e+02 2.513e+02 4.105e+02, threshold=4.210e+02, percent-clipped=0.0 2023-09-30 14:29:20,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-30 14:29:22,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:29:25,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:29:25,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:29:27,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:29:27,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 14:29:27,124 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:29:27,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-30 14:29:30,716 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:29:32,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-30 14:29:33,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:29:37,070 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-30 14:29:38,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:29:38,641 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-30 14:29:40,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-30 14:29:48,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:29:50,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:29:51,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-30 14:29:51,994 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 14:29:52,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:29:53,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:29:58,061 INFO [train.py:1039] (2/4) Epoch 21, batch 4950, loss[loss=0.181, simple_loss=0.2481, pruned_loss=0.057, over 23702.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2499, pruned_loss=0.04853, over 4735666.63 frames. ], batch size: 164, lr: 4.85e-03, grad_scale: 16.0 2023-09-30 14:29:58,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:29:58,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:30:00,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:30:00,172 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-30 14:30:01,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 14:30:05,402 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:30:05,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 14:30:08,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-30 14:30:08,569 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-30 14:30:08,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-30 14:30:10,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-30 14:30:10,122 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:30:10,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:30:12,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-30 14:30:12,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:30:13,614 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:30:15,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:30:16,613 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:30:18,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:30:19,044 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=741346.6666666666, ans=0.2 2023-09-30 14:30:20,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:30:20,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:30:23,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 14:30:28,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:30:30,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:30:31,702 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:30:32,027 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=741413.3333333334, ans=0.0 2023-09-30 14:30:33,120 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:30:35,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:30:35,466 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-30 14:30:35,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-30 14:30:39,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:30:42,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:30:42,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:30:42,486 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=741413.3333333334, ans=0.0 2023-09-30 14:30:43,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:30:43,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:30:45,242 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-30 14:30:48,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:30:49,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-30 14:30:51,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:30:53,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:30:55,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:30:55,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-30 14:30:55,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:30:57,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 14:31:00,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:31:03,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:31:03,582 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:31:03,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:31:03,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:31:05,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:31:06,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:31:08,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 14:31:08,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:31:09,696 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.70 vs. limit=15.0 2023-09-30 14:31:10,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-30 14:31:16,020 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:31:19,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-30 14:31:19,369 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-30 14:31:22,305 INFO [train.py:1039] (2/4) Epoch 21, batch 5000, loss[loss=0.1951, simple_loss=0.2688, pruned_loss=0.06073, over 23444.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2496, pruned_loss=0.04839, over 4745236.26 frames. ], batch size: 106, lr: 4.85e-03, grad_scale: 16.0 2023-09-30 14:31:27,584 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:31:27,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:31:29,844 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-30 14:31:31,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-30 14:31:32,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:31:34,893 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=741613.3333333334, ans=0.1 2023-09-30 14:31:36,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-30 14:31:36,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:31:37,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 14:31:37,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-30 14:31:39,002 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:31:39,100 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:31:40,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-30 14:31:40,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:31:40,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:31:43,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-30 14:31:45,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-30 14:31:45,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:31:45,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-30 14:31:45,973 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 14:31:47,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:31:49,529 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 14:31:49,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-30 14:31:49,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-30 14:31:52,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-30 14:31:52,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:31:54,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:31:54,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-30 14:31:54,338 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:31:55,861 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:31:57,366 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:31:58,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-30 14:32:00,498 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-30 14:32:01,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:32:01,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:32:03,257 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.857e+02 2.049e+02 2.517e+02 4.196e+02, threshold=4.099e+02, percent-clipped=0.0 2023-09-30 14:32:05,672 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=741746.6666666666, ans=0.0 2023-09-30 14:32:06,981 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-30 14:32:10,151 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:32:11,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:32:11,596 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:32:14,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-30 14:32:16,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:32:16,095 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:32:16,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:32:19,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-30 14:32:19,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:32:21,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:32:23,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:32:28,644 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.69 vs. limit=22.5 2023-09-30 14:32:29,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-30 14:32:32,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:32:43,723 INFO [train.py:1039] (2/4) Epoch 21, batch 5050, loss[loss=0.1672, simple_loss=0.2468, pruned_loss=0.04377, over 15571.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2502, pruned_loss=0.04879, over 4728275.49 frames. ], batch size: 33, lr: 4.84e-03, grad_scale: 16.0 2023-09-30 14:32:43,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:32:45,515 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:32:45,527 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:32:45,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:32:45,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 14:32:47,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:32:47,132 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:32:51,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:32:51,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-30 14:32:53,342 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:32:57,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:32:59,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:32:59,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-30 14:32:59,408 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=742013.3333333334, ans=0.125 2023-09-30 14:32:59,462 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=742013.3333333334, ans=0.125 2023-09-30 14:33:00,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:33:02,065 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:33:03,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 14:33:03,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:33:05,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-30 14:33:05,566 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=742013.3333333334, ans=0.0 2023-09-30 14:33:15,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-30 14:33:16,970 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-30 14:33:17,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:33:18,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-30 14:33:20,035 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:33:20,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:33:20,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:33:20,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:33:20,321 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-30 14:33:21,799 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-30 14:33:23,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:33:25,225 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=742080.0, ans=0.125 2023-09-30 14:33:26,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:33:29,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:33:30,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-30 14:33:33,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:33:36,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-30 14:33:37,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:33:37,609 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:33:37,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:33:37,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:33:39,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:33:42,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:33:42,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:33:42,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:33:42,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:33:43,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-30 14:33:45,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:33:47,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:33:52,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:33:52,627 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-30 14:33:52,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-30 14:33:54,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:33:54,180 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:33:54,250 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-30 14:33:56,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:33:56,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-30 14:33:56,032 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:33:57,742 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=742213.3333333334, ans=0.0 2023-09-30 14:34:00,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:34:01,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:34:02,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-30 14:34:03,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-30 14:34:05,789 INFO [train.py:1039] (2/4) Epoch 21, batch 5100, loss[loss=0.1512, simple_loss=0.2257, pruned_loss=0.03837, over 19564.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2506, pruned_loss=0.04891, over 4728333.52 frames. ], batch size: 42, lr: 4.84e-03, grad_scale: 16.0 2023-09-30 14:34:06,302 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=742280.0, ans=0.2 2023-09-30 14:34:06,340 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=742280.0, ans=0.07 2023-09-30 14:34:07,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:34:07,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:34:07,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:34:10,611 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-30 14:34:13,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:34:15,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-30 14:34:15,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-30 14:34:15,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:34:19,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:34:22,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:34:22,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-30 14:34:23,375 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-30 14:34:27,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:34:28,659 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:34:31,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:34:35,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-30 14:34:36,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:34:38,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:34:38,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-30 14:34:40,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:34:43,615 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:34:43,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-30 14:34:45,202 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-30 14:34:45,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:34:46,539 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 2.007e+02 2.200e+02 2.519e+02 3.504e+02, threshold=4.400e+02, percent-clipped=0.0 2023-09-30 14:34:46,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-30 14:34:46,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-30 14:34:49,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:34:51,729 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=742413.3333333334, ans=0.0 2023-09-30 14:34:59,905 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=742480.0, ans=0.125 2023-09-30 14:35:01,613 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:35:02,527 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.65 vs. limit=10.0 2023-09-30 14:35:04,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-30 14:35:04,694 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-30 14:35:04,707 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-30 14:35:06,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-30 14:35:06,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:35:07,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-30 14:35:10,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=742546.6666666666, ans=0.0 2023-09-30 14:35:13,963 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-30 14:35:15,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 14:35:16,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:35:19,911 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-30 14:35:21,437 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-30 14:35:22,843 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-30 14:35:28,123 INFO [train.py:1039] (2/4) Epoch 21, batch 5150, loss[loss=0.1771, simple_loss=0.2525, pruned_loss=0.05083, over 23762.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2516, pruned_loss=0.04945, over 4719191.66 frames. ], batch size: 195, lr: 4.84e-03, grad_scale: 16.0 2023-09-30 14:35:28,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:35:28,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:35:28,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:35:29,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:35:31,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 14:35:32,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:35:33,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-30 14:35:33,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-30 14:35:34,035 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-30 14:35:34,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:35:35,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-30 14:35:35,611 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:35:35,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 14:35:37,988 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:35:39,607 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:35:45,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 14:35:45,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-30 14:35:47,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:35:48,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:35:51,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-30 14:35:51,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:35:51,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:35:51,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:35:51,755 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:35:51,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-30 14:35:54,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:35:54,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:35:57,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 14:35:59,438 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-30 14:35:59,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:36:05,064 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=742746.6666666666, ans=0.125 2023-09-30 14:36:06,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:36:06,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-30 14:36:11,186 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:36:16,761 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.28 vs. limit=22.5 2023-09-30 14:36:18,446 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=742813.3333333334, ans=0.0 2023-09-30 14:36:20,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:36:20,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:36:21,485 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_ff3.min_abs, batch_count=742813.3333333334, ans=0.2 2023-09-30 14:36:24,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:36:24,842 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:36:26,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-30 14:36:32,497 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:36:33,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:36:33,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:36:35,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:36:37,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:36:39,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-30 14:36:45,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:36:45,680 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 14:36:47,702 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:36:47,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:36:49,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-30 14:36:49,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-30 14:36:49,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:36:50,684 INFO [train.py:1039] (2/4) Epoch 21, batch 5200, loss[loss=0.1676, simple_loss=0.2505, pruned_loss=0.04238, over 24327.00 frames. ], tot_loss[loss=0.1761, simple_loss=0.2525, pruned_loss=0.04983, over 4712987.69 frames. ], batch size: 61, lr: 4.84e-03, grad_scale: 32.0 2023-09-30 14:36:50,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:36:51,763 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=742946.6666666666, ans=0.5 2023-09-30 14:36:54,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:36:56,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:36:59,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:37:02,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-30 14:37:04,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:37:05,108 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.18 vs. limit=15.0 2023-09-30 14:37:05,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:37:06,477 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.17 vs. limit=12.0 2023-09-30 14:37:07,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:37:08,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:37:08,900 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:37:10,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-30 14:37:15,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 14:37:15,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:37:18,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-30 14:37:21,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-30 14:37:21,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:37:23,661 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-30 14:37:23,734 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-30 14:37:25,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-30 14:37:27,488 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:37:27,494 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-30 14:37:27,515 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:37:29,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:37:29,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:37:30,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-30 14:37:31,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:37:32,276 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.872e+02 2.039e+02 2.379e+02 3.474e+02, threshold=4.079e+02, percent-clipped=0.0 2023-09-30 14:37:33,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:37:35,642 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-30 14:37:35,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-30 14:37:37,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-30 14:37:43,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-30 14:37:43,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 14:37:48,541 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=743146.6666666666, ans=0.04949747468305833 2023-09-30 14:37:49,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:37:49,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:37:51,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-30 14:37:51,380 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:37:52,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-30 14:37:52,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:37:52,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 14:37:56,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:37:57,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:38:03,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:38:04,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:38:04,811 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:38:08,970 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.98 vs. limit=22.5 2023-09-30 14:38:09,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:38:09,622 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-30 14:38:11,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:38:11,138 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:38:12,504 INFO [train.py:1039] (2/4) Epoch 21, batch 5250, loss[loss=0.1762, simple_loss=0.2298, pruned_loss=0.0613, over 19535.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2521, pruned_loss=0.04968, over 4707334.92 frames. ], batch size: 388, lr: 4.84e-03, grad_scale: 16.0 2023-09-30 14:38:12,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:38:14,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-30 14:38:14,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:38:17,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:38:20,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:38:20,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:38:22,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:38:26,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:38:28,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:38:30,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:38:33,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 14:38:35,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-30 14:38:35,896 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:38:37,237 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:38:59,833 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=743480.0, ans=0.125 2023-09-30 14:39:01,386 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=743480.0, ans=0.125 2023-09-30 14:39:16,534 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=743546.6666666666, ans=0.2 2023-09-30 14:39:21,118 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.56 vs. limit=12.0 2023-09-30 14:39:25,623 INFO [train.py:1039] (2/4) Epoch 21, batch 5300, loss[loss=0.1807, simple_loss=0.2633, pruned_loss=0.04909, over 23707.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.252, pruned_loss=0.04953, over 4721350.05 frames. ], batch size: 85, lr: 4.84e-03, grad_scale: 8.0 2023-09-30 14:39:31,639 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=743613.3333333334, ans=0.1 2023-09-30 14:39:40,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:39:40,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-30 14:39:41,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-30 14:39:41,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:39:41,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:39:41,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:39:41,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:39:41,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:39:42,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:39:42,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:39:42,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-30 14:39:42,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:39:42,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-30 14:39:42,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-30 14:39:43,001 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-30 14:39:43,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-30 14:39:43,233 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-30 14:39:43,361 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-30 14:39:43,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:39:44,021 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:39:44,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:39:44,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:39:44,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:39:45,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:39:45,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:39:45,376 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:39:45,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:39:45,562 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:39:45,569 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:39:45,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:39:45,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:39:46,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-30 14:39:46,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:39:47,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:39:47,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-30 14:39:47,134 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-30 14:39:47,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:39:47,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:39:47,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-30 14:39:47,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-30 14:39:47,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-30 14:39:48,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:39:49,042 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:39:49,196 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-30 14:39:49,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-30 14:39:49,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:39:49,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:39:49,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-30 14:39:49,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-30 14:39:49,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-30 14:39:50,088 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-30 14:39:58,328 INFO [train.py:1039] (2/4) Epoch 22, batch 0, loss[loss=0.1754, simple_loss=0.2556, pruned_loss=0.04756, over 24494.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2556, pruned_loss=0.04756, over 24494.00 frames. ], batch size: 66, lr: 4.73e-03, grad_scale: 16.0 2023-09-30 14:39:58,329 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-30 14:40:11,496 INFO [train.py:1071] (2/4) Epoch 22, validation: loss=0.3042, simple_loss=0.2741, pruned_loss=0.1671, over 1125622.00 frames. 2023-09-30 14:40:11,497 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21076MB 2023-09-30 14:40:14,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-30 14:40:15,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:40:16,959 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:40:21,403 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:40:21,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:40:22,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:40:22,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-30 14:40:25,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-30 14:40:28,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:40:28,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:40:31,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:40:31,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:40:32,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 14:40:32,912 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:40:35,703 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.460e+02 1.911e+02 2.208e+02 2.678e+02 6.793e+02, threshold=4.416e+02, percent-clipped=10.0 2023-09-30 14:40:35,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-30 14:40:37,455 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:40:46,320 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:40:46,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:40:46,598 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_positive, batch_count=743826.6666666666, ans=0.05 2023-09-30 14:40:48,526 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-30 14:40:54,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:40:54,638 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:40:57,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:41:01,584 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:41:04,744 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=743893.3333333334, ans=0.0 2023-09-30 14:41:06,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:41:06,215 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=743893.3333333334, ans=0.0 2023-09-30 14:41:10,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-30 14:41:13,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-30 14:41:15,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:41:15,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:41:15,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:41:16,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:41:19,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-30 14:41:21,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:41:24,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:41:29,778 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:41:31,599 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-30 14:41:33,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:41:35,324 INFO [train.py:1039] (2/4) Epoch 22, batch 50, loss[loss=0.1738, simple_loss=0.265, pruned_loss=0.04129, over 24506.00 frames. ], tot_loss[loss=0.176, simple_loss=0.2535, pruned_loss=0.04924, over 1077679.05 frames. ], batch size: 71, lr: 4.72e-03, grad_scale: 16.0 2023-09-30 14:41:38,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:41:40,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:41:40,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-30 14:41:40,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 14:41:40,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:41:43,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:41:43,391 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:41:43,678 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=744026.6666666666, ans=0.1 2023-09-30 14:41:46,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:41:51,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-30 14:41:51,019 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:41:59,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-30 14:42:00,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-30 14:42:02,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-30 14:42:03,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:42:06,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:42:06,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:42:06,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:42:08,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-30 14:42:09,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 14:42:09,903 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:42:16,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:42:19,735 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:42:19,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 14:42:19,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-30 14:42:22,826 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:42:24,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 14:42:24,222 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-30 14:42:24,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:42:25,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-30 14:42:34,358 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=744226.6666666666, ans=0.125 2023-09-30 14:42:34,692 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.22 vs. limit=15.0 2023-09-30 14:42:35,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:42:35,552 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:42:37,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:42:39,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:42:39,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-30 14:42:41,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-30 14:42:41,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-30 14:42:43,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:42:43,261 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-30 14:42:44,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:42:46,314 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:42:47,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-30 14:42:47,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-30 14:42:49,995 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-30 14:42:52,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:42:52,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:42:52,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-30 14:42:53,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-30 14:42:54,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:42:54,623 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:42:56,174 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-30 14:42:57,525 INFO [train.py:1039] (2/4) Epoch 22, batch 100, loss[loss=0.1607, simple_loss=0.2335, pruned_loss=0.04401, over 24439.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2521, pruned_loss=0.04899, over 1899582.52 frames. ], batch size: 58, lr: 4.72e-03, grad_scale: 16.0 2023-09-30 14:42:57,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:43:00,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:43:03,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:43:07,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:43:07,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-30 14:43:07,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:43:12,740 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:43:12,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:43:12,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:43:12,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:43:12,856 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:43:14,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-30 14:43:16,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:43:17,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:43:17,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:43:17,622 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:43:19,458 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=744426.6666666666, ans=0.125 2023-09-30 14:43:22,475 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.832e+02 2.020e+02 2.279e+02 4.259e+02, threshold=4.040e+02, percent-clipped=0.0 2023-09-30 14:43:22,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-30 14:43:22,763 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:43:24,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:43:25,712 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-30 14:43:27,265 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 14:43:31,623 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-30 14:43:31,660 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-30 14:43:33,269 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:43:33,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:43:35,144 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=744493.3333333334, ans=0.1 2023-09-30 14:43:36,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-30 14:43:38,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:43:38,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:43:43,336 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=744493.3333333334, ans=0.125 2023-09-30 14:43:45,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:43:47,243 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-30 14:43:47,941 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.68 vs. limit=15.0 2023-09-30 14:43:48,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-30 14:43:52,102 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=744560.0, ans=0.0 2023-09-30 14:43:53,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:43:53,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:43:56,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:44:00,024 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:44:03,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:44:04,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:44:06,457 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=744626.6666666666, ans=0.0 2023-09-30 14:44:07,709 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:44:07,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:44:08,082 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=744626.6666666666, ans=0.125 2023-09-30 14:44:10,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:44:10,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:44:10,716 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:44:12,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-30 14:44:12,224 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-30 14:44:12,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:44:14,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:44:14,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:44:14,469 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:44:14,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 14:44:14,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 14:44:14,605 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-30 14:44:16,556 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:44:17,488 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=744626.6666666666, ans=0.2 2023-09-30 14:44:18,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:44:18,546 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:44:18,669 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=744693.3333333334, ans=0.125 2023-09-30 14:44:19,966 INFO [train.py:1039] (2/4) Epoch 22, batch 150, loss[loss=0.1544, simple_loss=0.2363, pruned_loss=0.03623, over 24273.00 frames. ], tot_loss[loss=0.176, simple_loss=0.2525, pruned_loss=0.04974, over 2523784.54 frames. ], batch size: 61, lr: 4.72e-03, grad_scale: 16.0 2023-09-30 14:44:20,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:44:21,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:44:24,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:44:27,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:44:27,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:44:27,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:44:30,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:44:30,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:44:30,984 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=744693.3333333334, ans=0.125 2023-09-30 14:44:34,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:44:35,983 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:44:40,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-30 14:44:40,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-30 14:44:40,589 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-30 14:44:43,645 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:44:43,653 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 14:44:45,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:44:47,265 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:44:47,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:44:47,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:44:47,441 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:44:48,923 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-30 14:44:51,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:44:57,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:45:00,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:45:01,747 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-30 14:45:06,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:45:06,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:45:08,142 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:45:09,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:45:11,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:45:12,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-30 14:45:12,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:45:12,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-30 14:45:17,674 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:45:19,156 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:45:19,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:45:19,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:45:22,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:45:24,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 14:45:26,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:45:28,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:45:29,811 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:45:31,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:45:32,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-30 14:45:32,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:45:32,881 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-30 14:45:37,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:45:40,980 INFO [train.py:1039] (2/4) Epoch 22, batch 200, loss[loss=0.1759, simple_loss=0.2653, pruned_loss=0.04326, over 24021.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2534, pruned_loss=0.04974, over 3014608.89 frames. ], batch size: 80, lr: 4.72e-03, grad_scale: 16.0 2023-09-30 14:45:41,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:45:41,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:45:42,788 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=745026.6666666666, ans=0.125 2023-09-30 14:45:44,926 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.15 vs. limit=15.0 2023-09-30 14:45:45,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-30 14:45:45,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:45:47,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:45:48,795 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-30 14:45:50,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-30 14:45:51,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:45:53,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:45:58,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:45:58,328 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:45:58,353 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:46:05,550 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.853e+02 2.014e+02 2.371e+02 3.492e+02, threshold=4.028e+02, percent-clipped=0.0 2023-09-30 14:46:10,356 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=745093.3333333334, ans=0.125 2023-09-30 14:46:18,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:46:19,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:46:20,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:46:20,229 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=745160.0, ans=0.2 2023-09-30 14:46:21,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:46:23,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 14:46:23,063 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 14:46:23,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:46:24,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:46:25,602 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.90 vs. limit=22.5 2023-09-30 14:46:26,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:46:26,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:46:28,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-30 14:46:28,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 14:46:28,557 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:46:33,356 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=745226.6666666666, ans=0.125 2023-09-30 14:46:35,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:46:41,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:46:47,849 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:46:47,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:46:57,908 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:47:00,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-30 14:47:01,012 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:47:01,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-30 14:47:02,341 INFO [train.py:1039] (2/4) Epoch 22, batch 250, loss[loss=0.1659, simple_loss=0.2475, pruned_loss=0.0422, over 24444.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2528, pruned_loss=0.04996, over 3394519.17 frames. ], batch size: 63, lr: 4.72e-03, grad_scale: 16.0 2023-09-30 14:47:02,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:47:02,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:47:04,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-30 14:47:04,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:47:04,204 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-30 14:47:06,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:47:09,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:47:09,545 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:47:11,040 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.36 vs. limit=22.5 2023-09-30 14:47:11,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:47:13,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:47:15,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:47:16,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:47:18,913 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.46 vs. limit=15.0 2023-09-30 14:47:19,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:47:27,326 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=745426.6666666666, ans=0.04949747468305833 2023-09-30 14:47:32,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:47:35,449 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:47:35,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:47:35,762 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=745493.3333333334, ans=0.125 2023-09-30 14:47:43,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-30 14:47:43,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-30 14:47:45,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:47:45,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:47:47,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 14:47:47,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:47:47,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:47:51,913 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:47:52,313 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=745560.0, ans=0.05 2023-09-30 14:47:55,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-30 14:47:55,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:47:56,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:47:56,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:47:56,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:47:57,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:47:57,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:47:58,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:48:00,115 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:48:01,711 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:48:03,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:48:05,280 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:48:06,991 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=745626.6666666666, ans=0.0 2023-09-30 14:48:09,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:48:12,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:48:18,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:48:19,605 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:48:25,003 INFO [train.py:1039] (2/4) Epoch 22, batch 300, loss[loss=0.1694, simple_loss=0.2536, pruned_loss=0.04266, over 24484.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2506, pruned_loss=0.04923, over 3688060.56 frames. ], batch size: 63, lr: 4.72e-03, grad_scale: 16.0 2023-09-30 14:48:25,054 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-30 14:48:25,204 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:48:25,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:48:28,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-30 14:48:28,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-30 14:48:29,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:48:29,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-30 14:48:34,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:48:36,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:48:39,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:48:41,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-30 14:48:42,636 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:48:44,157 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 14:48:44,183 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-30 14:48:44,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:48:49,077 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.843e+02 2.061e+02 2.404e+02 3.309e+02, threshold=4.123e+02, percent-clipped=0.0 2023-09-30 14:48:50,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:48:53,672 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:48:53,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-30 14:48:53,995 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=745760.0, ans=0.0 2023-09-30 14:48:57,575 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-30 14:48:57,665 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:49:00,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:49:02,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:49:02,192 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-30 14:49:02,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:49:05,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:49:06,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:49:06,826 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:49:10,645 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-30 14:49:10,652 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-30 14:49:12,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:49:15,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:49:16,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-30 14:49:16,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:49:22,226 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:49:25,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:49:25,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-30 14:49:30,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:49:30,465 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 14:49:31,873 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:49:34,077 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-30 14:49:35,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-30 14:49:35,642 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 14:49:35,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:49:37,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-30 14:49:38,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:49:38,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:49:41,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:49:42,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:49:42,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:49:45,857 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=746026.6666666666, ans=0.0 2023-09-30 14:49:47,005 INFO [train.py:1039] (2/4) Epoch 22, batch 350, loss[loss=0.1585, simple_loss=0.2376, pruned_loss=0.03973, over 24464.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.2498, pruned_loss=0.04844, over 3928787.35 frames. ], batch size: 63, lr: 4.72e-03, grad_scale: 16.0 2023-09-30 14:49:48,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:49:48,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 14:49:49,024 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=746026.6666666666, ans=0.1 2023-09-30 14:49:53,775 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:49:57,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:50:01,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:50:01,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:50:05,514 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-30 14:50:07,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:50:07,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-30 14:50:10,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:50:10,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-30 14:50:12,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:50:15,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-30 14:50:18,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:50:20,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:50:22,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:50:23,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:50:23,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:50:23,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:50:23,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:50:23,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:50:27,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:50:27,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:50:30,778 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:50:35,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:50:35,074 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-30 14:50:35,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:50:35,199 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:50:40,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-30 14:50:40,613 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:50:45,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:50:45,780 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:50:45,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:50:47,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-30 14:50:49,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:50:49,649 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-30 14:50:49,884 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=746226.6666666666, ans=0.1 2023-09-30 14:50:51,302 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-30 14:50:52,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:50:55,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:50:55,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-30 14:50:58,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:51:00,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:51:02,974 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=746293.3333333334, ans=0.125 2023-09-30 14:51:04,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:51:05,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:51:05,521 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:51:07,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:51:10,212 INFO [train.py:1039] (2/4) Epoch 22, batch 400, loss[loss=0.1556, simple_loss=0.2282, pruned_loss=0.04153, over 24443.00 frames. ], tot_loss[loss=0.1725, simple_loss=0.2489, pruned_loss=0.04805, over 4103133.70 frames. ], batch size: 58, lr: 4.72e-03, grad_scale: 32.0 2023-09-30 14:51:10,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:51:11,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-30 14:51:13,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-30 14:51:13,504 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:51:14,432 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=746360.0, ans=0.09899494936611666 2023-09-30 14:51:15,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:51:15,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:51:17,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:51:20,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:51:22,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:51:24,322 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-30 14:51:27,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-30 14:51:27,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:51:28,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-30 14:51:30,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:51:33,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:51:33,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:51:33,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-30 14:51:33,907 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=746426.6666666666, ans=0.125 2023-09-30 14:51:34,930 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.861e+02 2.105e+02 2.651e+02 3.953e+02, threshold=4.209e+02, percent-clipped=0.0 2023-09-30 14:51:35,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:51:35,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:51:36,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:51:36,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:51:38,923 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-30 14:51:40,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-30 14:51:43,884 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=746493.3333333334, ans=0.0 2023-09-30 14:51:45,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:51:46,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:51:46,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-30 14:51:49,060 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=746493.3333333334, ans=0.07 2023-09-30 14:51:50,067 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-30 14:51:53,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:51:55,330 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:52:02,383 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-30 14:52:02,552 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=746560.0, ans=0.1 2023-09-30 14:52:05,403 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-30 14:52:05,562 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=746560.0, ans=0.0 2023-09-30 14:52:08,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-30 14:52:08,607 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:52:09,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:52:12,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:52:12,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-30 14:52:15,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:52:18,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 14:52:19,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:52:23,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:52:23,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-30 14:52:25,349 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-30 14:52:31,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-30 14:52:33,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:52:33,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:52:35,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-30 14:52:35,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 14:52:36,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:52:37,735 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.18 vs. limit=10.0 2023-09-30 14:52:39,083 INFO [train.py:1039] (2/4) Epoch 22, batch 450, loss[loss=0.1775, simple_loss=0.2676, pruned_loss=0.04371, over 24452.00 frames. ], tot_loss[loss=0.1727, simple_loss=0.249, pruned_loss=0.04822, over 4236796.15 frames. ], batch size: 69, lr: 4.72e-03, grad_scale: 32.0 2023-09-30 14:52:39,154 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-30 14:52:40,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-30 14:52:42,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:52:42,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:52:43,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:52:43,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-30 14:52:43,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:52:45,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 14:52:47,113 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=746693.3333333334, ans=0.1 2023-09-30 14:52:48,201 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:52:56,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:52:58,120 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:53:00,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-30 14:53:00,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-30 14:53:03,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-30 14:53:06,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:53:07,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:53:14,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:53:16,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:53:18,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-30 14:53:19,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-30 14:53:21,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-30 14:53:21,430 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:53:23,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:53:24,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 14:53:25,115 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-30 14:53:25,128 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-30 14:53:25,699 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.33 vs. limit=10.0 2023-09-30 14:53:26,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:53:28,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:53:28,379 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-30 14:53:32,823 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-30 14:53:32,878 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:53:34,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-30 14:53:34,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-30 14:53:36,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:53:38,383 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-30 14:53:38,439 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 14:53:38,708 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=746893.3333333334, ans=0.1 2023-09-30 14:53:40,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-30 14:53:45,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:53:46,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-30 14:53:46,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-30 14:53:47,517 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.31 vs. limit=15.0 2023-09-30 14:53:48,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:53:53,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:53:56,410 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:53:58,562 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:53:58,616 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-30 14:54:01,600 INFO [train.py:1039] (2/4) Epoch 22, batch 500, loss[loss=0.1611, simple_loss=0.2448, pruned_loss=0.03869, over 24453.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2502, pruned_loss=0.04882, over 4342332.92 frames. ], batch size: 63, lr: 4.72e-03, grad_scale: 32.0 2023-09-30 14:54:01,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:54:03,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:54:03,322 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:54:03,337 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-30 14:54:04,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-30 14:54:04,962 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:54:09,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 14:54:13,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 14:54:14,997 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=747026.6666666666, ans=0.0 2023-09-30 14:54:16,110 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:54:17,860 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:54:17,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:54:19,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:54:23,871 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=747093.3333333334, ans=0.0 2023-09-30 14:54:24,034 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=747093.3333333334, ans=0.125 2023-09-30 14:54:26,495 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.831e+02 2.107e+02 2.492e+02 3.806e+02, threshold=4.214e+02, percent-clipped=0.0 2023-09-30 14:54:30,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:54:31,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-30 14:54:31,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:54:31,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:54:33,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-30 14:54:33,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:54:38,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:54:39,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:54:39,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:54:39,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:54:41,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-30 14:54:45,661 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-30 14:54:47,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:54:48,826 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.16 vs. limit=15.0 2023-09-30 14:54:49,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:54:51,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:54:51,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:54:52,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-30 14:54:55,409 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-30 14:54:56,335 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=747226.6666666666, ans=15.0 2023-09-30 14:54:59,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:55:01,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:55:01,594 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=747226.6666666666, ans=0.125 2023-09-30 14:55:04,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:55:07,912 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=747293.3333333334, ans=0.125 2023-09-30 14:55:09,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:55:15,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:55:17,651 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-30 14:55:17,666 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:55:17,696 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:55:20,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-30 14:55:22,181 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-30 14:55:24,280 INFO [train.py:1039] (2/4) Epoch 22, batch 550, loss[loss=0.1817, simple_loss=0.2526, pruned_loss=0.05534, over 23298.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.251, pruned_loss=0.04892, over 4428983.22 frames. ], batch size: 105, lr: 4.71e-03, grad_scale: 32.0 2023-09-30 14:55:24,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:55:27,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-30 14:55:30,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-30 14:55:30,707 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:55:30,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-30 14:55:30,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:55:32,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:55:32,836 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:55:32,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:55:32,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:55:35,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:55:38,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:55:39,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-30 14:55:39,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:55:46,061 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:55:46,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:55:47,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:55:49,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:55:53,948 WARNING [train.py:1197] (2/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-30 14:55:54,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-30 14:55:55,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:55:58,763 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=6.37 vs. limit=15.0 2023-09-30 14:56:02,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:56:02,301 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:56:03,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-30 14:56:07,852 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=747493.3333333334, ans=0.1 2023-09-30 14:56:09,071 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:56:09,089 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-30 14:56:10,466 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:56:12,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 14:56:14,334 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:56:15,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 14:56:15,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-30 14:56:17,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:56:17,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-30 14:56:19,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-30 14:56:21,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:56:21,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:56:21,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:56:21,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:56:24,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:56:25,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:56:28,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:56:29,073 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:56:30,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 14:56:32,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:56:34,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:56:35,687 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:56:35,765 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:56:38,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-30 14:56:38,788 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-30 14:56:44,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-30 14:56:47,714 INFO [train.py:1039] (2/4) Epoch 22, batch 600, loss[loss=0.2331, simple_loss=0.2932, pruned_loss=0.08653, over 19649.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2515, pruned_loss=0.04928, over 4489308.62 frames. ], batch size: 388, lr: 4.71e-03, grad_scale: 16.0 2023-09-30 14:56:49,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-30 14:56:50,750 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:56:50,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 14:56:50,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:56:57,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:57:00,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 14:57:00,919 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=747693.3333333334, ans=0.125 2023-09-30 14:57:02,005 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-30 14:57:04,279 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-30 14:57:07,096 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:57:08,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:57:11,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-30 14:57:11,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:57:13,357 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.458e+02 1.858e+02 2.031e+02 2.339e+02 3.248e+02, threshold=4.061e+02, percent-clipped=0.0 2023-09-30 14:57:18,905 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-30 14:57:22,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:57:22,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:57:23,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:57:28,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:57:28,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:57:30,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:57:37,243 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:57:40,971 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:57:40,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:57:40,990 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:57:49,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-30 14:57:54,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-30 14:57:56,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:57:59,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-30 14:58:01,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:58:04,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-30 14:58:04,760 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:58:04,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 14:58:10,799 INFO [train.py:1039] (2/4) Epoch 22, batch 650, loss[loss=0.1608, simple_loss=0.2426, pruned_loss=0.03953, over 24610.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2497, pruned_loss=0.04887, over 4530390.24 frames. ], batch size: 65, lr: 4.71e-03, grad_scale: 16.0 2023-09-30 14:58:10,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 14:58:13,185 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:58:16,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:58:17,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:58:19,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:58:21,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-30 14:58:21,884 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=748026.6666666666, ans=0.125 2023-09-30 14:58:23,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:58:26,570 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=748093.3333333334, ans=0.05 2023-09-30 14:58:29,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:58:29,909 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:58:33,650 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=748093.3333333334, ans=0.0 2023-09-30 14:58:34,736 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:58:35,098 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=748093.3333333334, ans=0.0 2023-09-30 14:58:38,603 WARNING [train.py:1197] (2/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-30 14:58:40,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:58:41,614 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:58:43,552 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=748160.0, ans=0.2 2023-09-30 14:58:44,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:58:44,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 14:58:45,001 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=748160.0, ans=0.125 2023-09-30 14:58:47,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:58:47,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:58:48,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 14:58:51,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:58:51,640 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:58:54,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:58:54,736 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-30 14:58:54,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:58:54,786 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:58:59,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:58:59,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:59:01,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:59:01,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:59:01,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-30 14:59:04,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:59:04,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:59:06,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-30 14:59:06,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:59:08,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 14:59:09,778 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-30 14:59:11,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-30 14:59:11,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:59:11,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:59:12,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:59:12,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:59:14,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:59:20,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:59:20,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:59:22,125 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:59:24,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:59:24,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 14:59:25,746 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:59:32,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 14:59:32,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:59:32,517 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:59:32,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:59:33,879 INFO [train.py:1039] (2/4) Epoch 22, batch 700, loss[loss=0.1775, simple_loss=0.2446, pruned_loss=0.05525, over 23911.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2494, pruned_loss=0.04828, over 4575073.22 frames. ], batch size: 195, lr: 4.71e-03, grad_scale: 16.0 2023-09-30 14:59:37,647 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-30 14:59:37,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-30 14:59:41,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-30 14:59:41,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:59:43,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:59:45,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-30 14:59:50,363 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:59:53,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:59:55,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:59:58,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-30 14:59:58,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:00:00,128 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 1.824e+02 1.972e+02 2.211e+02 2.960e+02, threshold=3.944e+02, percent-clipped=0.0 2023-09-30 15:00:01,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:00:07,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 15:00:07,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:00:07,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-30 15:00:11,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-30 15:00:12,072 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=748493.3333333334, ans=0.0 2023-09-30 15:00:15,938 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-30 15:00:16,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:00:18,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:00:22,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:00:23,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-30 15:00:23,805 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=748560.0, ans=0.125 2023-09-30 15:00:26,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:00:26,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 15:00:28,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-30 15:00:30,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:00:31,038 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=748560.0, ans=0.125 2023-09-30 15:00:32,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:00:35,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:00:40,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:00:41,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-30 15:00:45,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-30 15:00:45,771 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-30 15:00:49,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:00:52,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:00:53,748 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:00:53,990 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:00:53,999 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-30 15:00:56,880 INFO [train.py:1039] (2/4) Epoch 22, batch 750, loss[loss=0.1639, simple_loss=0.2462, pruned_loss=0.04075, over 24456.00 frames. ], tot_loss[loss=0.1722, simple_loss=0.2486, pruned_loss=0.04795, over 4616602.70 frames. ], batch size: 63, lr: 4.71e-03, grad_scale: 16.0 2023-09-30 15:00:58,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-30 15:00:58,542 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-30 15:00:58,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-30 15:01:00,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-30 15:01:00,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-30 15:01:00,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:01:01,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-30 15:01:03,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:01:03,889 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-30 15:01:05,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:01:06,974 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:01:08,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-30 15:01:08,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:01:13,414 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:01:14,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:01:16,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:01:20,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:01:20,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:01:20,487 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-30 15:01:24,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:01:24,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:01:25,789 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:01:27,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-30 15:01:28,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-30 15:01:28,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:01:30,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-30 15:01:30,445 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-30 15:01:31,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-30 15:01:31,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-30 15:01:33,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 15:01:34,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:01:41,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-30 15:01:41,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:01:41,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 15:01:41,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:01:45,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:01:45,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-30 15:01:46,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:01:46,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-30 15:01:48,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:01:52,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:01:54,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-30 15:01:54,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:01:55,512 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.92 vs. limit=22.5 2023-09-30 15:02:00,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:02:01,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:02:01,923 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=748960.0, ans=0.0 2023-09-30 15:02:03,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:02:03,436 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=748960.0, ans=0.125 2023-09-30 15:02:06,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 15:02:10,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-30 15:02:10,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:02:11,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:02:13,083 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:02:14,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:02:16,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:02:17,955 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-30 15:02:19,332 INFO [train.py:1039] (2/4) Epoch 22, batch 800, loss[loss=0.214, simple_loss=0.2761, pruned_loss=0.07602, over 19608.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2492, pruned_loss=0.04884, over 4632321.28 frames. ], batch size: 388, lr: 4.71e-03, grad_scale: 32.0 2023-09-30 15:02:26,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:02:26,566 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:02:28,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:02:28,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:02:30,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:02:30,482 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:02:33,473 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:02:37,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:02:37,492 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=749093.3333333334, ans=0.0 2023-09-30 15:02:38,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:02:40,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-30 15:02:41,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:02:41,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:02:42,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:02:43,487 WARNING [train.py:1197] (2/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:02:43,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-30 15:02:45,017 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:02:45,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-30 15:02:46,021 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=749093.3333333334, ans=0.0 2023-09-30 15:02:47,015 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.835e+02 2.019e+02 2.284e+02 4.447e+02, threshold=4.039e+02, percent-clipped=1.0 2023-09-30 15:02:48,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:02:51,699 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:02:54,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:02:56,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:02:59,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:02:59,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:03:01,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:03:03,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:03:03,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-30 15:03:06,926 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-30 15:03:06,971 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-30 15:03:07,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 15:03:08,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:03:09,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:03:10,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:03:12,492 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=749226.6666666666, ans=0.125 2023-09-30 15:03:15,332 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-30 15:03:15,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-30 15:03:16,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:03:18,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:03:18,936 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=749226.6666666666, ans=0.125 2023-09-30 15:03:21,096 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=749226.6666666666, ans=0.125 2023-09-30 15:03:22,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:03:25,464 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:03:27,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-30 15:03:28,904 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:03:30,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-30 15:03:38,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:03:42,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:03:42,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-30 15:03:42,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:03:44,065 INFO [train.py:1039] (2/4) Epoch 22, batch 850, loss[loss=0.1706, simple_loss=0.2567, pruned_loss=0.04224, over 24519.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2502, pruned_loss=0.04908, over 4652501.09 frames. ], batch size: 71, lr: 4.71e-03, grad_scale: 32.0 2023-09-30 15:03:44,197 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:03:45,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-30 15:03:45,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:03:47,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:03:49,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:03:50,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 15:03:51,044 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=749360.0, ans=0.125 2023-09-30 15:03:52,269 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:03:52,447 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-30 15:03:53,960 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-30 15:03:53,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-30 15:03:55,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:03:55,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:03:59,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:03:59,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:03:59,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 15:04:04,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:04:04,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:04:04,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-30 15:04:07,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-30 15:04:12,889 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:04:13,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-30 15:04:14,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-30 15:04:16,432 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-30 15:04:19,526 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-30 15:04:19,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:04:19,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:04:19,579 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 15:04:24,335 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:04:24,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:04:25,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-30 15:04:26,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:04:28,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:04:28,269 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:04:29,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-30 15:04:31,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:04:32,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-30 15:04:32,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-30 15:04:36,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:04:36,689 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:04:38,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:04:38,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:04:41,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:04:44,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:04:46,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-30 15:04:49,293 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-30 15:04:49,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:04:50,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-30 15:04:55,713 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=749626.6666666666, ans=0.125 2023-09-30 15:05:00,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-30 15:05:02,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:05:02,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-30 15:05:02,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:05:02,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:05:05,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-30 15:05:07,237 INFO [train.py:1039] (2/4) Epoch 22, batch 900, loss[loss=0.1775, simple_loss=0.2659, pruned_loss=0.04452, over 24671.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2514, pruned_loss=0.04967, over 4660706.53 frames. ], batch size: 68, lr: 4.71e-03, grad_scale: 32.0 2023-09-30 15:05:09,696 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.63 vs. limit=15.0 2023-09-30 15:05:13,359 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:05:15,713 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.08 vs. limit=12.0 2023-09-30 15:05:18,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:05:18,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-30 15:05:21,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:05:22,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-30 15:05:23,558 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-30 15:05:25,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:05:25,162 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:05:25,238 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 15:05:26,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:05:28,535 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 15:05:32,804 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.809e+02 2.095e+02 2.574e+02 4.591e+02, threshold=4.190e+02, percent-clipped=1.0 2023-09-30 15:05:36,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:05:36,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:05:36,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 15:05:38,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:05:41,980 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=749826.6666666666, ans=0.125 2023-09-30 15:05:43,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-30 15:05:43,483 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=749826.6666666666, ans=0.0 2023-09-30 15:05:44,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:05:48,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-30 15:05:49,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-30 15:05:50,050 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-30 15:05:50,302 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=749826.6666666666, ans=0.125 2023-09-30 15:05:51,514 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-30 15:05:51,792 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=749826.6666666666, ans=0.0 2023-09-30 15:05:59,666 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-30 15:05:59,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:05:59,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 15:06:07,393 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:06:07,421 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:06:09,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-30 15:06:10,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:06:12,660 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-30 15:06:15,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:06:15,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:06:17,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:06:17,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:06:24,066 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-30 15:06:25,507 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-30 15:06:25,703 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-30 15:06:25,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-30 15:06:28,568 INFO [train.py:1039] (2/4) Epoch 22, batch 950, loss[loss=0.1624, simple_loss=0.237, pruned_loss=0.04387, over 24604.00 frames. ], tot_loss[loss=0.176, simple_loss=0.252, pruned_loss=0.05002, over 4667550.07 frames. ], batch size: 60, lr: 4.71e-03, grad_scale: 32.0 2023-09-30 15:06:28,784 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:06:33,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-30 15:06:37,355 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=750026.6666666666, ans=0.1 2023-09-30 15:06:38,679 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:06:41,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:06:41,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:06:43,284 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 15:06:46,336 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-30 15:06:48,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:06:50,617 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:06:52,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:06:52,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:06:52,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-30 15:06:52,396 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-30 15:06:55,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:06:56,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-30 15:06:56,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:06:57,125 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=750093.3333333334, ans=0.0 2023-09-30 15:07:00,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:07:00,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:07:00,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:07:00,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-30 15:07:04,555 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 15:07:06,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:07:07,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:07:07,830 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=750160.0, ans=0.1 2023-09-30 15:07:12,154 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:07:12,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:07:16,654 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-30 15:07:20,116 WARNING [train.py:1197] (2/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 15:07:20,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:07:20,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:07:22,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:07:22,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:07:27,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-30 15:07:27,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:07:30,190 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:07:32,339 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:07:32,366 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-30 15:07:32,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:07:32,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:07:32,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-30 15:07:32,808 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 15:07:37,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 15:07:40,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:07:45,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:07:47,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-30 15:07:47,167 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-30 15:07:51,493 INFO [train.py:1039] (2/4) Epoch 22, batch 1000, loss[loss=0.1699, simple_loss=0.2622, pruned_loss=0.0388, over 24652.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.2507, pruned_loss=0.0495, over 4685373.10 frames. ], batch size: 73, lr: 4.70e-03, grad_scale: 16.0 2023-09-30 15:07:54,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:07:57,063 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=750360.0, ans=0.0 2023-09-30 15:07:57,489 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.86 vs. limit=12.0 2023-09-30 15:07:58,335 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-30 15:07:58,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:08:02,632 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=750360.0, ans=0.125 2023-09-30 15:08:03,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:08:05,297 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-30 15:08:05,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-30 15:08:10,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:08:10,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:08:14,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:08:15,878 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-30 15:08:18,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-30 15:08:19,222 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=750426.6666666666, ans=0.2 2023-09-30 15:08:20,245 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.495e+02 1.832e+02 2.013e+02 2.220e+02 3.757e+02, threshold=4.026e+02, percent-clipped=0.0 2023-09-30 15:08:21,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-30 15:08:21,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:08:24,796 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-30 15:08:24,988 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-30 15:08:26,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-30 15:08:27,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:08:28,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:08:32,429 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.66 vs. limit=15.0 2023-09-30 15:08:38,496 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:08:38,955 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=11.48 vs. limit=15.0 2023-09-30 15:08:39,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:08:40,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:08:40,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:08:40,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-30 15:08:41,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:08:42,510 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.47 vs. limit=15.0 2023-09-30 15:08:43,779 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:08:43,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:08:43,941 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-30 15:08:47,786 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=750560.0, ans=0.125 2023-09-30 15:08:49,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-30 15:08:50,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-30 15:08:52,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-30 15:08:53,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:08:55,822 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=750560.0, ans=0.0 2023-09-30 15:08:58,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:08:58,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:08:59,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:09:00,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:09:02,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-30 15:09:02,201 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:09:02,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-30 15:09:04,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-30 15:09:04,401 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:09:04,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:09:09,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:09:11,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 15:09:14,070 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:09:14,354 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_na.min_abs, batch_count=750693.3333333334, ans=0.02 2023-09-30 15:09:15,504 INFO [train.py:1039] (2/4) Epoch 22, batch 1050, loss[loss=0.1599, simple_loss=0.2331, pruned_loss=0.04339, over 18964.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2485, pruned_loss=0.04883, over 4679699.00 frames. ], batch size: 41, lr: 4.70e-03, grad_scale: 16.0 2023-09-30 15:09:17,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:09:21,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:09:22,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 15:09:24,339 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:09:24,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:09:29,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:09:30,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:09:31,632 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.whiten.whitening_limit, batch_count=750760.0, ans=15.0 2023-09-30 15:09:32,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:09:33,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:09:33,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-30 15:09:34,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:09:35,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-30 15:09:35,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:09:37,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-30 15:09:41,273 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:09:41,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-30 15:09:41,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-30 15:09:49,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:09:50,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-30 15:09:50,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:09:52,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-30 15:09:53,456 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.98 vs. limit=15.0 2023-09-30 15:09:54,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-30 15:09:54,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:09:57,837 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-30 15:09:58,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-30 15:09:59,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:10:02,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 15:10:04,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-30 15:10:05,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:10:06,092 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:10:11,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:10:14,356 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-30 15:10:16,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-30 15:10:16,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-30 15:10:16,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:10:17,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:10:19,417 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-30 15:10:23,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:10:24,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:10:24,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:10:24,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:10:24,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:10:29,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:10:29,869 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-30 15:10:32,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:10:32,820 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-30 15:10:32,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-30 15:10:34,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:10:37,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:10:38,852 INFO [train.py:1039] (2/4) Epoch 22, batch 1100, loss[loss=0.1664, simple_loss=0.2561, pruned_loss=0.03841, over 24318.00 frames. ], tot_loss[loss=0.1722, simple_loss=0.2481, pruned_loss=0.04815, over 4692273.33 frames. ], batch size: 74, lr: 4.70e-03, grad_scale: 8.0 2023-09-30 15:10:45,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:10:50,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 15:10:50,353 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=751026.6666666666, ans=0.125 2023-09-30 15:10:53,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:10:53,670 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:10:53,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-30 15:10:55,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:10:56,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-30 15:10:59,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:11:02,281 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:11:02,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-30 15:11:02,557 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=751093.3333333334, ans=0.1 2023-09-30 15:11:05,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 15:11:06,737 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:11:06,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:11:08,138 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.614e+02 1.829e+02 2.096e+02 2.478e+02 4.106e+02, threshold=4.191e+02, percent-clipped=1.0 2023-09-30 15:11:08,526 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=751093.3333333334, ans=0.2 2023-09-30 15:11:09,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:11:09,960 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:11:13,417 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=751160.0, ans=0.125 2023-09-30 15:11:14,754 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=751160.0, ans=0.125 2023-09-30 15:11:16,032 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:11:19,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-30 15:11:19,942 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-30 15:11:20,239 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=751160.0, ans=0.04949747468305833 2023-09-30 15:11:21,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:11:22,940 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:11:24,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-30 15:11:24,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:11:26,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-30 15:11:27,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:11:27,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:11:28,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:11:29,442 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:11:29,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-30 15:11:34,895 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:11:34,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-30 15:11:37,949 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 15:11:41,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 15:11:44,289 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-30 15:11:44,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-30 15:11:45,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:11:47,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:11:47,731 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=751293.3333333334, ans=0.125 2023-09-30 15:11:48,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:11:50,851 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-30 15:11:52,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:11:52,399 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:11:52,545 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-30 15:11:54,046 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:11:54,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-30 15:11:56,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:11:56,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:11:57,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-30 15:12:00,851 INFO [train.py:1039] (2/4) Epoch 22, batch 1150, loss[loss=0.181, simple_loss=0.2529, pruned_loss=0.05454, over 23348.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2496, pruned_loss=0.04886, over 4686769.15 frames. ], batch size: 285, lr: 4.70e-03, grad_scale: 8.0 2023-09-30 15:12:01,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:12:04,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:12:07,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:12:08,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:12:08,089 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-30 15:12:09,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:12:12,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-30 15:12:15,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:12:15,418 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 15:12:21,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-30 15:12:23,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:12:28,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:12:29,664 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:12:29,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-30 15:12:29,741 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-30 15:12:29,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:12:35,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-30 15:12:36,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:12:38,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:12:46,044 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=751493.3333333334, ans=0.1 2023-09-30 15:12:48,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:12:53,107 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.46 vs. limit=22.5 2023-09-30 15:12:55,275 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:12:56,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-30 15:12:56,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:12:56,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:13:01,978 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-30 15:13:03,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:13:10,497 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-30 15:13:12,246 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=751626.6666666666, ans=0.1 2023-09-30 15:13:17,060 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:13:18,639 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:13:18,689 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-30 15:13:18,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:13:22,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:13:23,906 INFO [train.py:1039] (2/4) Epoch 22, batch 1200, loss[loss=0.1712, simple_loss=0.2627, pruned_loss=0.03981, over 24326.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2504, pruned_loss=0.04936, over 4699126.18 frames. ], batch size: 74, lr: 4.70e-03, grad_scale: 16.0 2023-09-30 15:13:27,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:13:27,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-30 15:13:28,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:13:28,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:13:30,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:13:33,781 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:13:35,386 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 15:13:35,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:13:35,602 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:13:38,667 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-30 15:13:41,717 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-30 15:13:43,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 15:13:45,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:13:47,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:13:47,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=751760.0, ans=0.125 2023-09-30 15:13:51,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:13:51,395 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-30 15:13:52,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:13:54,769 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=751760.0, ans=0.0 2023-09-30 15:13:55,738 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.455e+02 1.831e+02 1.967e+02 2.216e+02 3.733e+02, threshold=3.933e+02, percent-clipped=0.0 2023-09-30 15:14:00,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-30 15:14:00,718 WARNING [train.py:1197] (2/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:14:00,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-30 15:14:00,872 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:14:05,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-30 15:14:07,558 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=751826.6666666666, ans=0.125 2023-09-30 15:14:07,667 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=751826.6666666666, ans=0.2 2023-09-30 15:14:10,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-30 15:14:10,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:14:11,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:14:13,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:14:13,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-30 15:14:15,159 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:14:15,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-30 15:14:15,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:14:16,843 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-30 15:14:18,282 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 15:14:18,354 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:14:18,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 15:14:18,623 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=751893.3333333334, ans=0.05 2023-09-30 15:14:22,359 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:14:22,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:14:27,313 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-30 15:14:29,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:14:32,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-30 15:14:36,884 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-30 15:14:40,080 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:14:42,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:14:43,749 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:14:45,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:14:46,634 INFO [train.py:1039] (2/4) Epoch 22, batch 1250, loss[loss=0.174, simple_loss=0.2468, pruned_loss=0.0506, over 23546.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2507, pruned_loss=0.04965, over 4713640.76 frames. ], batch size: 134, lr: 4.70e-03, grad_scale: 4.0 2023-09-30 15:14:48,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-30 15:14:52,006 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.35 vs. limit=12.0 2023-09-30 15:14:52,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:14:53,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:14:54,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-30 15:14:54,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:14:56,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 15:15:00,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 15:15:02,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:15:02,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 15:15:02,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:15:05,221 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=752093.3333333334, ans=0.0 2023-09-30 15:15:06,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-30 15:15:10,419 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.51 vs. limit=22.5 2023-09-30 15:15:10,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 15:15:10,932 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-30 15:15:10,940 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:15:12,540 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:15:14,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:15:17,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:15:18,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-30 15:15:24,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-30 15:15:24,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:15:26,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:15:26,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-30 15:15:28,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:15:28,357 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-30 15:15:28,383 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:15:28,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:15:31,122 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=752160.0, ans=0.5 2023-09-30 15:15:32,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:15:32,521 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=752160.0, ans=0.0 2023-09-30 15:15:36,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:15:37,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:15:39,059 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-30 15:15:39,080 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-30 15:15:40,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-30 15:15:43,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:15:43,757 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-30 15:15:43,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:15:48,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-30 15:15:48,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:15:49,084 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=752226.6666666666, ans=0.1 2023-09-30 15:15:50,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-30 15:15:50,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-30 15:15:51,698 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:15:51,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-30 15:15:53,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:15:54,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-30 15:15:57,790 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:15:57,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:15:58,718 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.09 vs. limit=22.5 2023-09-30 15:15:59,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 15:15:59,680 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=752293.3333333334, ans=0.125 2023-09-30 15:16:01,677 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-30 15:16:06,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:16:08,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-30 15:16:09,839 INFO [train.py:1039] (2/4) Epoch 22, batch 1300, loss[loss=0.1602, simple_loss=0.2369, pruned_loss=0.04178, over 24605.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2514, pruned_loss=0.04991, over 4718256.80 frames. ], batch size: 60, lr: 4.70e-03, grad_scale: 8.0 2023-09-30 15:16:11,483 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:16:11,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-30 15:16:13,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:16:14,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:16:16,307 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:16:17,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-30 15:16:23,635 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.15 vs. limit=22.5 2023-09-30 15:16:25,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 15:16:25,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-30 15:16:27,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-30 15:16:30,610 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=752426.6666666666, ans=0.1 2023-09-30 15:16:31,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 15:16:35,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:16:37,235 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:16:37,384 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:16:39,015 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:16:41,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 15:16:41,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-30 15:16:41,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-30 15:16:43,116 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 1.847e+02 1.992e+02 2.255e+02 3.036e+02, threshold=3.984e+02, percent-clipped=0.0 2023-09-30 15:16:46,622 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=752493.3333333334, ans=0.125 2023-09-30 15:16:47,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:16:47,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 15:16:50,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-30 15:16:52,489 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 15:16:54,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:16:57,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:16:57,692 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-30 15:16:57,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:16:59,242 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-30 15:17:00,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:17:06,874 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:17:06,878 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:17:09,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-30 15:17:10,058 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-30 15:17:10,304 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=752560.0, ans=0.125 2023-09-30 15:17:12,299 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-30 15:17:19,049 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:17:22,203 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-30 15:17:23,762 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:17:30,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-30 15:17:31,376 INFO [train.py:1039] (2/4) Epoch 22, batch 1350, loss[loss=0.1497, simple_loss=0.2003, pruned_loss=0.04955, over 19646.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.25, pruned_loss=0.04953, over 4722171.95 frames. ], batch size: 389, lr: 4.70e-03, grad_scale: 8.0 2023-09-30 15:17:35,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:17:38,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:17:39,920 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:17:41,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:17:41,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:17:43,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-30 15:17:48,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-30 15:17:50,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-30 15:17:50,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-30 15:17:52,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:17:55,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-30 15:17:57,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:17:58,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:17:58,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-30 15:18:00,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-30 15:18:00,795 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=752760.0, ans=0.125 2023-09-30 15:18:02,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-30 15:18:02,288 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=752760.0, ans=0.125 2023-09-30 15:18:03,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:18:03,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-30 15:18:04,375 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.57 vs. limit=15.0 2023-09-30 15:18:15,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:18:25,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:18:26,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:18:26,108 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-30 15:18:29,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:18:31,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-30 15:18:31,368 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-30 15:18:31,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:18:35,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:18:37,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-30 15:18:39,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:18:45,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-30 15:18:47,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-30 15:18:53,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-30 15:18:53,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:18:55,672 INFO [train.py:1039] (2/4) Epoch 22, batch 1400, loss[loss=0.1834, simple_loss=0.2513, pruned_loss=0.05776, over 23805.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.2486, pruned_loss=0.04902, over 4708246.05 frames. ], batch size: 212, lr: 4.70e-03, grad_scale: 8.0 2023-09-30 15:18:57,415 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:18:57,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:19:01,693 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.28 vs. limit=15.0 2023-09-30 15:19:04,600 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-30 15:19:06,177 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-30 15:19:12,422 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.32 vs. limit=22.5 2023-09-30 15:19:14,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 15:19:16,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:19:19,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:19:19,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-30 15:19:25,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:19:25,679 WARNING [train.py:1197] (2/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 15:19:29,129 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.828e+02 2.054e+02 2.330e+02 3.479e+02, threshold=4.107e+02, percent-clipped=0.0 2023-09-30 15:19:38,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:19:38,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:19:42,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-30 15:19:44,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:19:44,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:19:45,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:19:47,798 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:19:47,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:19:47,953 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:19:48,045 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:19:50,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-30 15:19:50,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:19:55,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:19:58,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:20:08,689 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-30 15:20:08,816 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 15:20:10,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:20:12,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 15:20:16,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:20:17,835 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:20:19,329 INFO [train.py:1039] (2/4) Epoch 22, batch 1450, loss[loss=0.1613, simple_loss=0.2466, pruned_loss=0.03804, over 24652.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2481, pruned_loss=0.04882, over 4710419.78 frames. ], batch size: 65, lr: 4.70e-03, grad_scale: 8.0 2023-09-30 15:20:21,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-30 15:20:23,305 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:20:23,318 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:20:23,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-30 15:20:28,857 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.52 vs. limit=10.0 2023-09-30 15:20:29,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:20:29,483 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 15:20:31,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:20:32,380 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-30 15:20:32,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 15:20:33,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-30 15:20:35,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:20:35,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:20:35,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-30 15:20:37,220 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:20:38,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-30 15:20:38,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 15:20:38,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:20:40,359 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:20:43,695 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:20:47,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:20:51,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:20:51,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:20:54,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:20:54,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:20:55,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:20:57,340 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:20:57,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:20:58,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:21:02,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-30 15:21:04,136 WARNING [train.py:1197] (2/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:21:07,260 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-30 15:21:08,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:21:10,404 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-30 15:21:11,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:21:13,746 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=753560.0, ans=0.125 2023-09-30 15:21:14,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-30 15:21:16,735 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 15:21:18,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:21:20,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-30 15:21:21,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-30 15:21:23,474 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:21:26,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:21:26,542 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:21:28,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-30 15:21:31,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-30 15:21:33,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-30 15:21:33,635 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:21:35,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 15:21:41,324 INFO [train.py:1039] (2/4) Epoch 22, batch 1500, loss[loss=0.1717, simple_loss=0.249, pruned_loss=0.04724, over 20955.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.249, pruned_loss=0.04893, over 4712023.84 frames. ], batch size: 45, lr: 4.69e-03, grad_scale: 8.0 2023-09-30 15:21:45,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-30 15:21:46,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:21:46,052 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:21:47,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:21:47,657 WARNING [train.py:1197] (2/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:21:49,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:21:50,676 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-30 15:21:52,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 15:21:53,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-30 15:21:53,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:21:54,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:21:56,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:21:57,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:21:59,435 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=753760.0, ans=0.125 2023-09-30 15:22:02,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:22:02,767 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-30 15:22:04,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-30 15:22:04,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:22:05,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:22:07,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-30 15:22:12,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-30 15:22:14,242 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:22:14,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-30 15:22:15,596 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.832e+02 2.019e+02 2.350e+02 4.853e+02, threshold=4.037e+02, percent-clipped=1.0 2023-09-30 15:22:17,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-30 15:22:20,209 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:22:21,690 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:22:21,712 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:22:23,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-30 15:22:23,336 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:22:23,479 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=753826.6666666666, ans=0.125 2023-09-30 15:22:24,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:22:24,825 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-30 15:22:24,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:22:30,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:22:30,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-30 15:22:37,086 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 15:22:39,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 15:22:42,276 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-30 15:22:43,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:22:43,716 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-30 15:22:46,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:22:46,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:22:48,211 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-30 15:22:49,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-30 15:22:52,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-30 15:22:52,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:22:57,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:22:57,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:22:58,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:22:58,974 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:22:59,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:23:03,216 INFO [train.py:1039] (2/4) Epoch 22, batch 1550, loss[loss=0.1667, simple_loss=0.239, pruned_loss=0.04717, over 24325.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2502, pruned_loss=0.04934, over 4710144.87 frames. ], batch size: 56, lr: 4.69e-03, grad_scale: 8.0 2023-09-30 15:23:03,391 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-30 15:23:04,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-30 15:23:04,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:23:06,937 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-30 15:23:07,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-30 15:23:08,592 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:23:10,235 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:23:10,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:23:11,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:23:11,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:23:13,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:23:17,001 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-30 15:23:17,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:23:17,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 15:23:18,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 15:23:21,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-30 15:23:21,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-30 15:23:21,724 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:23:23,048 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-30 15:23:24,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-30 15:23:24,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-30 15:23:24,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:23:26,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:23:30,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:23:32,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-30 15:23:32,624 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-30 15:23:38,525 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=754160.0, ans=0.0 2023-09-30 15:23:39,980 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=754160.0, ans=0.125 2023-09-30 15:23:40,416 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.38 vs. limit=15.0 2023-09-30 15:23:41,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:23:47,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:23:48,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-30 15:23:48,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:23:49,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-30 15:23:56,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 15:23:56,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:23:59,355 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:24:00,971 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:24:02,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:24:02,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-30 15:24:03,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 15:24:05,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:24:05,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:24:06,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-30 15:24:07,002 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-30 15:24:10,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:24:16,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-30 15:24:22,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:24:22,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:24:24,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-30 15:24:26,438 INFO [train.py:1039] (2/4) Epoch 22, batch 1600, loss[loss=0.1697, simple_loss=0.2464, pruned_loss=0.04649, over 24601.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2508, pruned_loss=0.0493, over 4714164.12 frames. ], batch size: 60, lr: 4.69e-03, grad_scale: 16.0 2023-09-30 15:24:26,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 15:24:28,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:24:28,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:24:28,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:24:29,700 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:24:34,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:24:34,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-30 15:24:35,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-30 15:24:37,606 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=754360.0, ans=0.2 2023-09-30 15:24:38,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-30 15:24:40,402 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:24:41,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-30 15:24:42,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:24:45,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:24:50,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:24:54,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-30 15:24:57,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:24:58,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-30 15:24:59,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:25:00,929 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 1.869e+02 2.022e+02 2.307e+02 3.871e+02, threshold=4.044e+02, percent-clipped=0.0 2023-09-30 15:25:01,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-30 15:25:07,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-30 15:25:08,980 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=754493.3333333334, ans=0.125 2023-09-30 15:25:13,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:25:15,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-30 15:25:16,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:25:16,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:25:16,493 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:25:18,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-30 15:25:19,853 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=754560.0, ans=0.2 2023-09-30 15:25:25,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 15:25:26,597 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:25:26,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:25:26,940 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=754560.0, ans=0.125 2023-09-30 15:25:28,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:25:28,235 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:25:28,364 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=754560.0, ans=0.1 2023-09-30 15:25:31,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:25:32,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:25:33,518 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:25:40,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:25:40,325 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:25:42,319 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=754626.6666666666, ans=0.125 2023-09-30 15:25:43,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-30 15:25:43,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:25:43,772 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=754626.6666666666, ans=0.125 2023-09-30 15:25:45,399 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=754626.6666666666, ans=0.125 2023-09-30 15:25:46,466 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-30 15:25:47,987 INFO [train.py:1039] (2/4) Epoch 22, batch 1650, loss[loss=0.1603, simple_loss=0.2536, pruned_loss=0.03349, over 24305.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2519, pruned_loss=0.04936, over 4712660.64 frames. ], batch size: 74, lr: 4.69e-03, grad_scale: 16.0 2023-09-30 15:25:51,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:25:51,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:25:51,998 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.83 vs. limit=15.0 2023-09-30 15:25:52,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:25:52,863 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-30 15:25:52,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-30 15:25:52,890 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-30 15:25:52,966 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-30 15:25:53,224 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=754693.3333333334, ans=0.125 2023-09-30 15:25:58,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:25:58,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:26:00,161 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:26:00,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-30 15:26:03,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:26:06,211 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-30 15:26:08,445 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:26:08,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:26:08,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:26:08,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:26:09,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-30 15:26:10,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-30 15:26:13,571 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=754760.0, ans=0.0 2023-09-30 15:26:16,402 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 15:26:16,766 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=754760.0, ans=0.0 2023-09-30 15:26:19,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-30 15:26:23,281 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.34 vs. limit=15.0 2023-09-30 15:26:27,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-30 15:26:27,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:26:28,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-30 15:26:32,200 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.52 vs. limit=10.0 2023-09-30 15:26:32,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:26:35,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:26:35,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:26:37,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:26:38,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:26:38,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:26:41,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:26:42,431 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.49 vs. limit=15.0 2023-09-30 15:26:43,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:26:43,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:26:44,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:26:46,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:26:47,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 15:26:50,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:26:52,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-30 15:26:53,794 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:26:53,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-30 15:26:55,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-30 15:26:56,884 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-30 15:26:56,915 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:26:57,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:26:57,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:26:58,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:26:58,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-30 15:27:01,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:27:04,068 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:27:04,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:27:07,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-30 15:27:10,547 INFO [train.py:1039] (2/4) Epoch 22, batch 1700, loss[loss=0.1671, simple_loss=0.2282, pruned_loss=0.05295, over 23442.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2512, pruned_loss=0.04937, over 4706282.39 frames. ], batch size: 285, lr: 4.69e-03, grad_scale: 16.0 2023-09-30 15:27:12,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:27:12,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:27:14,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-30 15:27:16,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:27:16,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:27:16,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:27:17,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:27:19,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:27:19,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-30 15:27:22,316 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:27:22,892 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=755026.6666666666, ans=0.125 2023-09-30 15:27:24,420 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=755026.6666666666, ans=0.125 2023-09-30 15:27:26,692 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.34 vs. limit=10.0 2023-09-30 15:27:27,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:27:30,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:27:37,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-30 15:27:37,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:27:37,244 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:27:39,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:27:42,338 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-30 15:27:43,965 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:27:43,991 WARNING [train.py:1197] (2/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:27:45,841 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.451e+02 1.865e+02 2.081e+02 2.403e+02 3.253e+02, threshold=4.162e+02, percent-clipped=0.0 2023-09-30 15:27:46,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-30 15:27:48,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-30 15:27:49,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-30 15:27:51,373 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-30 15:27:51,554 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:27:53,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-30 15:27:54,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:27:56,339 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=755160.0, ans=0.125 2023-09-30 15:28:03,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:28:05,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:28:05,688 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=755226.6666666666, ans=0.125 2023-09-30 15:28:06,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:28:07,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-30 15:28:07,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-30 15:28:07,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:28:10,768 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:28:10,770 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-30 15:28:10,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:28:10,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:28:13,018 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:28:13,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:28:13,448 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=755226.6666666666, ans=0.1 2023-09-30 15:28:16,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:28:16,155 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:28:17,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:28:19,767 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:28:19,830 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:28:23,635 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:28:25,182 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-30 15:28:28,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:28:29,616 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:28:32,606 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-30 15:28:32,875 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=755360.0, ans=0.1 2023-09-30 15:28:34,062 INFO [train.py:1039] (2/4) Epoch 22, batch 1750, loss[loss=0.1566, simple_loss=0.2384, pruned_loss=0.03738, over 24444.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2498, pruned_loss=0.04922, over 4711316.78 frames. ], batch size: 58, lr: 4.69e-03, grad_scale: 16.0 2023-09-30 15:28:38,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:28:41,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:28:41,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-30 15:28:43,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-30 15:28:43,381 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:28:44,069 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.16 vs. limit=22.5 2023-09-30 15:28:46,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:28:46,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:28:51,102 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=755426.6666666666, ans=0.125 2023-09-30 15:28:52,435 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-30 15:28:54,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:28:57,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-30 15:28:57,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:28:59,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:29:01,541 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=755426.6666666666, ans=0.0 2023-09-30 15:29:02,726 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 15:29:02,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-30 15:29:05,198 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.76 vs. limit=15.0 2023-09-30 15:29:06,031 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:29:06,081 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-30 15:29:08,357 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.00 vs. limit=12.0 2023-09-30 15:29:13,740 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:29:16,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:29:16,837 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:29:17,191 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=755493.3333333334, ans=0.2 2023-09-30 15:29:20,003 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:29:20,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:29:22,355 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:29:24,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:29:28,297 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:29:28,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:29:29,095 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.67 vs. limit=22.5 2023-09-30 15:29:29,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-30 15:29:32,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:29:33,187 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.67 vs. limit=15.0 2023-09-30 15:29:35,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-30 15:29:36,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:29:38,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:29:38,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:29:43,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 15:29:44,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-30 15:29:44,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:29:46,171 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:29:50,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:29:51,645 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.56 vs. limit=6.0 2023-09-30 15:29:52,513 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:29:53,993 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:29:54,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-30 15:29:56,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:29:56,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-30 15:29:56,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:29:56,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-30 15:29:56,643 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=755693.3333333334, ans=0.2 2023-09-30 15:29:58,267 INFO [train.py:1039] (2/4) Epoch 22, batch 1800, loss[loss=0.157, simple_loss=0.2381, pruned_loss=0.03794, over 24676.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.2489, pruned_loss=0.04888, over 4718809.95 frames. ], batch size: 65, lr: 4.69e-03, grad_scale: 16.0 2023-09-30 15:29:58,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:29:58,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-30 15:30:01,986 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:30:02,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:30:02,342 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 15:30:03,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 15:30:05,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:30:10,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 15:30:11,832 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:30:13,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:30:16,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:30:18,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:30:18,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:30:19,771 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:30:19,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-30 15:30:21,151 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:30:22,943 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=755760.0, ans=0.125 2023-09-30 15:30:23,084 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=755760.0, ans=0.0 2023-09-30 15:30:24,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:30:27,886 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-30 15:30:31,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-30 15:30:31,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-30 15:30:33,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:30:33,269 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=755826.6666666666, ans=0.125 2023-09-30 15:30:34,970 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.922e+02 2.224e+02 2.607e+02 3.579e+02, threshold=4.447e+02, percent-clipped=0.0 2023-09-30 15:30:35,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:30:35,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:30:37,197 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:30:42,193 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-30 15:30:43,718 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:30:45,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:30:48,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-30 15:30:49,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-30 15:30:49,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:30:51,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:30:52,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 15:30:53,362 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=755893.3333333334, ans=0.0 2023-09-30 15:30:57,590 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-30 15:31:04,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:31:04,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-30 15:31:05,932 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:31:05,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:31:07,368 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.31 vs. limit=6.0 2023-09-30 15:31:08,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:31:08,075 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-30 15:31:09,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:31:09,790 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:31:10,187 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=755960.0, ans=0.0 2023-09-30 15:31:11,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-30 15:31:11,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:31:15,363 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:31:16,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-30 15:31:16,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:31:18,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:31:18,334 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:31:19,991 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:31:20,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:31:21,269 INFO [train.py:1039] (2/4) Epoch 22, batch 1850, loss[loss=0.1923, simple_loss=0.2641, pruned_loss=0.06025, over 22882.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2498, pruned_loss=0.04929, over 4714123.56 frames. ], batch size: 322, lr: 4.69e-03, grad_scale: 8.0 2023-09-30 15:31:24,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:31:24,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:31:32,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:31:32,280 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-30 15:31:35,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-30 15:31:39,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-30 15:31:44,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:31:44,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-30 15:31:45,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 15:31:54,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:31:55,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-30 15:31:58,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:31:58,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:32:03,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-30 15:32:04,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:32:04,975 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 15:32:06,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:32:08,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:32:13,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:32:17,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-30 15:32:17,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:32:17,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 15:32:17,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:32:18,649 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:32:20,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:32:22,530 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.00 vs. limit=15.0 2023-09-30 15:32:23,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-30 15:32:25,194 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:32:29,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:32:31,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 15:32:31,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-30 15:32:31,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-30 15:32:33,529 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.92 vs. limit=22.5 2023-09-30 15:32:34,192 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-30 15:32:34,319 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-30 15:32:35,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 15:32:35,926 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:32:35,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:32:35,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:32:37,476 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-30 15:32:37,501 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:32:37,556 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:32:39,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-30 15:32:39,304 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 15:32:40,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 15:32:41,990 INFO [train.py:1039] (2/4) Epoch 22, batch 1900, loss[loss=0.1833, simple_loss=0.2503, pruned_loss=0.05812, over 23802.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2503, pruned_loss=0.04956, over 4719132.52 frames. ], batch size: 212, lr: 4.69e-03, grad_scale: 8.0 2023-09-30 15:32:42,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:32:42,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-30 15:32:42,509 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=756360.0, ans=0.125 2023-09-30 15:32:43,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:32:43,792 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-30 15:32:43,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:32:45,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:32:52,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:32:54,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:32:54,809 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=756360.0, ans=0.1 2023-09-30 15:32:56,160 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-30 15:32:57,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-30 15:32:59,150 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:32:59,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:33:01,260 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-30 15:33:01,303 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-30 15:33:04,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-30 15:33:07,349 WARNING [train.py:1197] (2/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:33:09,092 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=756426.6666666666, ans=0.0 2023-09-30 15:33:11,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-30 15:33:12,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-30 15:33:15,563 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=756493.3333333334, ans=0.125 2023-09-30 15:33:18,273 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.802e+02 1.975e+02 2.316e+02 3.738e+02, threshold=3.949e+02, percent-clipped=0.0 2023-09-30 15:33:23,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-30 15:33:27,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-30 15:33:27,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:33:28,028 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-30 15:33:28,034 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-30 15:33:29,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-30 15:33:29,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-30 15:33:29,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:33:32,829 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-30 15:33:36,039 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:33:38,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:33:38,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-30 15:33:41,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:33:41,954 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.42 vs. limit=15.0 2023-09-30 15:33:44,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-30 15:33:45,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-30 15:33:46,128 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=756560.0, ans=0.0 2023-09-30 15:33:53,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:33:53,409 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:33:53,430 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:33:54,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:33:55,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 15:33:57,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-30 15:33:58,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:34:00,898 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:34:00,900 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:34:03,880 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:34:03,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:34:05,172 INFO [train.py:1039] (2/4) Epoch 22, batch 1950, loss[loss=0.1767, simple_loss=0.2454, pruned_loss=0.05401, over 23664.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.2515, pruned_loss=0.05019, over 4705189.19 frames. ], batch size: 149, lr: 4.69e-03, grad_scale: 8.0 2023-09-30 15:34:05,275 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-30 15:34:06,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:34:10,089 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:34:13,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:34:13,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:34:13,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 15:34:18,083 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-30 15:34:18,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 15:34:18,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:34:19,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:34:20,126 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=756760.0, ans=0.125 2023-09-30 15:34:22,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:34:22,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:34:22,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:34:25,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:34:26,230 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=756760.0, ans=0.2 2023-09-30 15:34:29,051 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:34:29,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 15:34:29,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:34:30,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:34:34,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:34:37,152 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.75 vs. limit=15.0 2023-09-30 15:34:38,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:34:38,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:34:38,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-30 15:34:38,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-30 15:34:40,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 15:34:40,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:34:41,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:34:46,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:34:47,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:34:52,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 15:34:57,649 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:34:57,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-30 15:34:57,758 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-30 15:34:59,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:35:02,694 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=756893.3333333334, ans=0.07 2023-09-30 15:35:03,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:35:03,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:35:05,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-30 15:35:05,794 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=756893.3333333334, ans=0.1 2023-09-30 15:35:14,834 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:35:14,952 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:35:15,195 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=756960.0, ans=10.0 2023-09-30 15:35:17,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:35:19,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:35:22,800 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:35:22,864 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:35:24,167 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-30 15:35:24,175 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 15:35:25,690 WARNING [train.py:1197] (2/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:35:26,101 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=757026.6666666666, ans=0.0 2023-09-30 15:35:27,161 INFO [train.py:1039] (2/4) Epoch 22, batch 2000, loss[loss=0.1666, simple_loss=0.2323, pruned_loss=0.05047, over 22701.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2513, pruned_loss=0.04971, over 4709112.31 frames. ], batch size: 322, lr: 4.68e-03, grad_scale: 16.0 2023-09-30 15:35:27,233 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-30 15:35:29,472 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:35:32,584 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-30 15:35:34,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:35:34,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:35:37,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:35:38,679 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:35:41,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-30 15:35:42,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-30 15:35:44,676 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:35:46,332 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-30 15:35:49,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 15:35:49,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:35:52,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:35:54,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-30 15:35:54,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:35:56,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:35:56,297 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=757093.3333333334, ans=0.0 2023-09-30 15:35:57,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:35:57,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-30 15:35:59,098 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 15:36:00,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-30 15:36:00,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:36:04,189 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 2.020e+02 2.308e+02 2.637e+02 3.987e+02, threshold=4.617e+02, percent-clipped=1.0 2023-09-30 15:36:04,361 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:36:05,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-30 15:36:05,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:36:05,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:36:07,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:36:08,887 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-30 15:36:11,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-30 15:36:11,907 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:36:11,919 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:36:17,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:36:18,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:36:18,549 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 15:36:18,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:36:20,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:36:22,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:36:23,814 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 15:36:23,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:36:23,996 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:36:27,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:36:27,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-30 15:36:33,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 15:36:33,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:36:38,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:36:38,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:36:41,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:36:43,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:36:43,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:36:43,651 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=757293.3333333334, ans=0.125 2023-09-30 15:36:45,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 15:36:45,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 15:36:50,065 INFO [train.py:1039] (2/4) Epoch 22, batch 2050, loss[loss=0.171, simple_loss=0.2197, pruned_loss=0.06113, over 19611.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2509, pruned_loss=0.0497, over 4704663.89 frames. ], batch size: 388, lr: 4.68e-03, grad_scale: 16.0 2023-09-30 15:36:50,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:36:51,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:36:53,970 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=757360.0, ans=0.125 2023-09-30 15:36:54,391 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.77 vs. limit=10.0 2023-09-30 15:36:55,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:36:55,257 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:36:55,524 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=757360.0, ans=0.0 2023-09-30 15:36:57,118 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=757360.0, ans=0.125 2023-09-30 15:37:01,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:37:03,032 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:37:03,126 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:37:04,601 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:37:06,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-30 15:37:06,203 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:37:06,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:37:07,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-30 15:37:11,587 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=757426.6666666666, ans=0.125 2023-09-30 15:37:17,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:37:17,608 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:37:19,802 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-30 15:37:22,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:37:22,514 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=757493.3333333334, ans=0.125 2023-09-30 15:37:23,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-30 15:37:23,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:37:27,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:37:30,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:37:32,100 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-30 15:37:32,163 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:37:35,080 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:37:36,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:37:36,552 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:37:39,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:37:41,467 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:37:43,037 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-30 15:37:43,465 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=757560.0, ans=0.2 2023-09-30 15:37:45,129 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:37:48,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 15:37:55,037 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:37:58,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-30 15:38:03,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:38:04,033 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=757626.6666666666, ans=0.1 2023-09-30 15:38:05,241 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:38:07,039 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=757626.6666666666, ans=0.1 2023-09-30 15:38:08,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:38:08,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-30 15:38:09,221 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.65 vs. limit=15.0 2023-09-30 15:38:11,762 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-30 15:38:11,763 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:38:11,860 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:38:13,225 INFO [train.py:1039] (2/4) Epoch 22, batch 2100, loss[loss=0.1844, simple_loss=0.2631, pruned_loss=0.05279, over 24100.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.2499, pruned_loss=0.04945, over 4702011.48 frames. ], batch size: 80, lr: 4.68e-03, grad_scale: 16.0 2023-09-30 15:38:13,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 15:38:13,450 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:38:13,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-30 15:38:14,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-30 15:38:17,732 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 15:38:21,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:38:21,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:38:21,745 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=757693.3333333334, ans=0.1 2023-09-30 15:38:24,436 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:38:24,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:38:24,555 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-30 15:38:26,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:38:28,254 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-30 15:38:28,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-30 15:38:31,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:38:31,185 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:38:31,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-30 15:38:31,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 15:38:33,609 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=757760.0, ans=0.0 2023-09-30 15:38:38,267 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-30 15:38:38,269 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 15:38:38,723 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=757760.0, ans=0.07 2023-09-30 15:38:39,057 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.67 vs. limit=15.0 2023-09-30 15:38:41,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:38:41,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:38:43,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=757760.0, ans=0.125 2023-09-30 15:38:43,222 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=757760.0, ans=0.0 2023-09-30 15:38:46,068 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:38:46,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-30 15:38:47,808 WARNING [train.py:1197] (2/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:38:47,823 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 15:38:49,197 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.822e+02 2.007e+02 2.255e+02 3.053e+02, threshold=4.015e+02, percent-clipped=0.0 2023-09-30 15:38:49,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-30 15:38:49,490 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:38:49,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-30 15:38:50,902 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-30 15:38:50,965 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-30 15:38:52,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-30 15:38:56,226 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:38:57,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 15:38:59,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 15:39:01,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:39:05,078 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:39:05,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-30 15:39:05,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:39:05,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:39:05,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:39:06,635 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-30 15:39:08,892 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-30 15:39:08,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-30 15:39:13,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:39:15,296 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=757893.3333333334, ans=0.125 2023-09-30 15:39:17,993 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:39:18,056 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-30 15:39:22,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:39:24,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:39:25,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:39:25,930 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:39:25,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-30 15:39:26,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:39:27,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:39:27,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-30 15:39:29,377 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=757960.0, ans=0.125 2023-09-30 15:39:31,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:39:31,284 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:39:31,580 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=757960.0, ans=0.0 2023-09-30 15:39:32,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-30 15:39:34,448 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-30 15:39:34,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:39:36,500 INFO [train.py:1039] (2/4) Epoch 22, batch 2150, loss[loss=0.1806, simple_loss=0.2447, pruned_loss=0.05822, over 22729.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2494, pruned_loss=0.04917, over 4714216.34 frames. ], batch size: 322, lr: 4.68e-03, grad_scale: 16.0 2023-09-30 15:39:37,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:39:37,326 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:39:37,374 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:39:38,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:39:44,243 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 15:39:45,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:39:47,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:39:49,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:39:49,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:39:50,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:39:53,618 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:39:53,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:39:53,720 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:39:56,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:39:58,215 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-30 15:40:01,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:40:05,182 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:40:07,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:40:07,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:40:07,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:40:07,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-30 15:40:08,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:40:08,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:40:10,250 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:40:10,389 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-30 15:40:12,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:40:14,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:40:14,693 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:40:16,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:40:16,346 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:40:19,394 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:40:20,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:40:21,096 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=758160.0, ans=0.125 2023-09-30 15:40:22,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:40:22,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-30 15:40:22,377 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-30 15:40:24,031 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:40:25,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:40:26,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:40:28,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 15:40:28,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:40:30,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:40:30,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-30 15:40:32,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-30 15:40:33,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:40:34,534 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-30 15:40:34,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:40:34,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:40:36,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-30 15:40:36,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:40:36,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-30 15:40:37,575 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-30 15:40:37,576 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-30 15:40:37,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-30 15:40:39,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:40:40,060 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:40:40,077 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:40:41,519 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:40:42,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 15:40:44,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:40:44,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:40:52,118 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:40:52,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-30 15:40:58,081 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:40:59,536 INFO [train.py:1039] (2/4) Epoch 22, batch 2200, loss[loss=0.1641, simple_loss=0.2536, pruned_loss=0.03731, over 24451.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2489, pruned_loss=0.04924, over 4712273.18 frames. ], batch size: 69, lr: 4.68e-03, grad_scale: 8.0 2023-09-30 15:41:00,051 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=758360.0, ans=0.125 2023-09-30 15:41:01,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:41:02,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:41:02,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:41:04,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-30 15:41:07,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:41:08,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:41:08,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-30 15:41:14,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-30 15:41:17,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 15:41:24,474 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-30 15:41:26,177 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:41:26,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:41:27,685 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:41:30,981 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:41:31,028 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-30 15:41:34,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-30 15:41:34,414 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=758493.3333333334, ans=0.125 2023-09-30 15:41:36,961 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.488e+02 1.820e+02 1.960e+02 2.258e+02 2.788e+02, threshold=3.920e+02, percent-clipped=0.0 2023-09-30 15:41:37,091 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:41:37,462 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=758493.3333333334, ans=0.125 2023-09-30 15:41:38,502 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-30 15:41:41,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:41:43,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:41:45,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:41:47,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:41:48,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-30 15:41:50,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:41:53,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-30 15:41:56,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:41:56,572 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-30 15:41:56,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:41:58,347 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=758560.0, ans=0.125 2023-09-30 15:41:59,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:41:59,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:41:59,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:41:59,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:42:01,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-30 15:42:01,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:42:04,346 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 15:42:07,391 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 15:42:08,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:42:10,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-30 15:42:10,770 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=758626.6666666666, ans=0.1 2023-09-30 15:42:12,064 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-30 15:42:12,403 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=758626.6666666666, ans=0.1 2023-09-30 15:42:13,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 15:42:15,373 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-30 15:42:15,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-30 15:42:16,925 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-30 15:42:18,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:42:18,955 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.93 vs. limit=15.0 2023-09-30 15:42:20,477 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-30 15:42:20,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:42:21,837 INFO [train.py:1039] (2/4) Epoch 22, batch 2250, loss[loss=0.1896, simple_loss=0.2613, pruned_loss=0.05891, over 23599.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2499, pruned_loss=0.04931, over 4718062.40 frames. ], batch size: 256, lr: 4.68e-03, grad_scale: 8.0 2023-09-30 15:42:21,999 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-30 15:42:25,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:42:27,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:42:34,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:42:35,866 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:42:39,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:42:39,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 15:42:40,627 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:42:40,906 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=758760.0, ans=0.2 2023-09-30 15:42:42,308 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-30 15:42:42,326 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:42:42,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:42:43,976 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-30 15:42:45,528 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:42:45,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:42:47,171 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 15:42:51,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:42:53,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 15:42:55,072 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-30 15:42:56,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-30 15:42:58,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:43:03,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:43:07,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:43:07,740 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=758826.6666666666, ans=0.125 2023-09-30 15:43:08,341 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.66 vs. limit=15.0 2023-09-30 15:43:09,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:43:10,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:43:10,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:43:13,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:43:15,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:43:19,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:43:21,164 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-30 15:43:21,930 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.71 vs. limit=6.0 2023-09-30 15:43:25,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 15:43:25,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-30 15:43:27,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:43:28,387 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=758960.0, ans=0.1 2023-09-30 15:43:33,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 15:43:34,232 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=758960.0, ans=0.0 2023-09-30 15:43:35,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-30 15:43:35,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-30 15:43:35,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:43:37,448 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:43:41,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-30 15:43:44,066 INFO [train.py:1039] (2/4) Epoch 22, batch 2300, loss[loss=0.1765, simple_loss=0.2621, pruned_loss=0.04542, over 24340.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2511, pruned_loss=0.0495, over 4714264.02 frames. ], batch size: 74, lr: 4.68e-03, grad_scale: 8.0 2023-09-30 15:43:44,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:43:44,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:43:50,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:43:50,561 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:43:53,673 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-30 15:43:55,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:44:02,277 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:44:03,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-30 15:44:03,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:44:03,952 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=759093.3333333334, ans=0.125 2023-09-30 15:44:05,050 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:44:05,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-30 15:44:05,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:44:08,182 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:44:08,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:44:11,868 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:44:14,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:44:18,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:44:21,553 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.502e+02 1.881e+02 2.122e+02 2.530e+02 4.417e+02, threshold=4.245e+02, percent-clipped=2.0 2023-09-30 15:44:24,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:44:24,870 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:44:28,026 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:44:31,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:44:35,719 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:44:36,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 15:44:37,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:44:37,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-30 15:44:42,600 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 15:44:42,603 WARNING [train.py:1197] (2/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:44:43,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:44:44,028 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:44:45,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:44:45,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 15:44:45,570 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-30 15:44:47,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-30 15:44:47,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:44:47,129 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:44:49,249 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-30 15:44:55,359 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:44:58,585 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=759293.3333333334, ans=0.2 2023-09-30 15:44:59,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:45:04,678 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:45:04,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:45:04,764 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-30 15:45:06,150 INFO [train.py:1039] (2/4) Epoch 22, batch 2350, loss[loss=0.1634, simple_loss=0.2505, pruned_loss=0.03811, over 24675.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2513, pruned_loss=0.04995, over 4709282.06 frames. ], batch size: 73, lr: 4.68e-03, grad_scale: 8.0 2023-09-30 15:45:07,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 15:45:07,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:45:07,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:45:09,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-30 15:45:13,465 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=759360.0, ans=0.0 2023-09-30 15:45:14,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:45:14,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-30 15:45:22,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-30 15:45:25,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:45:29,475 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:45:29,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:45:29,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:45:29,554 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:45:31,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-30 15:45:34,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:45:38,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-30 15:45:40,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:45:43,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 15:45:43,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:45:47,170 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:45:48,574 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-30 15:45:48,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:45:49,019 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=759493.3333333334, ans=0.125 2023-09-30 15:45:51,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:45:51,681 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:45:51,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:45:54,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:45:56,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-30 15:45:58,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:46:02,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:46:03,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:46:05,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-30 15:46:05,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-30 15:46:08,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-30 15:46:08,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-30 15:46:13,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-30 15:46:17,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-30 15:46:19,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:46:19,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-30 15:46:19,955 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-30 15:46:20,593 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=6.00 vs. limit=12.0 2023-09-30 15:46:21,333 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-30 15:46:22,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-30 15:46:26,058 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:46:27,432 INFO [train.py:1039] (2/4) Epoch 22, batch 2400, loss[loss=0.1901, simple_loss=0.2687, pruned_loss=0.05574, over 23758.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2506, pruned_loss=0.04971, over 4705189.95 frames. ], batch size: 85, lr: 4.68e-03, grad_scale: 16.0 2023-09-30 15:46:30,772 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:46:34,913 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:46:35,097 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:46:37,093 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-30 15:46:37,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-30 15:46:44,931 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 15:46:44,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:46:46,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-30 15:46:47,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:46:49,464 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:46:49,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-30 15:46:57,378 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:46:59,009 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-30 15:47:02,286 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-30 15:47:05,701 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.822e+02 2.005e+02 2.211e+02 3.199e+02, threshold=4.010e+02, percent-clipped=0.0 2023-09-30 15:47:08,015 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-30 15:47:11,575 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:47:13,232 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:47:16,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:47:17,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-30 15:47:18,132 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=759893.3333333334, ans=0.125 2023-09-30 15:47:19,250 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 15:47:21,365 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=759893.3333333334, ans=0.0 2023-09-30 15:47:24,271 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:47:25,252 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=759893.3333333334, ans=0.125 2023-09-30 15:47:26,717 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=759893.3333333334, ans=0.125 2023-09-30 15:47:27,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:47:30,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:47:32,406 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:47:32,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-30 15:47:32,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:47:32,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:47:33,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:47:34,010 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 15:47:38,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:47:38,795 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=759960.0, ans=0.015 2023-09-30 15:47:39,017 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=759960.0, ans=0.125 2023-09-30 15:47:40,616 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 15:47:40,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-30 15:47:42,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-30 15:47:44,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:47:44,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:47:46,213 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-30 15:47:46,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-30 15:47:46,587 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=759960.0, ans=0.0 2023-09-30 15:47:47,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-30 15:47:47,702 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-30 15:47:47,844 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-30 15:47:49,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:47:50,800 INFO [train.py:1039] (2/4) Epoch 22, batch 2450, loss[loss=0.1604, simple_loss=0.2225, pruned_loss=0.04919, over 23420.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2494, pruned_loss=0.04919, over 4703545.80 frames. ], batch size: 285, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 15:47:50,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:47:50,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:47:52,387 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-30 15:47:52,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:47:52,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-30 15:47:57,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:47:57,299 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:48:00,988 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:48:00,998 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:48:02,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-30 15:48:07,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:48:08,619 WARNING [train.py:1197] (2/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:48:10,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:48:10,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:48:11,938 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:48:11,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-30 15:48:17,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:48:20,307 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 15:48:20,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:48:23,632 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-30 15:48:23,684 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:48:25,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:48:25,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:48:28,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-30 15:48:29,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:48:38,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:48:39,731 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:48:41,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:48:41,194 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:48:41,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:48:42,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:48:43,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-30 15:48:46,881 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:48:48,301 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:48:51,264 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten.whitening_limit, batch_count=760226.6666666666, ans=15.0 2023-09-30 15:48:52,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:48:53,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:48:58,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-30 15:48:58,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-30 15:48:58,347 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:48:59,826 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:48:59,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-30 15:49:00,066 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=760293.3333333334, ans=0.125 2023-09-30 15:49:01,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:49:01,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:49:06,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:49:10,103 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:49:10,178 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:49:13,148 INFO [train.py:1039] (2/4) Epoch 22, batch 2500, loss[loss=0.169, simple_loss=0.254, pruned_loss=0.04202, over 24432.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2486, pruned_loss=0.04889, over 4705211.48 frames. ], batch size: 69, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 15:49:13,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-30 15:49:14,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:49:21,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:49:31,744 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 15:49:31,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:49:32,100 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=760426.6666666666, ans=0.125 2023-09-30 15:49:33,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:49:33,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-30 15:49:40,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 15:49:42,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:49:42,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-30 15:49:42,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 15:49:44,711 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-30 15:49:44,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:49:46,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:49:46,339 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-30 15:49:46,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:49:47,804 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-30 15:49:47,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:49:50,874 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.767e+02 1.934e+02 2.176e+02 2.965e+02, threshold=3.869e+02, percent-clipped=0.0 2023-09-30 15:49:52,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:49:52,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:49:55,149 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=760493.3333333334, ans=0.025 2023-09-30 15:49:56,488 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 15:49:56,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-30 15:49:56,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:49:58,852 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:50:03,687 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:50:08,092 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:50:09,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:50:14,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-30 15:50:16,737 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-30 15:50:18,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:50:18,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-30 15:50:18,401 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=760626.6666666666, ans=0.0 2023-09-30 15:50:19,804 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:50:19,805 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 15:50:21,236 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-30 15:50:21,237 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-30 15:50:21,255 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-30 15:50:24,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:50:25,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-30 15:50:27,936 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-30 15:50:28,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:50:30,140 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-30 15:50:33,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-30 15:50:36,284 INFO [train.py:1039] (2/4) Epoch 22, batch 2550, loss[loss=0.1745, simple_loss=0.2495, pruned_loss=0.0497, over 23796.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2489, pruned_loss=0.04874, over 4694414.64 frames. ], batch size: 232, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 15:50:36,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:50:37,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:50:37,951 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:50:40,979 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:50:41,094 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-30 15:50:42,477 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-30 15:50:44,476 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=760693.3333333334, ans=0.07 2023-09-30 15:50:45,754 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-30 15:50:47,227 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:50:49,585 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:50:52,582 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:50:52,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 15:50:53,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 15:50:54,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:50:54,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:50:55,972 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=760760.0, ans=0.125 2023-09-30 15:50:57,165 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:50:57,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-30 15:50:58,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-30 15:50:58,631 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:50:58,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-30 15:51:02,940 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=760760.0, ans=0.125 2023-09-30 15:51:10,981 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:51:11,167 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=760826.6666666666, ans=0.125 2023-09-30 15:51:13,331 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.23 vs. limit=12.0 2023-09-30 15:51:16,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:51:16,898 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:51:16,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:51:17,072 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 15:51:23,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:51:26,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 15:51:26,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:51:26,970 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:51:28,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-30 15:51:28,468 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-30 15:51:31,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:51:31,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:51:38,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:51:38,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-30 15:51:38,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:51:40,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:51:41,900 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:51:43,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 15:51:45,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:51:51,298 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:51:54,238 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:51:57,751 INFO [train.py:1039] (2/4) Epoch 22, batch 2600, loss[loss=0.1653, simple_loss=0.2412, pruned_loss=0.04467, over 23623.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.2503, pruned_loss=0.04922, over 4691981.57 frames. ], batch size: 135, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 15:51:57,839 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-30 15:51:59,460 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-30 15:51:59,488 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:51:59,539 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-30 15:51:59,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-30 15:52:01,026 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-30 15:52:02,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:52:02,755 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-30 15:52:04,313 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-30 15:52:05,855 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-30 15:52:09,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:52:12,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-30 15:52:14,362 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-30 15:52:15,887 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-30 15:52:15,948 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-30 15:52:17,751 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=761093.3333333334, ans=0.1 2023-09-30 15:52:18,927 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-30 15:52:18,954 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-30 15:52:26,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:52:26,691 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:52:28,078 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:52:28,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-30 15:52:29,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:52:32,024 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=761160.0, ans=0.0 2023-09-30 15:52:34,441 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.852e+02 2.092e+02 2.401e+02 4.337e+02, threshold=4.185e+02, percent-clipped=2.0 2023-09-30 15:52:34,816 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=761160.0, ans=0.0 2023-09-30 15:52:37,583 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-30 15:52:43,760 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:52:45,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:52:45,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-30 15:52:47,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:52:47,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:52:47,559 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-30 15:52:49,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-30 15:52:49,941 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=761226.6666666666, ans=0.125 2023-09-30 15:52:51,086 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:52:52,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:52:57,171 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-30 15:52:57,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:52:58,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:53:04,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:53:05,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:53:05,027 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-30 15:53:07,089 WARNING [train.py:1197] (2/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:53:08,674 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:53:10,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:53:16,532 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-30 15:53:18,113 INFO [train.py:1039] (2/4) Epoch 22, batch 2650, loss[loss=0.1538, simple_loss=0.2362, pruned_loss=0.03575, over 24306.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2505, pruned_loss=0.04896, over 4708025.86 frames. ], batch size: 61, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 15:53:18,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:53:20,520 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 15:53:24,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-30 15:53:24,331 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:53:26,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 15:53:27,932 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-30 15:53:27,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:53:29,548 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:53:31,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 15:53:32,766 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:53:33,044 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=761360.0, ans=0.0 2023-09-30 15:53:35,925 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:53:36,043 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-30 15:53:36,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:53:37,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:53:37,759 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=761426.6666666666, ans=0.125 2023-09-30 15:53:40,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-30 15:53:42,573 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-30 15:53:44,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:53:48,480 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-30 15:53:49,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:53:49,973 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-30 15:53:54,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:53:54,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-30 15:53:54,604 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:53:56,577 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:53:59,859 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-30 15:53:59,869 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-30 15:54:03,738 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=761493.3333333334, ans=0.0 2023-09-30 15:54:05,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-30 15:54:09,628 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-30 15:54:09,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:54:09,797 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:54:11,222 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-30 15:54:11,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:54:11,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:54:11,551 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=761560.0, ans=0.125 2023-09-30 15:54:12,930 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:54:14,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:54:14,673 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:54:16,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-30 15:54:18,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:54:19,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:54:19,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:54:21,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:54:22,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:54:22,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-30 15:54:24,819 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=761626.6666666666, ans=0.125 2023-09-30 15:54:26,134 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:54:27,562 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:54:27,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:54:29,007 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-30 15:54:31,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:54:31,677 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=761626.6666666666, ans=0.2 2023-09-30 15:54:34,825 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:54:36,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:54:37,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:54:39,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-30 15:54:39,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:54:40,387 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.22 vs. limit=15.0 2023-09-30 15:54:41,044 INFO [train.py:1039] (2/4) Epoch 22, batch 2700, loss[loss=0.1648, simple_loss=0.2502, pruned_loss=0.03975, over 24308.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2509, pruned_loss=0.04878, over 4712661.64 frames. ], batch size: 61, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 15:54:42,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:54:42,553 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-30 15:54:44,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:54:47,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 15:54:50,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:54:50,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:54:50,379 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:54:51,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:54:52,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:54:52,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:54:52,522 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-30 15:54:52,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-30 15:54:52,649 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:54:54,267 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-30 15:54:55,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 15:54:57,210 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:55:01,004 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-30 15:55:01,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-30 15:55:02,607 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:55:08,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:55:08,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:55:14,889 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:55:14,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:55:14,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:55:16,504 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:55:19,394 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.664e+02 1.905e+02 2.101e+02 2.408e+02 3.392e+02, threshold=4.202e+02, percent-clipped=0.0 2023-09-30 15:55:19,667 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:55:22,707 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:55:22,718 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-30 15:55:22,740 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:55:24,449 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=761826.6666666666, ans=0.2 2023-09-30 15:55:29,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:55:29,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:55:35,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:55:37,845 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:55:41,506 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:55:41,509 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:55:43,884 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:55:45,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:55:46,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:55:47,027 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:55:48,574 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:55:48,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:55:52,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:55:53,260 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=761960.0, ans=0.0 2023-09-30 15:55:54,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:55:54,380 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:55:57,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-30 15:55:57,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:55:59,860 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:55:59,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-30 15:56:01,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-30 15:56:02,825 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:56:04,255 INFO [train.py:1039] (2/4) Epoch 22, batch 2750, loss[loss=0.1759, simple_loss=0.2504, pruned_loss=0.0507, over 23548.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2506, pruned_loss=0.04846, over 4716777.05 frames. ], batch size: 149, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 15:56:05,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:56:05,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:56:07,457 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:56:07,731 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=762026.6666666666, ans=0.125 2023-09-30 15:56:09,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-30 15:56:09,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:56:14,908 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:56:14,973 WARNING [train.py:1197] (2/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 15:56:15,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:56:16,382 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:56:16,384 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-30 15:56:16,400 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:56:16,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:56:22,571 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-30 15:56:24,125 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:56:24,205 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:56:25,579 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:56:25,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-30 15:56:27,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:56:28,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:56:30,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:56:30,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:56:36,291 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 15:56:36,341 WARNING [train.py:1197] (2/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 15:56:36,397 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:56:38,454 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:56:39,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 15:56:45,548 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=762160.0, ans=0.2 2023-09-30 15:56:47,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:56:50,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 15:56:50,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:56:55,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:56:55,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-30 15:56:55,695 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 15:57:01,863 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:57:01,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:57:01,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-30 15:57:06,542 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=762226.6666666666, ans=0.125 2023-09-30 15:57:07,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:57:09,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-30 15:57:16,016 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-30 15:57:18,183 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:57:18,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-30 15:57:19,738 WARNING [train.py:1197] (2/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:57:19,963 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:57:21,828 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-30 15:57:21,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-30 15:57:25,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-30 15:57:26,874 INFO [train.py:1039] (2/4) Epoch 22, batch 2800, loss[loss=0.167, simple_loss=0.2453, pruned_loss=0.04441, over 23219.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2502, pruned_loss=0.04812, over 4732695.12 frames. ], batch size: 93, lr: 4.67e-03, grad_scale: 32.0 2023-09-30 15:57:26,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:57:26,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:57:28,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-30 15:57:28,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:57:28,594 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:57:30,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:57:31,561 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-30 15:57:31,562 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-30 15:57:31,970 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=762360.0, ans=0.0 2023-09-30 15:57:34,669 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:57:34,893 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=762360.0, ans=0.125 2023-09-30 15:57:37,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 15:57:37,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:57:40,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:57:42,498 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-30 15:57:44,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-30 15:57:45,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-30 15:57:49,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:57:49,211 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:57:49,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:57:54,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:57:54,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:57:54,497 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-30 15:57:56,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:58:04,782 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.836e+02 2.011e+02 2.397e+02 3.435e+02, threshold=4.022e+02, percent-clipped=0.0 2023-09-30 15:58:06,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:58:07,976 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:58:09,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:58:11,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:58:11,127 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:58:15,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:58:15,966 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-30 15:58:17,439 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:58:18,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:58:18,966 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:58:23,020 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=762560.0, ans=0.125 2023-09-30 15:58:24,163 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:58:26,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:58:29,526 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:58:29,844 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=762560.0, ans=0.0 2023-09-30 15:58:31,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:58:31,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:58:31,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 15:58:33,307 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 15:58:33,403 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:58:35,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:58:35,525 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-30 15:58:36,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:58:38,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:58:38,478 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:58:39,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-30 15:58:41,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:58:41,557 WARNING [train.py:1197] (2/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:58:43,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:58:43,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-30 15:58:49,608 INFO [train.py:1039] (2/4) Epoch 22, batch 2850, loss[loss=0.1584, simple_loss=0.2381, pruned_loss=0.03934, over 23414.00 frames. ], tot_loss[loss=0.1727, simple_loss=0.2491, pruned_loss=0.04813, over 4720020.15 frames. ], batch size: 119, lr: 4.67e-03, grad_scale: 32.0 2023-09-30 15:58:49,811 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:58:49,833 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 15:58:51,147 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:58:54,046 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:58:57,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:58:59,904 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:58:59,941 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:59:02,947 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:59:03,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:59:04,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-30 15:59:05,949 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-30 15:59:11,914 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-30 15:59:11,925 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:59:12,178 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=762760.0, ans=0.0 2023-09-30 15:59:13,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-30 15:59:14,917 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:59:15,052 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=762760.0, ans=0.125 2023-09-30 15:59:16,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-30 15:59:18,007 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-30 15:59:19,874 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=762760.0, ans=0.0 2023-09-30 15:59:21,038 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:59:24,695 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=762826.6666666666, ans=0.2 2023-09-30 15:59:32,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:59:32,977 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:59:34,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:59:36,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 15:59:36,461 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 15:59:36,507 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:59:36,663 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=762826.6666666666, ans=0.1 2023-09-30 15:59:38,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 15:59:39,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-30 15:59:41,234 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-30 15:59:41,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:59:41,344 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:59:42,791 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:59:46,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:59:46,543 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=762893.3333333334, ans=0.125 2023-09-30 15:59:47,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:59:49,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:59:52,207 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:59:53,734 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:59:53,835 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:59:55,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:59:58,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:00:01,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:00:03,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-30 16:00:05,223 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-30 16:00:06,776 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 16:00:06,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:00:06,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-30 16:00:06,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:00:07,521 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.16 vs. limit=22.5 2023-09-30 16:00:08,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:00:08,462 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:00:10,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:00:10,539 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-30 16:00:10,607 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-30 16:00:10,613 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:00:10,715 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:00:11,975 INFO [train.py:1039] (2/4) Epoch 22, batch 2900, loss[loss=0.1702, simple_loss=0.2483, pruned_loss=0.04609, over 23434.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2487, pruned_loss=0.04828, over 4714527.24 frames. ], batch size: 134, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 16:00:15,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-30 16:00:16,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:00:16,733 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:00:18,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-30 16:00:22,267 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=763026.6666666666, ans=0.125 2023-09-30 16:00:23,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:00:23,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-30 16:00:25,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-30 16:00:26,751 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:00:26,755 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-30 16:00:28,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:00:28,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:00:32,954 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:00:33,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:00:36,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-30 16:00:38,055 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-30 16:00:38,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-30 16:00:39,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:00:43,310 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-30 16:00:43,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-30 16:00:43,781 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=763160.0, ans=0.0 2023-09-30 16:00:46,496 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:00:46,500 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-30 16:00:46,536 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:00:50,816 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.778e+02 1.944e+02 2.291e+02 4.038e+02, threshold=3.888e+02, percent-clipped=1.0 2023-09-30 16:00:50,944 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:00:50,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-30 16:00:52,666 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:00:54,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:00:58,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:01:01,629 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:01:03,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-30 16:01:03,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-30 16:01:03,248 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:01:07,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:01:09,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-30 16:01:11,684 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:01:16,831 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:01:26,023 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:01:26,062 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-30 16:01:27,520 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-30 16:01:32,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:01:32,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-30 16:01:32,582 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:01:32,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-30 16:01:34,063 INFO [train.py:1039] (2/4) Epoch 22, batch 2950, loss[loss=0.1917, simple_loss=0.2667, pruned_loss=0.05831, over 23315.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2495, pruned_loss=0.04831, over 4714354.00 frames. ], batch size: 105, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:01:37,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:01:40,637 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-30 16:01:42,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:01:42,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:01:43,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:01:47,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:01:47,219 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-30 16:01:48,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-30 16:01:48,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 16:01:48,903 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:01:49,575 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.76 vs. limit=6.0 2023-09-30 16:01:55,724 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:01:58,625 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:02:00,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:02:01,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:02:05,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:02:05,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:02:09,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:02:09,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:02:09,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:02:12,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-30 16:02:17,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-30 16:02:17,132 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-30 16:02:17,266 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 16:02:18,852 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-30 16:02:19,069 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=763493.3333333334, ans=0.125 2023-09-30 16:02:21,717 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-30 16:02:21,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:02:22,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:02:22,512 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-30 16:02:22,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-30 16:02:25,534 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-30 16:02:25,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:02:25,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:02:28,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:02:30,440 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:02:30,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:02:30,542 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-30 16:02:32,020 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:02:32,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-30 16:02:38,845 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=763626.6666666666, ans=0.07 2023-09-30 16:02:40,083 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:02:40,642 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.02 vs. limit=15.0 2023-09-30 16:02:42,033 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-30 16:02:42,130 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-30 16:02:42,175 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:02:43,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-30 16:02:45,485 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=763626.6666666666, ans=0.125 2023-09-30 16:02:48,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:02:49,747 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:02:51,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 16:02:51,373 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:02:51,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 16:02:52,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:02:55,885 INFO [train.py:1039] (2/4) Epoch 22, batch 3000, loss[loss=0.1655, simple_loss=0.2547, pruned_loss=0.0382, over 24447.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2505, pruned_loss=0.04814, over 4722156.23 frames. ], batch size: 69, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:02:55,886 INFO [train.py:1062] (2/4) Computing validation loss 2023-09-30 16:03:05,599 INFO [zipformer.py:1853] (2/4) name=encoder.encoders.3.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([3.5400, 2.1226, 2.7181, 2.8806, 2.6089, 2.9379, 2.9953, 3.1017], device='cuda:2') 2023-09-30 16:03:10,444 INFO [train.py:1071] (2/4) Epoch 22, validation: loss=0.3133, simple_loss=0.2748, pruned_loss=0.1759, over 1125622.00 frames. 2023-09-30 16:03:10,445 INFO [train.py:1072] (2/4) Maximum memory allocated so far is 21076MB 2023-09-30 16:03:10,505 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:03:10,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-30 16:03:10,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:03:10,708 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:03:12,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:03:12,843 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:03:12,864 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-30 16:03:15,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:03:17,334 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.03 vs. limit=15.0 2023-09-30 16:03:18,129 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:03:18,212 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-30 16:03:21,327 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-30 16:03:22,742 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-30 16:03:24,466 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-30 16:03:25,746 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:03:25,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-30 16:03:26,094 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=763760.0, ans=0.125 2023-09-30 16:03:26,177 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=763760.0, ans=0.125 2023-09-30 16:03:27,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:03:34,528 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 16:03:43,893 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:03:50,838 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.854e+02 2.053e+02 2.240e+02 3.574e+02, threshold=4.107e+02, percent-clipped=0.0 2023-09-30 16:03:52,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-30 16:03:52,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:03:55,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 16:03:57,119 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:03:57,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:04:00,003 WARNING [train.py:1197] (2/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:04:00,008 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-30 16:04:00,248 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-30 16:04:02,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:04:03,743 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 16:04:05,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 16:04:07,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 16:04:07,340 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:04:07,343 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:04:10,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 16:04:11,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:04:11,880 WARNING [train.py:1197] (2/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:04:15,002 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 16:04:15,412 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=763960.0, ans=0.0 2023-09-30 16:04:16,699 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-30 16:04:18,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:04:18,206 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:04:18,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:04:22,284 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=763960.0, ans=0.0 2023-09-30 16:04:23,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:04:24,756 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:04:24,916 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-30 16:04:24,984 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-30 16:04:27,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:04:27,648 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-30 16:04:29,055 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:04:30,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-30 16:04:32,389 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-30 16:04:32,507 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 16:04:32,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-30 16:04:32,659 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-30 16:04:32,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 16:04:34,064 INFO [train.py:1039] (2/4) Epoch 22, batch 3050, loss[loss=0.1678, simple_loss=0.2427, pruned_loss=0.0464, over 24485.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2514, pruned_loss=0.04881, over 4718339.20 frames. ], batch size: 63, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:04:34,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:04:36,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:04:36,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-30 16:04:36,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:04:38,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:04:40,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-30 16:04:44,678 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:04:46,242 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:04:46,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 16:04:48,194 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=764026.6666666666, ans=0.1 2023-09-30 16:04:49,565 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:04:53,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-30 16:04:59,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-30 16:04:59,289 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-30 16:05:01,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:05:04,556 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:05:07,647 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:05:07,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:05:07,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:05:11,370 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:05:12,341 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.24 vs. limit=15.0 2023-09-30 16:05:13,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-30 16:05:13,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:05:13,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:05:13,531 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:05:16,399 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:05:18,048 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:05:18,741 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=764160.0, ans=15.0 2023-09-30 16:05:21,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:05:21,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-30 16:05:22,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:05:22,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 16:05:25,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:05:25,917 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 16:05:27,394 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:05:27,498 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:05:27,661 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=764226.6666666666, ans=0.0 2023-09-30 16:05:33,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:05:35,436 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:05:40,797 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:05:40,874 WARNING [train.py:1197] (2/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:05:40,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:05:41,221 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=764293.3333333334, ans=0.125 2023-09-30 16:05:42,533 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:05:44,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 16:05:44,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:05:46,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-30 16:05:49,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:05:49,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:05:49,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-30 16:05:51,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:05:56,017 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:05:57,365 INFO [train.py:1039] (2/4) Epoch 22, batch 3100, loss[loss=0.1756, simple_loss=0.2348, pruned_loss=0.05822, over 22545.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.251, pruned_loss=0.04868, over 4716803.74 frames. ], batch size: 322, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:05:58,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 16:05:59,341 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 16:06:00,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 16:06:03,543 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-30 16:06:07,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-30 16:06:07,512 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-30 16:06:10,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:06:13,912 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:06:13,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:06:17,032 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-30 16:06:17,341 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=764426.6666666666, ans=0.125 2023-09-30 16:06:20,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:06:25,928 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-30 16:06:31,916 WARNING [train.py:1197] (2/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 16:06:33,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:06:33,330 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:06:34,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:06:34,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-30 16:06:34,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:06:35,020 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-30 16:06:35,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:06:35,252 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=764493.3333333334, ans=0.07 2023-09-30 16:06:36,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:06:38,093 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 1.888e+02 2.068e+02 2.233e+02 3.046e+02, threshold=4.136e+02, percent-clipped=0.0 2023-09-30 16:06:38,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-30 16:06:40,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:06:45,458 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-30 16:06:45,544 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-30 16:06:47,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-30 16:06:47,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:06:47,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:06:50,888 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:06:50,927 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:06:52,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:06:52,580 WARNING [train.py:1197] (2/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:06:52,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:06:55,405 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:06:55,466 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:06:55,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:06:55,487 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 16:07:00,698 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:07:00,919 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=764560.0, ans=0.0 2023-09-30 16:07:02,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-30 16:07:03,801 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-30 16:07:05,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-30 16:07:06,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:07:06,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:07:06,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-30 16:07:18,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-30 16:07:20,439 INFO [train.py:1039] (2/4) Epoch 22, batch 3150, loss[loss=0.183, simple_loss=0.2511, pruned_loss=0.0575, over 23734.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2497, pruned_loss=0.04856, over 4721653.01 frames. ], batch size: 164, lr: 4.66e-03, grad_scale: 8.0 2023-09-30 16:07:22,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:07:22,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:07:23,941 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:07:23,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:07:24,046 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-30 16:07:26,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:07:26,090 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-30 16:07:26,289 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=764693.3333333334, ans=0.125 2023-09-30 16:07:27,600 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-30 16:07:30,946 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:07:32,433 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-30 16:07:35,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-30 16:07:35,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:07:37,084 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-30 16:07:38,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-30 16:07:38,817 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=764760.0, ans=0.2 2023-09-30 16:07:40,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-30 16:07:41,623 WARNING [train.py:1197] (2/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-30 16:07:41,626 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-30 16:07:41,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:07:41,670 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:07:43,277 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:07:43,871 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.22 vs. limit=15.0 2023-09-30 16:07:44,859 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-30 16:07:46,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:07:46,466 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:07:48,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:07:50,603 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-30 16:07:55,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-30 16:07:55,195 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:07:58,152 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-30 16:08:00,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:08:00,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-30 16:08:03,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-30 16:08:05,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:08:05,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 16:08:05,345 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 16:08:06,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:08:06,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 16:08:09,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-30 16:08:09,661 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-30 16:08:09,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-30 16:08:11,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 16:08:11,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:08:12,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:08:12,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:08:12,949 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-30 16:08:13,181 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=764893.3333333334, ans=0.125 2023-09-30 16:08:14,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:08:17,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-30 16:08:17,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:08:17,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-30 16:08:19,196 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-30 16:08:20,677 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:08:20,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:08:22,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-30 16:08:24,242 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 16:08:24,324 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:08:26,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:08:27,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:08:29,412 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:08:34,515 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 16:08:34,724 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=764960.0, ans=0.125 2023-09-30 16:08:36,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:08:38,052 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-30 16:08:39,731 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=764960.0, ans=0.1 2023-09-30 16:08:39,791 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=764960.0, ans=0.125 2023-09-30 16:08:42,530 INFO [train.py:1039] (2/4) Epoch 22, batch 3200, loss[loss=0.1642, simple_loss=0.2545, pruned_loss=0.03694, over 24300.00 frames. ], tot_loss[loss=0.1719, simple_loss=0.2478, pruned_loss=0.04794, over 4715016.51 frames. ], batch size: 74, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:08:44,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:08:44,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-30 16:08:46,589 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.76 vs. limit=6.0 2023-09-30 16:08:48,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:08:48,867 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:08:49,093 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 16:08:50,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-30 16:08:53,336 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:08:57,210 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:09:00,373 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:09:04,549 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=765093.3333333334, ans=0.04949747468305833 2023-09-30 16:09:05,899 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=765093.3333333334, ans=0.1 2023-09-30 16:09:08,699 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.63 vs. limit=15.0 2023-09-30 16:09:11,414 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:09:11,633 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=765093.3333333334, ans=0.125 2023-09-30 16:09:20,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-30 16:09:22,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:09:23,570 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.875e+02 2.082e+02 2.411e+02 3.393e+02, threshold=4.163e+02, percent-clipped=0.0 2023-09-30 16:09:25,315 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-30 16:09:25,437 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 16:09:30,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:09:30,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 16:09:32,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:09:37,328 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-30 16:09:38,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-30 16:09:40,412 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-30 16:09:44,335 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-30 16:09:47,685 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:09:53,882 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:09:53,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 16:09:54,072 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=765293.3333333334, ans=0.125 2023-09-30 16:09:55,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:09:55,423 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-30 16:09:55,427 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 16:10:00,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:10:01,839 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-30 16:10:01,914 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-30 16:10:03,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-30 16:10:04,738 INFO [train.py:1039] (2/4) Epoch 22, batch 3250, loss[loss=0.17, simple_loss=0.2448, pruned_loss=0.04761, over 23572.00 frames. ], tot_loss[loss=0.1717, simple_loss=0.248, pruned_loss=0.04773, over 4726159.73 frames. ], batch size: 256, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:10:04,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-30 16:10:07,117 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:10:09,072 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=765360.0, ans=0.125 2023-09-30 16:10:10,216 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-30 16:10:10,227 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-30 16:10:10,288 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:10:10,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:10:12,406 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-30 16:10:16,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:10:19,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:10:27,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:10:27,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-30 16:10:28,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:10:30,272 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:10:30,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:10:31,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:10:31,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 16:10:35,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:10:35,178 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-30 16:10:35,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:10:35,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:10:35,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:10:35,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:10:40,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:10:43,422 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:10:45,064 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:10:45,098 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:10:46,629 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:10:46,698 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:10:46,714 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:10:52,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-30 16:10:52,173 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:10:52,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:10:54,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:10:56,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:10:56,621 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=765560.0, ans=0.2 2023-09-30 16:11:02,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:11:09,895 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:11:09,934 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:11:09,935 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-30 16:11:09,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:11:09,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 16:11:11,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:11:15,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-30 16:11:15,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-30 16:11:15,137 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:11:16,759 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:11:16,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:11:17,056 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=765626.6666666666, ans=0.07 2023-09-30 16:11:18,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-30 16:11:18,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:11:22,927 WARNING [train.py:1197] (2/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:11:22,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:11:25,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-30 16:11:25,070 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:11:25,343 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=765626.6666666666, ans=0.0 2023-09-30 16:11:28,030 INFO [train.py:1039] (2/4) Epoch 22, batch 3300, loss[loss=0.1844, simple_loss=0.2509, pruned_loss=0.05894, over 23810.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.249, pruned_loss=0.04842, over 4721229.00 frames. ], batch size: 195, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:11:28,092 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 16:11:28,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-30 16:11:30,475 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:11:32,462 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-30 16:11:34,058 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-30 16:11:35,620 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-30 16:11:36,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:11:40,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:11:40,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:11:41,678 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:11:43,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 16:11:43,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 16:11:46,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:11:48,478 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:11:53,027 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-30 16:11:53,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:11:53,167 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:11:56,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:11:57,488 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-30 16:11:58,989 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:12:01,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 16:12:01,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 16:12:01,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:12:01,180 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-30 16:12:03,596 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=765826.6666666666, ans=0.05 2023-09-30 16:12:04,259 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.13 vs. limit=15.0 2023-09-30 16:12:06,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:12:06,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-30 16:12:07,894 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:12:07,898 WARNING [train.py:1197] (2/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-30 16:12:09,732 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.926e+02 2.080e+02 2.384e+02 3.230e+02, threshold=4.160e+02, percent-clipped=0.0 2023-09-30 16:12:09,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-30 16:12:09,987 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:12:11,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:12:13,152 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-30 16:12:14,688 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-30 16:12:14,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:12:16,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-30 16:12:19,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-30 16:12:23,132 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-30 16:12:23,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:12:27,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:12:27,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:12:27,776 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:12:27,966 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=765893.3333333334, ans=0.0 2023-09-30 16:12:29,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-30 16:12:32,160 WARNING [train.py:1197] (2/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:12:32,187 WARNING [train.py:1197] (2/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:12:33,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-30 16:12:36,566 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-30 16:12:36,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-30 16:12:39,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-30 16:12:39,660 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:12:39,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:12:41,085 WARNING [train.py:1197] (2/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:12:41,087 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:12:42,650 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 16:12:44,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:12:44,173 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-30 16:12:46,095 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:12:46,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 16:12:48,553 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.80 vs. limit=15.0 2023-09-30 16:12:49,363 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-30 16:12:49,428 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:12:49,728 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=766026.6666666666, ans=0.2 2023-09-30 16:12:50,912 INFO [train.py:1039] (2/4) Epoch 22, batch 3350, loss[loss=0.159, simple_loss=0.2398, pruned_loss=0.03912, over 21311.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2498, pruned_loss=0.04871, over 4718816.05 frames. ], batch size: 46, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:12:51,068 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:12:52,688 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=766026.6666666666, ans=0.125 2023-09-30 16:12:53,958 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:12:54,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-30 16:12:55,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:12:59,008 WARNING [train.py:1197] (2/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:12:59,009 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:13:00,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:13:02,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:13:03,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:13:05,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:13:05,597 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=766093.3333333334, ans=0.0 2023-09-30 16:13:08,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:13:10,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:13:10,423 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:13:12,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-30 16:13:13,982 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-30 16:13:14,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:13:17,305 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=766093.3333333334, ans=0.125 2023-09-30 16:13:18,511 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-30 16:13:18,532 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-30 16:13:18,687 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 16:13:20,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:13:22,135 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:13:22,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-30 16:13:22,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:13:22,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:13:25,148 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:13:25,333 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:13:25,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:13:26,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:13:27,093 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=766160.0, ans=0.1 2023-09-30 16:13:31,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:13:35,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:13:35,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:13:39,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:13:41,205 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:13:41,548 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=766226.6666666666, ans=0.0 2023-09-30 16:13:41,621 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=766226.6666666666, ans=0.125 2023-09-30 16:13:41,914 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.14 vs. limit=10.0 2023-09-30 16:13:42,810 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:13:42,834 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:13:44,462 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:13:46,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-30 16:13:46,683 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 16:13:46,728 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-30 16:13:48,138 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:13:48,288 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-30 16:13:48,457 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=766226.6666666666, ans=0.0 2023-09-30 16:13:49,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:13:50,015 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=766226.6666666666, ans=0.125 2023-09-30 16:13:51,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:13:58,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:13:59,901 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-30 16:13:59,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 16:14:00,102 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-30 16:14:01,686 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:14:04,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:14:06,639 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=766293.3333333334, ans=0.125 2023-09-30 16:14:08,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-30 16:14:08,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 16:14:08,628 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-30 16:14:10,166 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:14:11,647 WARNING [train.py:1197] (2/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-30 16:14:13,629 INFO [train.py:1039] (2/4) Epoch 22, batch 3400, loss[loss=0.1577, simple_loss=0.2329, pruned_loss=0.04128, over 24563.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2513, pruned_loss=0.04988, over 4709333.62 frames. ], batch size: 60, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:14:13,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:14:13,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-30 16:14:15,822 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:14:17,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:14:17,255 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-30 16:14:18,761 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:14:18,797 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-30 16:14:24,802 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-30 16:14:24,819 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-30 16:14:24,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:14:28,621 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:14:28,631 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 16:14:30,190 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:14:31,712 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-30 16:14:37,597 WARNING [train.py:1197] (2/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:14:39,100 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-30 16:14:44,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:14:46,555 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.72 vs. limit=22.5 2023-09-30 16:14:47,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:14:47,406 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:14:48,972 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-30 16:14:53,679 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=766493.3333333334, ans=0.2 2023-09-30 16:14:56,490 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:14:57,966 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.457e+02 1.853e+02 2.034e+02 2.228e+02 2.939e+02, threshold=4.068e+02, percent-clipped=0.0 2023-09-30 16:14:58,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-30 16:15:04,973 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:15:06,401 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:15:06,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-30 16:15:07,913 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:15:07,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:15:09,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:15:09,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 16:15:11,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:15:16,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 16:15:16,175 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:15:21,572 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:15:23,324 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-30 16:15:28,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 16:15:33,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-30 16:15:36,871 INFO [train.py:1039] (2/4) Epoch 22, batch 3450, loss[loss=0.1688, simple_loss=0.2605, pruned_loss=0.03856, over 24423.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2513, pruned_loss=0.04932, over 4726310.99 frames. ], batch size: 69, lr: 4.65e-03, grad_scale: 4.0 2023-09-30 16:15:36,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-30 16:15:37,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:15:40,012 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:15:40,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-30 16:15:40,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:15:43,372 WARNING [train.py:1197] (2/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-30 16:15:51,585 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:15:53,133 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:15:54,598 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:15:54,615 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:15:56,883 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:16:00,388 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=766760.0, ans=0.125 2023-09-30 16:16:03,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-30 16:16:05,495 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=766760.0, ans=0.2 2023-09-30 16:16:10,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-30 16:16:10,318 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 16:16:10,383 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:16:13,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:16:15,247 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=766826.6666666666, ans=0.0 2023-09-30 16:16:18,094 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-30 16:16:19,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 16:16:22,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:16:22,969 WARNING [train.py:1197] (2/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:16:24,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-30 16:16:26,758 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:16:28,434 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-30 16:16:28,451 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:16:30,701 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:16:33,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:16:37,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-30 16:16:41,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:16:48,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:16:50,088 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:16:53,215 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:16:56,432 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:16:57,848 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:16:57,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:16:59,264 INFO [train.py:1039] (2/4) Epoch 22, batch 3500, loss[loss=0.1576, simple_loss=0.2408, pruned_loss=0.03717, over 24488.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.25, pruned_loss=0.04916, over 4707588.82 frames. ], batch size: 66, lr: 4.65e-03, grad_scale: 8.0 2023-09-30 16:16:59,350 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:17:04,169 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.96 vs. limit=6.0 2023-09-30 16:17:04,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:17:07,715 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:17:09,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-30 16:17:11,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 16:17:13,457 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-30 16:17:15,654 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.35 vs. limit=15.0 2023-09-30 16:17:16,663 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:17:16,686 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-30 16:17:21,789 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:17:21,930 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:17:23,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:17:23,576 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:17:24,964 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-30 16:17:25,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:17:26,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:17:26,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-30 16:17:29,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:17:30,901 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-30 16:17:32,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:17:36,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:17:37,638 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-30 16:17:37,680 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:17:39,383 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:17:42,515 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:17:43,241 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.29 vs. limit=15.0 2023-09-30 16:17:44,406 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.347e+02 1.810e+02 1.991e+02 2.339e+02 3.631e+02, threshold=3.981e+02, percent-clipped=0.0 2023-09-30 16:17:44,547 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:17:46,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:17:46,113 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:17:48,352 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-30 16:17:48,495 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-30 16:17:49,968 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-30 16:17:52,029 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:17:53,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:17:53,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:17:53,789 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 16:17:54,025 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=767226.6666666666, ans=0.0 2023-09-30 16:17:58,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 16:17:58,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 16:18:03,361 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:18:04,854 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-30 16:18:04,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-30 16:18:04,865 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:18:07,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:18:10,019 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:18:11,610 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:18:13,203 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-30 16:18:13,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:18:14,862 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:18:16,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-30 16:18:18,452 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-30 16:18:20,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:18:21,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:18:21,972 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:18:22,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:18:24,015 INFO [train.py:1039] (2/4) Epoch 22, batch 3550, loss[loss=0.1747, simple_loss=0.2523, pruned_loss=0.04853, over 24457.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.2494, pruned_loss=0.04855, over 4716199.30 frames. ], batch size: 66, lr: 4.65e-03, grad_scale: 8.0 2023-09-30 16:18:25,774 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:18:34,795 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:18:34,984 WARNING [train.py:1197] (2/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 16:18:39,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:18:39,535 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:18:43,179 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:18:43,300 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:18:44,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 16:18:46,520 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=767426.6666666666, ans=0.2 2023-09-30 16:18:47,817 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:18:49,305 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:18:49,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:18:50,803 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-30 16:18:50,939 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 16:18:51,216 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=767426.6666666666, ans=0.125 2023-09-30 16:18:58,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:18:59,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:19:00,028 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-30 16:19:00,036 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:19:01,459 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:19:01,502 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-30 16:19:01,519 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:19:03,060 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:19:04,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 16:19:10,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:19:10,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:19:12,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:19:15,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-30 16:19:15,960 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:19:17,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-30 16:19:17,524 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-30 16:19:19,071 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-30 16:19:19,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:19:23,724 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-30 16:19:25,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:19:30,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:19:32,638 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-30 16:19:34,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:19:36,066 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=767626.6666666666, ans=0.0 2023-09-30 16:19:37,365 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:19:38,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-30 16:19:44,985 INFO [train.py:1039] (2/4) Epoch 22, batch 3600, loss[loss=0.1615, simple_loss=0.2348, pruned_loss=0.04407, over 19362.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2493, pruned_loss=0.04827, over 4718326.65 frames. ], batch size: 42, lr: 4.65e-03, grad_scale: 16.0 2023-09-30 16:19:46,506 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-30 16:19:46,573 WARNING [train.py:1197] (2/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:19:46,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:19:48,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:19:50,399 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:19:51,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:19:53,881 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=767693.3333333334, ans=0.125 2023-09-30 16:19:55,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:19:56,633 WARNING [train.py:1197] (2/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:19:58,116 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:19:58,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:19:58,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:19:59,763 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-30 16:20:02,080 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 16:20:04,144 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:20:06,464 WARNING [train.py:1197] (2/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:20:09,561 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:20:11,101 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 16:20:11,160 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:20:11,201 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-30 16:20:12,732 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:20:15,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:20:17,208 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-30 16:20:18,876 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:20:22,012 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:20:24,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:20:25,634 WARNING [train.py:1197] (2/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-30 16:20:29,172 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=767826.6666666666, ans=0.0 2023-09-30 16:20:30,132 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.855e+02 2.093e+02 2.539e+02 3.867e+02, threshold=4.186e+02, percent-clipped=0.0 2023-09-30 16:20:31,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:20:33,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 16:20:33,480 WARNING [train.py:1197] (2/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-30 16:20:39,260 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:20:44,775 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=767893.3333333334, ans=0.07 2023-09-30 16:20:45,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:20:47,543 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:20:51,188 WARNING [train.py:1197] (2/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-30 16:20:51,211 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:20:51,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-30 16:20:52,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-30 16:20:54,401 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-30 16:20:57,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:20:57,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:20:57,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-30 16:20:59,493 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:20:59,546 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:20:59,560 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:21:01,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-30 16:21:04,151 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-30 16:21:05,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:21:06,002 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-30 16:21:09,573 INFO [train.py:1039] (2/4) Epoch 22, batch 3650, loss[loss=0.1679, simple_loss=0.255, pruned_loss=0.04039, over 24641.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2499, pruned_loss=0.04884, over 4720334.14 frames. ], batch size: 73, lr: 4.65e-03, grad_scale: 16.0 2023-09-30 16:21:13,375 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-30 16:21:13,539 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:21:17,373 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=768026.6666666666, ans=0.0 2023-09-30 16:21:20,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-30 16:21:21,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-30 16:21:26,310 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:21:26,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-30 16:21:27,901 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 16:21:30,532 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.13 vs. limit=15.0 2023-09-30 16:21:31,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-30 16:21:31,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:21:32,699 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-30 16:21:34,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:21:34,241 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:21:34,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-30 16:21:36,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 16:21:36,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:21:36,542 WARNING [train.py:1197] (2/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:21:39,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:21:43,274 WARNING [train.py:1197] (2/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-30 16:21:43,920 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.51 vs. limit=6.0 2023-09-30 16:21:44,669 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-30 16:21:46,146 WARNING [train.py:1197] (2/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:21:47,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-30 16:21:49,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:21:49,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:21:56,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:21:58,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:21:58,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-30 16:21:59,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:22:01,307 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:22:03,044 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=768226.6666666666, ans=0.2 2023-09-30 16:22:04,360 WARNING [train.py:1197] (2/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:22:06,061 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:22:07,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:22:07,604 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:22:10,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 16:22:12,383 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:22:12,484 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:22:18,707 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-30 16:22:22,368 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:22:22,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:22:25,406 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.98 vs. limit=6.0 2023-09-30 16:22:25,734 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-30 16:22:25,812 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:22:27,944 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-30 16:22:29,601 WARNING [train.py:1197] (2/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:22:31,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-30 16:22:31,186 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:22:32,626 INFO [train.py:1039] (2/4) Epoch 22, batch 3700, loss[loss=0.1559, simple_loss=0.2448, pruned_loss=0.03347, over 24541.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.25, pruned_loss=0.04867, over 4727987.05 frames. ], batch size: 71, lr: 4.65e-03, grad_scale: 16.0 2023-09-30 16:22:32,846 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 16:22:35,820 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:22:37,301 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:22:40,410 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:22:40,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-30 16:22:40,425 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:22:40,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 16:22:41,998 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 16:22:46,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 16:22:48,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:22:48,947 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:22:50,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:22:51,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:22:51,997 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 16:22:52,631 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=7.36 vs. limit=15.0 2023-09-30 16:22:55,036 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:22:55,421 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=768426.6666666666, ans=0.125 2023-09-30 16:22:57,106 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-30 16:23:01,184 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=768426.6666666666, ans=0.0 2023-09-30 16:23:05,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:23:07,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 16:23:07,366 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 16:23:07,393 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-30 16:23:07,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:23:09,160 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=768493.3333333334, ans=0.0 2023-09-30 16:23:09,230 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=768493.3333333334, ans=0.0 2023-09-30 16:23:09,336 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=768493.3333333334, ans=0.0 2023-09-30 16:23:12,041 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:23:12,181 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-30 16:23:13,659 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:23:15,165 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:23:16,433 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.461e+02 1.838e+02 2.020e+02 2.456e+02 4.416e+02, threshold=4.040e+02, percent-clipped=2.0 2023-09-30 16:23:18,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:23:18,184 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 16:23:20,339 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=768560.0, ans=0.2 2023-09-30 16:23:20,452 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=768560.0, ans=0.125 2023-09-30 16:23:21,729 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 16:23:25,388 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.58 vs. limit=15.0 2023-09-30 16:23:26,384 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:23:26,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-30 16:23:27,858 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:23:27,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-30 16:23:29,632 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=768560.0, ans=0.0 2023-09-30 16:23:32,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:23:32,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:23:35,482 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=768560.0, ans=0.125 2023-09-30 16:23:36,617 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:23:36,681 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-30 16:23:40,516 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:23:40,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-30 16:23:40,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:23:41,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:23:45,210 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:23:46,586 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-30 16:23:46,773 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-30 16:23:48,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:23:48,294 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:23:48,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:23:49,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 16:23:53,508 WARNING [train.py:1197] (2/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:23:54,766 INFO [train.py:1039] (2/4) Epoch 22, batch 3750, loss[loss=0.2336, simple_loss=0.2946, pruned_loss=0.08628, over 19435.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2503, pruned_loss=0.04874, over 4732460.69 frames. ], batch size: 388, lr: 4.65e-03, grad_scale: 16.0 2023-09-30 16:23:54,943 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:23:56,447 WARNING [train.py:1197] (2/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:23:59,320 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-30 16:24:00,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 16:24:02,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-30 16:24:04,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-30 16:24:04,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:24:05,622 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:24:05,775 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:24:09,143 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:24:14,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:24:17,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:24:19,053 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 16:24:20,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:24:24,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:24:25,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-30 16:24:26,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:24:28,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:24:28,153 WARNING [train.py:1197] (2/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:24:31,886 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-30 16:24:35,010 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-30 16:24:36,465 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:24:37,857 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:24:40,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:24:42,717 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=768893.3333333334, ans=0.125 2023-09-30 16:24:46,644 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:24:48,168 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-30 16:24:50,101 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=768893.3333333334, ans=0.125 2023-09-30 16:24:51,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-30 16:24:53,154 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:24:56,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:24:56,371 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:24:56,660 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=768893.3333333334, ans=0.2 2023-09-30 16:25:00,921 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 16:25:06,126 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 16:25:06,826 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=5.03 vs. limit=12.0 2023-09-30 16:25:07,713 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-30 16:25:08,719 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.75 vs. limit=15.0 2023-09-30 16:25:09,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 16:25:10,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:25:11,213 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=768960.0, ans=0.1 2023-09-30 16:25:14,027 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-30 16:25:16,047 INFO [train.py:1039] (2/4) Epoch 22, batch 3800, loss[loss=0.1905, simple_loss=0.2726, pruned_loss=0.05424, over 24384.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2497, pruned_loss=0.04852, over 4726220.70 frames. ], batch size: 77, lr: 4.65e-03, grad_scale: 16.0 2023-09-30 16:25:23,501 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:25:26,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:25:28,195 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 16:25:28,522 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=769026.6666666666, ans=10.0 2023-09-30 16:25:29,641 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-30 16:25:31,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:25:31,398 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:25:31,740 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=769093.3333333334, ans=0.125 2023-09-30 16:25:32,954 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-30 16:25:36,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 16:25:36,500 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:25:36,635 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:25:38,189 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:25:38,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:25:39,624 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:25:41,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-30 16:25:44,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-30 16:25:44,263 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:25:47,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:25:51,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:25:51,780 WARNING [train.py:1197] (2/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 16:25:53,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-30 16:25:53,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:25:56,386 WARNING [train.py:1197] (2/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:25:56,540 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:26:00,869 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.761e+02 1.975e+02 2.225e+02 3.481e+02, threshold=3.951e+02, percent-clipped=0.0 2023-09-30 16:26:02,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 16:26:02,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-30 16:26:02,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:26:11,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:26:11,078 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=769226.6666666666, ans=0.125 2023-09-30 16:26:15,936 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:26:17,492 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-30 16:26:19,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-30 16:26:19,321 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:26:22,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:26:24,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:26:24,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-30 16:26:29,697 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-30 16:26:29,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-30 16:26:29,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:26:32,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:26:39,670 INFO [train.py:1039] (2/4) Epoch 22, batch 3850, loss[loss=0.1667, simple_loss=0.2431, pruned_loss=0.04517, over 24464.00 frames. ], tot_loss[loss=0.1728, simple_loss=0.2483, pruned_loss=0.04858, over 4710090.96 frames. ], batch size: 63, lr: 4.65e-03, grad_scale: 16.0 2023-09-30 16:26:39,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:26:39,881 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 16:26:44,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:26:44,648 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-30 16:26:46,217 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 16:26:46,361 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:26:49,558 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 16:26:49,835 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=769360.0, ans=0.125 2023-09-30 16:26:53,196 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:26:54,842 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-30 16:26:56,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-30 16:27:03,959 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:27:07,000 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:27:09,255 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:27:10,721 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 16:27:12,394 WARNING [train.py:1197] (2/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:27:12,504 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:27:12,591 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:27:12,612 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 16:27:14,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:27:15,817 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:27:15,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:27:17,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:27:17,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-30 16:27:17,561 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-30 16:27:19,024 WARNING [train.py:1197] (2/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:27:19,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:27:22,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:27:22,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:27:23,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-30 16:27:25,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-30 16:27:28,848 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:27:30,976 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-30 16:27:32,645 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-30 16:27:39,345 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:27:40,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:27:44,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:27:46,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-30 16:27:47,777 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-30 16:27:50,689 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:27:52,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:27:53,862 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 16:27:53,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:27:55,312 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:27:55,443 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:27:55,444 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:27:55,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-30 16:27:56,911 WARNING [train.py:1197] (2/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:27:58,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-30 16:27:58,529 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:27:58,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:28:02,247 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:28:03,508 INFO [train.py:1039] (2/4) Epoch 22, batch 3900, loss[loss=0.1609, simple_loss=0.2212, pruned_loss=0.05027, over 19439.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2476, pruned_loss=0.04829, over 4696476.16 frames. ], batch size: 388, lr: 4.65e-03, grad_scale: 16.0 2023-09-30 16:28:03,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:28:03,803 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=769693.3333333334, ans=0.1 2023-09-30 16:28:05,100 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:28:05,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:28:05,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:28:05,302 WARNING [train.py:1197] (2/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:28:07,229 WARNING [train.py:1197] (2/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-30 16:28:07,326 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:28:11,828 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:28:12,720 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=12.09 vs. limit=15.0 2023-09-30 16:28:13,865 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 16:28:13,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:28:15,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:28:17,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 16:28:19,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:28:20,729 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-30 16:28:22,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-30 16:28:22,317 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:28:23,910 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-30 16:28:23,959 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:28:25,429 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-30 16:28:27,006 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-30 16:28:30,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:28:31,702 WARNING [train.py:1197] (2/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:28:31,724 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 16:28:33,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-30 16:28:39,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:28:41,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:28:46,367 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:28:46,379 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:28:48,304 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.859e+02 2.113e+02 2.353e+02 3.355e+02, threshold=4.226e+02, percent-clipped=0.0 2023-09-30 16:28:48,409 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:28:55,172 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:28:55,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:29:02,838 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 16:29:03,013 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:29:07,812 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=769960.0, ans=0.125 2023-09-30 16:29:14,162 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:29:14,476 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=769960.0, ans=0.0 2023-09-30 16:29:15,741 WARNING [train.py:1197] (2/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-30 16:29:17,705 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-30 16:29:17,773 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-30 16:29:19,253 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-30 16:29:19,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-30 16:29:21,030 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:29:22,443 WARNING [train.py:1197] (2/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-30 16:29:25,802 INFO [train.py:1039] (2/4) Epoch 22, batch 3950, loss[loss=0.1609, simple_loss=0.2361, pruned_loss=0.04285, over 23506.00 frames. ], tot_loss[loss=0.1715, simple_loss=0.2476, pruned_loss=0.04769, over 4710375.80 frames. ], batch size: 134, lr: 4.64e-03, grad_scale: 16.0 2023-09-30 16:29:29,660 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:29:31,138 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-30 16:29:31,230 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:29:34,258 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:29:37,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:29:43,241 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-30 16:29:43,335 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 16:29:43,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-30 16:29:44,787 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-30 16:29:44,826 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:29:48,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:29:48,104 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-30 16:29:48,109 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:29:52,374 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-30 16:29:54,011 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:29:55,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 16:29:55,431 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 16:29:55,523 WARNING [train.py:1197] (2/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 16:29:56,967 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:30:07,482 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:30:08,849 WARNING [train.py:1197] (2/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:30:13,541 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-30 16:30:15,290 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=770226.6666666666, ans=0.125 2023-09-30 16:30:19,530 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-30 16:30:19,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-30 16:30:19,593 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:30:21,044 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:30:29,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:30:29,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-30 16:30:31,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:30:31,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:30:31,578 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-30 16:30:37,723 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:30:39,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:30:43,896 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-30 16:30:44,350 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=770293.3333333334, ans=0.0 2023-09-30 16:30:48,452 INFO [train.py:1039] (2/4) Epoch 22, batch 4000, loss[loss=0.1989, simple_loss=0.279, pruned_loss=0.05944, over 24324.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2484, pruned_loss=0.04789, over 4711269.94 frames. ], batch size: 77, lr: 4.64e-03, grad_scale: 16.0 2023-09-30 16:30:53,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:31:02,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:31:06,877 WARNING [train.py:1197] (2/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:31:06,978 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:31:08,437 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:31:08,470 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-30 16:31:08,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-30 16:31:10,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-30 16:31:10,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 16:31:10,703 WARNING [train.py:1197] (2/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-30 16:31:11,265 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.46 vs. limit=15.0 2023-09-30 16:31:12,990 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=770426.6666666666, ans=0.1 2023-09-30 16:31:14,252 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:31:17,445 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:31:17,467 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:31:17,472 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:31:17,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:31:17,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-30 16:31:17,808 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=770426.6666666666, ans=0.125 2023-09-30 16:31:17,912 INFO [scaling.py:1118] (2/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 16:31:19,107 WARNING [train.py:1197] (2/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-30 16:31:20,675 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-30 16:31:20,805 WARNING [train.py:1197] (2/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:31:22,278 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:31:23,910 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-30 16:31:25,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 16:31:25,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:31:32,049 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-30 16:31:33,543 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:31:34,784 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.863e+02 2.026e+02 2.291e+02 3.253e+02, threshold=4.053e+02, percent-clipped=0.0 2023-09-30 16:31:38,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:31:38,561 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=770560.0, ans=0.1 2023-09-30 16:31:39,741 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-30 16:31:41,321 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:31:42,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-30 16:31:42,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:31:44,361 WARNING [train.py:1197] (2/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:31:44,483 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:31:46,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:31:46,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-30 16:31:48,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:31:49,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-30 16:31:49,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:31:49,326 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=770560.0, ans=0.1 2023-09-30 16:31:50,780 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-30 16:31:55,497 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 16:32:00,021 WARNING [train.py:1197] (2/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 16:32:01,656 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:32:01,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:32:03,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:32:03,321 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:32:03,675 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=770626.6666666666, ans=0.0 2023-09-30 16:32:04,015 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.62 vs. limit=15.0 2023-09-30 16:32:07,514 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=770626.6666666666, ans=0.125 2023-09-30 16:32:08,823 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:32:11,921 INFO [train.py:1039] (2/4) Epoch 22, batch 4050, loss[loss=0.2277, simple_loss=0.2866, pruned_loss=0.08443, over 19404.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2488, pruned_loss=0.04817, over 4709383.70 frames. ], batch size: 388, lr: 4.64e-03, grad_scale: 16.0 2023-09-30 16:32:13,430 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-30 16:32:13,701 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=770693.3333333334, ans=0.2 2023-09-30 16:32:14,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-30 16:32:15,152 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 16:32:16,509 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:32:16,652 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:32:18,155 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-30 16:32:18,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:32:24,273 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:32:27,454 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=770760.0, ans=0.0 2023-09-30 16:32:28,571 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:32:30,084 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 16:32:31,683 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:32:32,541 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.88 vs. limit=22.5 2023-09-30 16:32:33,102 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:32:36,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:32:36,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-30 16:32:39,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 16:32:41,863 WARNING [train.py:1197] (2/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-30 16:32:43,851 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-30 16:32:46,799 WARNING [train.py:1197] (2/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:32:54,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-30 16:32:55,720 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:33:01,145 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:33:03,558 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=770893.3333333334, ans=0.125 2023-09-30 16:33:04,950 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:33:05,030 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:33:06,413 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:33:09,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:33:11,110 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-30 16:33:11,123 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 16:33:12,700 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:33:14,191 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-30 16:33:19,912 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:33:21,843 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=770960.0, ans=0.0 2023-09-30 16:33:26,115 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-30 16:33:27,587 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:33:27,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 16:33:28,645 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.85 vs. limit=15.0 2023-09-30 16:33:30,639 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-30 16:33:30,653 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-30 16:33:30,655 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:33:34,169 INFO [train.py:1039] (2/4) Epoch 22, batch 4100, loss[loss=0.2367, simple_loss=0.2965, pruned_loss=0.08849, over 19517.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.2507, pruned_loss=0.04905, over 4697401.25 frames. ], batch size: 388, lr: 4.64e-03, grad_scale: 8.0 2023-09-30 16:33:34,297 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:33:36,364 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:33:36,391 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:33:43,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-30 16:33:45,498 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-30 16:33:47,112 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-30 16:33:48,748 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-30 16:33:48,768 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:33:48,848 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:33:50,210 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:33:50,258 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 16:33:51,809 WARNING [train.py:1197] (2/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-30 16:33:55,599 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:33:55,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 16:33:55,756 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:33:57,190 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 16:34:00,450 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 16:34:01,970 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:34:02,045 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:34:03,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-30 16:34:03,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:34:03,517 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:34:03,537 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:34:03,567 WARNING [train.py:1197] (2/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:34:04,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-30 16:34:08,676 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:34:08,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-30 16:34:10,337 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:34:11,929 WARNING [train.py:1197] (2/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:34:11,931 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-30 16:34:13,402 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:34:14,867 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:34:16,223 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-30 16:34:17,769 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-30 16:34:18,013 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=771160.0, ans=0.025 2023-09-30 16:34:18,034 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=771160.0, ans=0.04949747468305833 2023-09-30 16:34:19,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-30 16:34:20,903 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 16:34:21,212 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=771226.6666666666, ans=0.2 2023-09-30 16:34:21,729 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.60 vs. limit=6.0 2023-09-30 16:34:22,266 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.433e+02 1.834e+02 2.084e+02 2.360e+02 3.426e+02, threshold=4.169e+02, percent-clipped=0.0 2023-09-30 16:34:23,179 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-30 16:34:24,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:34:24,613 WARNING [train.py:1197] (2/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:34:27,636 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:34:35,152 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:34:35,605 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=771226.6666666666, ans=0.125 2023-09-30 16:34:38,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:34:38,555 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=771293.3333333334, ans=0.2 2023-09-30 16:34:39,866 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:34:46,054 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=771293.3333333334, ans=0.1 2023-09-30 16:34:48,818 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:34:48,841 WARNING [train.py:1197] (2/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:34:52,223 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=771293.3333333334, ans=0.125 2023-09-30 16:34:53,306 WARNING [train.py:1197] (2/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:34:53,549 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:34:54,882 INFO [train.py:1039] (2/4) Epoch 22, batch 4150, loss[loss=0.1704, simple_loss=0.2413, pruned_loss=0.04969, over 18309.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2505, pruned_loss=0.04869, over 4702349.64 frames. ], batch size: 40, lr: 4.64e-03, grad_scale: 8.0 2023-09-30 16:34:56,680 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:34:58,612 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 16:34:58,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:34:58,735 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:35:02,356 WARNING [train.py:1197] (2/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-30 16:35:02,411 WARNING [train.py:1197] (2/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:35:03,821 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-30 16:35:03,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-30 16:35:03,961 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-30 16:35:05,920 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=771360.0, ans=15.0 2023-09-30 16:35:06,880 WARNING [train.py:1197] (2/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:35:10,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:35:10,259 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:35:15,381 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:35:17,538 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:35:18,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-30 16:35:19,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 16:35:19,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:35:20,783 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-30 16:35:25,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:35:25,682 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=771426.6666666666, ans=0.125 2023-09-30 16:35:27,287 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=771493.3333333334, ans=0.125 2023-09-30 16:35:28,660 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:35:30,051 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-30 16:35:32,311 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-30 16:35:32,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:35:35,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-30 16:35:35,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:35:35,378 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:35:38,408 WARNING [train.py:1197] (2/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:35:39,942 WARNING [train.py:1197] (2/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:35:43,164 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-30 16:35:45,055 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=771560.0, ans=0.125 2023-09-30 16:35:47,052 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-30 16:35:48,658 WARNING [train.py:1197] (2/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:35:48,775 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-30 16:35:50,736 WARNING [train.py:1197] (2/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:35:52,276 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-30 16:35:53,006 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.61 vs. limit=15.0 2023-09-30 16:35:53,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 16:35:55,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:35:56,806 WARNING [train.py:1197] (2/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:35:58,392 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-30 16:35:58,392 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:35:58,395 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-30 16:36:00,042 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 16:36:02,249 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.48 vs. limit=15.0 2023-09-30 16:36:03,025 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-30 16:36:04,348 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:36:04,354 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 16:36:04,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 16:36:05,133 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-30 16:36:05,169 WARNING [train.py:1197] (2/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:36:05,221 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 16:36:06,577 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:36:08,198 WARNING [train.py:1197] (2/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:36:08,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-30 16:36:09,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-30 16:36:10,525 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.03 vs. limit=15.0 2023-09-30 16:36:15,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-30 16:36:15,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-30 16:36:16,245 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=771693.3333333334, ans=0.0 2023-09-30 16:36:17,299 INFO [train.py:1039] (2/4) Epoch 22, batch 4200, loss[loss=0.1578, simple_loss=0.2353, pruned_loss=0.0401, over 24452.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2494, pruned_loss=0.04873, over 4698549.53 frames. ], batch size: 58, lr: 4.64e-03, grad_scale: 8.0 2023-09-30 16:36:19,483 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 16:36:19,995 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=771693.3333333334, ans=0.125 2023-09-30 16:36:23,102 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:36:23,254 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 16:36:24,760 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:36:24,763 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:36:27,779 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-30 16:36:30,919 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-30 16:36:30,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:36:33,962 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:36:36,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:36:40,485 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-30 16:36:40,699 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:36:40,738 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:36:42,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-30 16:36:42,839 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:36:44,419 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:36:45,957 WARNING [train.py:1197] (2/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:36:45,986 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 16:36:47,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 16:36:50,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-30 16:36:51,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:36:57,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-30 16:36:57,350 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:37:00,351 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:37:00,530 WARNING [train.py:1197] (2/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:37:03,618 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:37:03,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-30 16:37:03,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:37:05,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:37:06,545 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.471e+02 1.822e+02 2.029e+02 2.265e+02 3.199e+02, threshold=4.057e+02, percent-clipped=0.0 2023-09-30 16:37:07,469 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.74 vs. limit=15.0 2023-09-30 16:37:11,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-30 16:37:12,409 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=771893.3333333334, ans=0.125 2023-09-30 16:37:13,555 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:37:20,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:37:23,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-30 16:37:23,654 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=771960.0, ans=0.125 2023-09-30 16:37:24,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:37:30,240 WARNING [train.py:1197] (2/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 16:37:30,348 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:37:34,514 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-30 16:37:35,058 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=771960.0, ans=0.0 2023-09-30 16:37:37,795 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-30 16:37:37,939 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=771960.0, ans=0.125 2023-09-30 16:37:40,678 INFO [train.py:1039] (2/4) Epoch 22, batch 4250, loss[loss=0.1466, simple_loss=0.1978, pruned_loss=0.04765, over 19148.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2471, pruned_loss=0.04856, over 4688253.67 frames. ], batch size: 388, lr: 4.64e-03, grad_scale: 8.0 2023-09-30 16:37:42,872 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:37:43,102 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=772026.6666666666, ans=0.125 2023-09-30 16:37:44,327 WARNING [train.py:1197] (2/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-30 16:37:47,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:37:50,581 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-30 16:37:50,646 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-30 16:37:50,780 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=772026.6666666666, ans=0.125 2023-09-30 16:37:52,832 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:37:53,558 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.39 vs. limit=6.0 2023-09-30 16:37:55,827 WARNING [train.py:1197] (2/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:37:58,923 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:38:02,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:38:03,891 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:38:06,268 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:38:06,272 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:38:07,846 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:38:09,357 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:38:10,812 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:38:12,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:38:14,038 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:38:15,005 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=772160.0, ans=0.0 2023-09-30 16:38:16,017 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-30 16:38:19,506 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_ff3.min_abs, batch_count=772160.0, ans=0.2 2023-09-30 16:38:20,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-30 16:38:20,491 WARNING [train.py:1197] (2/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:38:20,599 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:38:20,637 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:38:20,825 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=772160.0, ans=0.1 2023-09-30 16:38:22,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:38:22,164 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:38:22,238 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:38:26,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-30 16:38:28,898 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:38:30,712 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=772226.6666666666, ans=0.125 2023-09-30 16:38:32,219 WARNING [train.py:1197] (2/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:38:33,675 WARNING [train.py:1197] (2/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:38:35,139 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-30 16:38:35,152 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 16:38:37,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-30 16:38:38,851 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:38:40,996 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-30 16:38:41,681 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.08 vs. limit=15.0 2023-09-30 16:38:44,218 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:38:44,262 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:38:45,945 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-30 16:38:47,506 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 16:38:47,596 WARNING [train.py:1197] (2/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-30 16:38:52,922 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:38:55,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:38:55,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:38:57,456 WARNING [train.py:1197] (2/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:38:59,097 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:39:01,358 WARNING [train.py:1197] (2/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:39:01,452 WARNING [train.py:1197] (2/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:39:01,463 WARNING [train.py:1197] (2/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-30 16:39:02,993 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:39:04,422 INFO [train.py:1039] (2/4) Epoch 22, batch 4300, loss[loss=0.1671, simple_loss=0.2599, pruned_loss=0.03713, over 24268.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2478, pruned_loss=0.04823, over 4702944.89 frames. ], batch size: 74, lr: 4.64e-03, grad_scale: 8.0 2023-09-30 16:39:06,376 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=772360.0, ans=0.0 2023-09-30 16:39:09,091 WARNING [train.py:1197] (2/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:39:09,317 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=772360.0, ans=0.0 2023-09-30 16:39:10,460 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:39:14,304 WARNING [train.py:1197] (2/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:39:22,156 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=772426.6666666666, ans=0.2 2023-09-30 16:39:23,420 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:39:23,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-30 16:39:24,991 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:39:26,560 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-30 16:39:26,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 16:39:26,609 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-30 16:39:31,654 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 16:39:33,256 WARNING [train.py:1197] (2/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 16:39:38,472 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-30 16:39:38,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:39:38,530 WARNING [train.py:1197] (2/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-30 16:39:41,668 WARNING [train.py:1197] (2/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 16:39:41,862 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-30 16:39:43,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:39:43,534 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:39:45,023 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 16:39:48,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:39:48,722 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:39:48,883 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=772493.3333333334, ans=0.125 2023-09-30 16:39:50,062 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-30 16:39:50,196 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-30 16:39:51,850 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:39:53,167 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.888e+02 2.133e+02 2.440e+02 3.863e+02, threshold=4.266e+02, percent-clipped=0.0 2023-09-30 16:39:54,979 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:39:54,992 WARNING [train.py:1197] (2/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 16:39:55,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:39:55,079 WARNING [train.py:1197] (2/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:39:55,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-30 16:39:55,099 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-30 16:39:56,675 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-30 16:39:58,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:39:58,207 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-30 16:39:58,253 WARNING [train.py:1197] (2/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-30 16:40:02,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:40:05,151 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-30 16:40:05,234 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:40:08,793 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:40:08,802 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:40:11,830 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-30 16:40:11,937 WARNING [train.py:1197] (2/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 16:40:11,943 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:40:13,329 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:40:13,391 WARNING [train.py:1197] (2/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:40:13,475 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:40:16,518 WARNING [train.py:1197] (2/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:40:18,204 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:40:18,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:40:20,140 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:40:25,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-30 16:40:26,609 INFO [train.py:1039] (2/4) Epoch 22, batch 4350, loss[loss=0.1875, simple_loss=0.2562, pruned_loss=0.05942, over 23733.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2484, pruned_loss=0.04812, over 4709911.73 frames. ], batch size: 149, lr: 4.64e-03, grad_scale: 8.0 2023-09-30 16:40:26,719 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-30 16:40:31,256 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:40:34,328 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:40:36,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:40:36,022 WARNING [train.py:1197] (2/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:40:36,940 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=772693.3333333334, ans=0.125 2023-09-30 16:40:41,845 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 16:40:44,885 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:40:46,564 WARNING [train.py:1197] (2/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:40:46,588 WARNING [train.py:1197] (2/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:40:49,678 WARNING [train.py:1197] (2/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-30 16:40:53,192 WARNING [train.py:1197] (2/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:40:55,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-30 16:40:56,298 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.53 vs. limit=15.0 2023-09-30 16:41:00,285 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-30 16:41:01,762 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:41:01,888 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:41:07,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:41:11,424 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-30 16:41:13,682 WARNING [train.py:1197] (2/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:41:15,193 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 16:41:19,795 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-30 16:41:21,332 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:41:21,418 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:41:22,928 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-30 16:41:24,411 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-30 16:41:24,434 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:41:24,479 WARNING [train.py:1197] (2/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:41:24,594 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:41:25,955 WARNING [train.py:1197] (2/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:41:27,896 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:41:27,977 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:41:31,109 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-30 16:41:31,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:41:31,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:41:32,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:41:32,652 WARNING [train.py:1197] (2/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-30 16:41:35,530 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-30 16:41:35,537 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-30 16:41:35,555 WARNING [train.py:1197] (2/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-30 16:41:38,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:41:38,785 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 16:41:38,815 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:41:40,271 WARNING [train.py:1197] (2/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:41:41,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-30 16:41:45,525 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-30 16:41:45,537 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:41:49,149 INFO [train.py:1039] (2/4) Epoch 22, batch 4400, loss[loss=0.185, simple_loss=0.2652, pruned_loss=0.05237, over 24364.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2497, pruned_loss=0.04868, over 4716546.07 frames. ], batch size: 77, lr: 4.64e-03, grad_scale: 16.0 2023-09-30 16:41:49,347 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:41:49,363 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:41:53,719 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:41:55,313 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-30 16:41:55,358 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-30 16:41:56,827 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-30 16:41:56,883 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-30 16:41:58,357 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 16:41:58,377 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:42:01,308 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-30 16:42:05,641 WARNING [train.py:1197] (2/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:42:07,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:42:07,137 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-30 16:42:08,908 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:42:08,909 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-30 16:42:08,995 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-30 16:42:09,723 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.25 vs. limit=15.0 2023-09-30 16:42:12,277 WARNING [train.py:1197] (2/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-30 16:42:12,584 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=773093.3333333334, ans=0.125 2023-09-30 16:42:13,728 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-30 16:42:13,770 WARNING [train.py:1197] (2/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-30 16:42:13,828 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:42:15,317 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:42:15,390 WARNING [train.py:1197] (2/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:42:18,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:42:18,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-30 16:42:18,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-30 16:42:18,758 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=773093.3333333334, ans=0.0 2023-09-30 16:42:19,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:42:22,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:42:22,167 WARNING [train.py:1197] (2/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:42:23,714 WARNING [train.py:1197] (2/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:42:25,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:42:25,245 WARNING [train.py:1197] (2/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-30 16:42:25,394 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-30 16:42:28,920 WARNING [train.py:1197] (2/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:42:30,623 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=773160.0, ans=0.125 2023-09-30 16:42:32,078 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=773160.0, ans=0.035 2023-09-30 16:42:35,058 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=773160.0, ans=0.2 2023-09-30 16:42:36,371 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:42:38,239 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.442e+02 1.760e+02 1.970e+02 2.320e+02 3.797e+02, threshold=3.941e+02, percent-clipped=0.0 2023-09-30 16:42:39,906 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-30 16:42:40,570 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.95 vs. limit=22.5 2023-09-30 16:42:44,448 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 16:42:46,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:42:46,350 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=773226.6666666666, ans=0.125 2023-09-30 16:42:47,749 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 16:42:49,225 WARNING [train.py:1197] (2/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-30 16:42:49,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:42:49,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-30 16:42:49,292 WARNING [train.py:1197] (2/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:42:50,868 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-30 16:42:54,141 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=773293.3333333334, ans=0.0 2023-09-30 16:42:56,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-30 16:42:58,870 WARNING [train.py:1197] (2/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-30 16:43:00,988 WARNING [train.py:1197] (2/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-30 16:43:01,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:43:01,037 WARNING [train.py:1197] (2/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-30 16:43:02,600 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:43:08,915 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:43:11,820 WARNING [train.py:1197] (2/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-30 16:43:13,082 INFO [train.py:1039] (2/4) Epoch 22, batch 4450, loss[loss=0.1868, simple_loss=0.2567, pruned_loss=0.05844, over 23412.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2501, pruned_loss=0.04873, over 4727967.93 frames. ], batch size: 119, lr: 4.63e-03, grad_scale: 16.0 2023-09-30 16:43:16,932 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:43:18,677 WARNING [train.py:1197] (2/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:43:20,196 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 16:43:25,000 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:43:25,040 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:43:28,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:43:29,894 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=773426.6666666666, ans=0.07 2023-09-30 16:43:31,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:43:33,194 WARNING [train.py:1197] (2/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:43:33,235 WARNING [train.py:1197] (2/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:43:33,961 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.47 vs. limit=10.0 2023-09-30 16:43:36,716 WARNING [train.py:1197] (2/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-30 16:43:36,719 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:43:36,843 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:43:36,897 WARNING [train.py:1197] (2/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:43:36,899 WARNING [train.py:1197] (2/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-30 16:43:39,939 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 16:43:47,387 WARNING [train.py:1197] (2/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:43:47,470 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:43:48,942 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:43:51,080 WARNING [train.py:1197] (2/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:43:51,227 WARNING [train.py:1197] (2/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:43:55,292 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=773493.3333333334, ans=0.0 2023-09-30 16:43:56,471 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 16:43:58,020 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-30 16:43:58,049 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-30 16:43:58,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:44:00,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:44:02,510 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-30 16:44:05,783 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=773560.0, ans=0.125 2023-09-30 16:44:06,975 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-30 16:44:10,572 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:44:10,664 WARNING [train.py:1197] (2/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-30 16:44:10,704 WARNING [train.py:1197] (2/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:44:10,710 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:44:12,135 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:44:12,146 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:44:13,706 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:44:16,791 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-30 16:44:16,853 WARNING [train.py:1197] (2/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-30 16:44:18,406 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 16:44:19,980 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:44:21,499 WARNING [train.py:1197] (2/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:44:21,789 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=773626.6666666666, ans=0.125 2023-09-30 16:44:23,075 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:44:23,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 16:44:26,539 WARNING [train.py:1197] (2/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-30 16:44:30,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-30 16:44:31,907 WARNING [train.py:1197] (2/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 16:44:34,938 INFO [train.py:1039] (2/4) Epoch 22, batch 4500, loss[loss=0.1409, simple_loss=0.2151, pruned_loss=0.03335, over 19199.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2498, pruned_loss=0.04846, over 4736326.64 frames. ], batch size: 42, lr: 4.63e-03, grad_scale: 16.0 2023-09-30 16:44:36,646 WARNING [train.py:1197] (2/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:44:36,807 WARNING [train.py:1197] (2/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-30 16:44:36,809 WARNING [train.py:1197] (2/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-30 16:44:39,892 WARNING [train.py:1197] (2/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:44:45,258 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:44:46,665 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:44:46,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 16:44:48,233 WARNING [train.py:1197] (2/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:44:49,670 WARNING [train.py:1197] (2/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:44:49,752 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:45:01,283 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=773760.0, ans=0.0 2023-09-30 16:45:04,455 WARNING [train.py:1197] (2/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:45:06,035 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:45:09,148 WARNING [train.py:1197] (2/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:45:09,231 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:45:10,766 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 16:45:15,576 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 16:45:22,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-30 16:45:24,238 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.544e+02 1.959e+02 2.152e+02 2.408e+02 4.470e+02, threshold=4.304e+02, percent-clipped=1.0 2023-09-30 16:45:26,013 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 16:45:29,062 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:45:29,113 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-30 16:45:30,569 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:45:30,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:45:31,153 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=773893.3333333334, ans=0.0 2023-09-30 16:45:32,322 WARNING [train.py:1197] (2/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:45:32,357 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:45:34,786 WARNING [train.py:1197] (2/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:45:34,823 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-30 16:45:34,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 16:45:34,834 WARNING [train.py:1197] (2/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:45:36,660 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=773893.3333333334, ans=0.125 2023-09-30 16:45:39,951 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:45:39,993 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 16:45:43,206 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:45:43,426 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=773960.0, ans=0.2 2023-09-30 16:45:47,535 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-30 16:45:47,563 WARNING [train.py:1197] (2/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:45:49,142 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-30 16:45:52,099 WARNING [train.py:1197] (2/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-30 16:45:52,108 WARNING [train.py:1197] (2/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-30 16:45:55,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-30 16:45:58,083 INFO [train.py:1039] (2/4) Epoch 22, batch 4550, loss[loss=0.1766, simple_loss=0.2628, pruned_loss=0.04522, over 24695.00 frames. ], tot_loss[loss=0.1727, simple_loss=0.2493, pruned_loss=0.04804, over 4735338.68 frames. ], batch size: 73, lr: 4.63e-03, grad_scale: 16.0 2023-09-30 16:45:58,237 WARNING [train.py:1197] (2/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-30 16:45:59,694 WARNING [train.py:1197] (2/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:46:02,778 WARNING [train.py:1197] (2/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:46:04,158 WARNING [train.py:1197] (2/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:46:07,788 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:46:08,152 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=774026.6666666666, ans=0.125 2023-09-30 16:46:14,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:46:16,149 WARNING [train.py:1197] (2/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:46:19,116 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 16:46:19,120 WARNING [train.py:1197] (2/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:46:19,121 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:46:20,765 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:46:20,827 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:46:24,093 WARNING [train.py:1197] (2/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:46:28,233 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-30 16:46:28,344 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-30 16:46:28,453 WARNING [train.py:1197] (2/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:46:29,921 WARNING [train.py:1197] (2/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-30 16:46:34,338 WARNING [train.py:1197] (2/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-30 16:46:34,446 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:46:34,775 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=774160.0, ans=0.2 2023-09-30 16:46:37,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-30 16:46:39,197 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 16:46:41,570 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:46:41,614 WARNING [train.py:1197] (2/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:46:41,634 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:46:44,776 WARNING [train.py:1197] (2/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-30 16:46:48,312 WARNING [train.py:1197] (2/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:46:51,228 WARNING [train.py:1197] (2/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:46:51,251 WARNING [train.py:1197] (2/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:46:51,612 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=774226.6666666666, ans=0.1 2023-09-30 16:46:52,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 16:46:54,727 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-30 16:46:54,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-30 16:46:54,871 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:46:56,426 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-30 16:46:56,694 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-30 16:46:58,873 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 16:47:00,415 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:47:00,442 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:47:01,836 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:47:01,866 WARNING [train.py:1197] (2/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:47:03,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 16:47:04,787 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-30 16:47:04,982 WARNING [train.py:1197] (2/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:47:04,997 WARNING [train.py:1197] (2/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 16:47:06,558 WARNING [train.py:1197] (2/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-30 16:47:06,569 WARNING [train.py:1197] (2/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:47:06,595 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-30 16:47:11,170 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:47:11,189 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:47:13,319 WARNING [train.py:1197] (2/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:47:14,840 WARNING [train.py:1197] (2/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:47:14,891 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-30 16:47:16,463 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:47:18,054 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-30 16:47:21,076 INFO [train.py:1039] (2/4) Epoch 22, batch 4600, loss[loss=0.153, simple_loss=0.1978, pruned_loss=0.05413, over 19019.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2477, pruned_loss=0.04828, over 4705229.21 frames. ], batch size: 388, lr: 4.63e-03, grad_scale: 16.0 2023-09-30 16:47:21,131 WARNING [train.py:1197] (2/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:47:21,269 WARNING [train.py:1197] (2/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:47:24,963 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-30 16:47:24,983 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 16:47:25,753 WARNING [train.py:1197] (2/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:47:27,225 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-30 16:47:30,067 WARNING [train.py:1197] (2/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:47:35,279 WARNING [train.py:1197] (2/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:47:36,861 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:47:41,290 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:47:44,975 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=774426.6666666666, ans=0.125 2023-09-30 16:47:48,441 WARNING [train.py:1197] (2/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-30 16:47:49,933 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:47:54,395 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:47:57,407 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:47:57,421 WARNING [train.py:1197] (2/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:48:01,056 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=774493.3333333334, ans=0.125 2023-09-30 16:48:02,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-30 16:48:02,352 WARNING [train.py:1197] (2/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 16:48:04,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:48:11,424 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.811e+02 1.994e+02 2.205e+02 2.930e+02, threshold=3.988e+02, percent-clipped=0.0 2023-09-30 16:48:11,547 WARNING [train.py:1197] (2/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:48:11,630 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-30 16:48:13,106 WARNING [train.py:1197] (2/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:48:13,516 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=774560.0, ans=0.1 2023-09-30 16:48:16,386 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-30 16:48:17,205 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.15 vs. limit=15.0 2023-09-30 16:48:19,268 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-30 16:48:24,568 WARNING [train.py:1197] (2/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:48:26,128 WARNING [train.py:1197] (2/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:48:29,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:48:29,226 WARNING [train.py:1197] (2/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 16:48:29,283 WARNING [train.py:1197] (2/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:48:29,385 WARNING [train.py:1197] (2/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-30 16:48:30,730 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:48:30,803 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:48:30,984 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:48:32,481 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:48:34,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:48:34,109 WARNING [train.py:1197] (2/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-30 16:48:34,175 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-30 16:48:35,579 WARNING [train.py:1197] (2/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-30 16:48:35,589 WARNING [train.py:1197] (2/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:48:37,493 WARNING [train.py:1197] (2/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:48:37,583 WARNING [train.py:1197] (2/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:48:39,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:48:39,823 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=774626.6666666666, ans=0.2 2023-09-30 16:48:44,934 INFO [train.py:1039] (2/4) Epoch 22, batch 4650, loss[loss=0.1517, simple_loss=0.2349, pruned_loss=0.03421, over 24674.00 frames. ], tot_loss[loss=0.1713, simple_loss=0.247, pruned_loss=0.04781, over 4700830.41 frames. ], batch size: 65, lr: 4.63e-03, grad_scale: 16.0 2023-09-30 16:48:49,810 WARNING [train.py:1197] (2/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:48:51,433 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:48:51,628 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=774693.3333333334, ans=0.0 2023-09-30 16:48:52,918 WARNING [train.py:1197] (2/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:48:52,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:48:53,047 WARNING [train.py:1197] (2/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:48:53,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:48:56,401 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:48:59,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-30 16:49:04,005 WARNING [train.py:1197] (2/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:49:06,889 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-30 16:49:06,926 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:49:08,469 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-30 16:49:08,506 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:49:08,606 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-30 16:49:08,643 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-30 16:49:08,656 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:49:08,820 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=774760.0, ans=0.0 2023-09-30 16:49:10,006 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:49:13,532 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 16:49:14,995 WARNING [train.py:1197] (2/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:49:15,038 WARNING [train.py:1197] (2/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-30 16:49:18,754 WARNING [train.py:1197] (2/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:49:22,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-30 16:49:23,784 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:49:23,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:49:25,340 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-30 16:49:26,882 WARNING [train.py:1197] (2/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:49:29,885 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 16:49:33,506 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:49:38,208 WARNING [train.py:1197] (2/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:49:41,398 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:49:42,711 WARNING [train.py:1197] (2/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:49:42,782 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 16:49:45,952 WARNING [train.py:1197] (2/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-30 16:49:46,014 WARNING [train.py:1197] (2/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-30 16:49:46,798 WARNING [train.py:1197] (2/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 16:49:46,801 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-30 16:49:48,316 WARNING [train.py:1197] (2/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:49:56,956 WARNING [train.py:1197] (2/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:49:56,965 WARNING [train.py:1197] (2/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:49:58,379 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-30 16:49:58,416 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:49:58,880 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=774960.0, ans=0.1 2023-09-30 16:49:59,980 WARNING [train.py:1197] (2/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:49:59,999 WARNING [train.py:1197] (2/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 16:50:01,650 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:50:04,792 WARNING [train.py:1197] (2/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:50:04,814 WARNING [train.py:1197] (2/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:50:06,782 INFO [train.py:1039] (2/4) Epoch 22, batch 4700, loss[loss=0.1644, simple_loss=0.2511, pruned_loss=0.03884, over 24602.00 frames. ], tot_loss[loss=0.172, simple_loss=0.248, pruned_loss=0.04796, over 4713243.19 frames. ], batch size: 68, lr: 4.63e-03, grad_scale: 16.0 2023-09-30 16:50:06,893 WARNING [train.py:1197] (2/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:50:10,220 WARNING [train.py:1197] (2/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:50:10,289 WARNING [train.py:1197] (2/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:50:10,303 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 16:50:11,875 WARNING [train.py:1197] (2/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-30 16:50:12,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:50:13,553 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-30 16:50:21,990 WARNING [train.py:1197] (2/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:50:23,459 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:50:23,528 WARNING [train.py:1197] (2/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:50:24,219 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.54 vs. limit=15.0 2023-09-30 16:50:25,034 WARNING [train.py:1197] (2/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:50:25,292 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=775093.3333333334, ans=0.0 2023-09-30 16:50:25,378 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=775093.3333333334, ans=0.125 2023-09-30 16:50:26,561 WARNING [train.py:1197] (2/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 16:50:32,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-30 16:50:32,066 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-30 16:50:35,146 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:50:36,763 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:50:36,813 WARNING [train.py:1197] (2/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:50:42,016 WARNING [train.py:1197] (2/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:50:42,291 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=775160.0, ans=0.1 2023-09-30 16:50:48,246 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:50:49,012 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=15.48 vs. limit=15.0 2023-09-30 16:50:49,824 WARNING [train.py:1197] (2/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 16:50:51,364 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:50:55,712 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.872e+02 2.138e+02 2.585e+02 4.153e+02, threshold=4.275e+02, percent-clipped=1.0 2023-09-30 16:50:58,082 WARNING [train.py:1197] (2/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-30 16:50:58,246 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-30 16:51:01,214 WARNING [train.py:1197] (2/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:51:05,704 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=775226.6666666666, ans=0.0 2023-09-30 16:51:06,796 WARNING [train.py:1197] (2/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-30 16:51:08,447 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:51:10,221 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=775226.6666666666, ans=0.125 2023-09-30 16:51:11,617 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:51:13,081 WARNING [train.py:1197] (2/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-30 16:51:14,705 WARNING [train.py:1197] (2/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:51:14,730 WARNING [train.py:1197] (2/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:51:14,984 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=775293.3333333334, ans=0.0 2023-09-30 16:51:18,550 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=775293.3333333334, ans=0.0 2023-09-30 16:51:19,902 WARNING [train.py:1197] (2/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:51:19,973 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 16:51:20,012 WARNING [train.py:1197] (2/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-30 16:51:21,634 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-30 16:51:23,231 WARNING [train.py:1197] (2/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:51:26,263 WARNING [train.py:1197] (2/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:51:26,264 WARNING [train.py:1197] (2/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:51:26,270 WARNING [train.py:1197] (2/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-30 16:51:26,415 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:51:27,075 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.03 vs. limit=15.0 2023-09-30 16:51:29,377 INFO [train.py:1039] (2/4) Epoch 22, batch 4750, loss[loss=0.1758, simple_loss=0.2588, pruned_loss=0.04642, over 24651.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2485, pruned_loss=0.04781, over 4722718.19 frames. ], batch size: 65, lr: 4.63e-03, grad_scale: 16.0 2023-09-30 16:51:31,114 WARNING [train.py:1197] (2/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-30 16:51:31,921 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.80 vs. limit=15.0 2023-09-30 16:51:34,879 WARNING [train.py:1197] (2/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:51:36,489 WARNING [train.py:1197] (2/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:51:40,494 WARNING [train.py:1197] (2/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:51:40,531 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:51:43,020 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-30 16:51:43,076 WARNING [train.py:1197] (2/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:51:47,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-30 16:51:47,696 WARNING [train.py:1197] (2/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:51:47,732 WARNING [train.py:1197] (2/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:51:49,176 WARNING [train.py:1197] (2/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:51:56,084 WARNING [train.py:1197] (2/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-30 16:51:59,384 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-30 16:52:01,064 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=775426.6666666666, ans=0.125 2023-09-30 16:52:02,369 WARNING [train.py:1197] (2/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-30 16:52:02,484 WARNING [train.py:1197] (2/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:52:03,007 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.01 vs. limit=22.5 2023-09-30 16:52:06,261 WARNING [train.py:1197] (2/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:52:06,265 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:52:06,296 WARNING [train.py:1197] (2/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:52:09,020 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-30 16:52:09,025 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-30 16:52:12,830 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-30 16:52:15,244 WARNING [train.py:1197] (2/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:52:15,502 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=775493.3333333334, ans=0.0 2023-09-30 16:52:16,999 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=775493.3333333334, ans=0.2 2023-09-30 16:52:18,180 WARNING [train.py:1197] (2/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:52:19,823 WARNING [train.py:1197] (2/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 16:52:19,824 WARNING [train.py:1197] (2/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-30 16:52:19,831 WARNING [train.py:1197] (2/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:52:20,155 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=775560.0, ans=0.2 2023-09-30 16:52:23,019 WARNING [train.py:1197] (2/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:52:28,200 WARNING [train.py:1197] (2/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:52:31,236 WARNING [train.py:1197] (2/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-30 16:52:31,295 WARNING [train.py:1197] (2/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-30 16:52:31,603 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=775560.0, ans=0.125 2023-09-30 16:52:32,783 WARNING [train.py:1197] (2/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:52:32,835 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:52:34,396 WARNING [train.py:1197] (2/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:52:34,521 WARNING [train.py:1197] (2/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 16:52:34,551 WARNING [train.py:1197] (2/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-30 16:52:37,565 WARNING [train.py:1197] (2/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-30 16:52:37,932 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=775626.6666666666, ans=0.0 2023-09-30 16:52:40,772 WARNING [train.py:1197] (2/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:52:42,473 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:52:42,476 WARNING [train.py:1197] (2/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-30 16:52:42,538 WARNING [train.py:1197] (2/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:52:44,610 WARNING [train.py:1197] (2/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:52:46,222 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-30 16:52:46,323 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:52:46,655 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=775626.6666666666, ans=0.125 2023-09-30 16:52:47,739 WARNING [train.py:1197] (2/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:52:51,511 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:52:52,855 WARNING [train.py:1197] (2/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-30 16:52:52,985 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-30 16:52:53,171 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=775693.3333333334, ans=0.125 2023-09-30 16:52:54,292 INFO [train.py:1039] (2/4) Epoch 22, batch 4800, loss[loss=0.1775, simple_loss=0.2554, pruned_loss=0.04983, over 23243.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2499, pruned_loss=0.04848, over 4721935.96 frames. ], batch size: 93, lr: 4.63e-03, grad_scale: 32.0 2023-09-30 16:52:54,528 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-30 16:52:57,671 WARNING [train.py:1197] (2/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-30 16:52:59,129 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:53:00,672 WARNING [train.py:1197] (2/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-30 16:53:05,967 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:53:06,039 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:53:10,235 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.54 vs. limit=12.0 2023-09-30 16:53:10,894 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:53:12,438 WARNING [train.py:1197] (2/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:53:12,481 WARNING [train.py:1197] (2/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:53:12,574 WARNING [train.py:1197] (2/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-30 16:53:14,069 WARNING [train.py:1197] (2/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:53:14,871 WARNING [train.py:1197] (2/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:53:16,503 WARNING [train.py:1197] (2/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:53:22,808 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:53:25,097 WARNING [train.py:1197] (2/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:53:25,162 WARNING [train.py:1197] (2/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:53:26,765 WARNING [train.py:1197] (2/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:53:26,792 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 16:53:26,816 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:53:28,388 WARNING [train.py:1197] (2/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:53:30,141 WARNING [train.py:1197] (2/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:53:33,057 WARNING [train.py:1197] (2/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:53:34,884 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=775826.6666666666, ans=0.0 2023-09-30 16:53:36,847 WARNING [train.py:1197] (2/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:53:36,878 WARNING [train.py:1197] (2/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-30 16:53:38,417 WARNING [train.py:1197] (2/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 16:53:41,337 WARNING [train.py:1197] (2/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:53:41,536 WARNING [train.py:1197] (2/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-30 16:53:43,013 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-30 16:53:43,131 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:53:43,163 WARNING [train.py:1197] (2/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:53:43,930 INFO [scaling.py:1022] (2/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.03 vs. limit=12.0 2023-09-30 16:53:44,486 INFO [optim.py:468] (2/4) Clipping_scale=2.0, grad-norm quartiles 1.478e+02 1.908e+02 2.098e+02 2.406e+02 3.815e+02, threshold=4.197e+02, percent-clipped=0.0 2023-09-30 16:53:44,662 WARNING [train.py:1197] (2/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-30 16:53:44,673 WARNING [train.py:1197] (2/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:53:44,687 WARNING [train.py:1197] (2/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:53:46,287 WARNING [train.py:1197] (2/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 16:53:46,376 WARNING [train.py:1197] (2/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:53:51,107 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:53:53,837 INFO [scaling.py:213] (2/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=775893.3333333334, ans=0.05 2023-09-30 16:53:54,895 WARNING [train.py:1197] (2/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:53:55,085 WARNING [train.py:1197] (2/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725